Exam AWS Certified AI Practitioner AIF-C01 topic 1 question 96 discussion

A company has fine-tuned a large language model (LLM) to answer questions for a help desk. The company wants to determine if the fine-tuning has enhanced the model's accuracy.

Which metric should the company use for the evaluation?

  • A. Precision
  • B. Time to first token
  • C. F1 score
  • D. Word error rate
Suggested Answer: C

Comments

Moon
1 month ago
Selected Answer: C
C: F1 score. Explanation: The F1 score balances precision and recall to evaluate a model's accuracy, particularly in scenarios like question answering, where both correctness (precision) and completeness (recall) matter. It is especially useful when classes are unevenly distributed or when assessing the model's ability to retrieve relevant and accurate answers.
upvoted 1 times
may2021_r
1 month, 1 week ago
Selected Answer: C
The correct answer is C. F1 score combines precision and recall, making it ideal for question-answering evaluation.
upvoted 1 times
aws_Tamilan
1 month, 1 week ago
Selected Answer: C
The F1 score provides a balanced evaluation of the model's ability to give both relevant and accurate answers, making it the most suitable metric for assessing the fine-tuned model’s performance in answering help desk questions.
upvoted 1 times
ap6491
1 month, 1 week ago
Selected Answer: C
F1 score is a metric that combines precision and recall to evaluate the balance between correctly identified outputs and missed or irrelevant outputs. It is particularly useful for tasks like question answering, where both accuracy and completeness are critical. In this help desk scenario, the F1 score helps assess whether the model consistently provides correct and relevant answers to user queries, reflecting the effectiveness of fine-tuning.
upvoted 1 times
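
For anyone who wants to see how the metric in these answers is computed, here is a minimal sketch. It assumes a token-overlap F1 (the style commonly used for question-answering benchmarks), which the question itself does not specify. The function names `token_f1` and `mean_f1` and the sample strings are hypothetical, made up for illustration.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a model answer and a reference answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Count tokens shared by both answers, respecting duplicates.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)  # share of predicted tokens that are correct
    recall = overlap / len(ref_tokens)      # share of reference tokens that were produced
    return 2 * precision * recall / (precision + recall)


def mean_f1(predictions: list[str], references: list[str]) -> float:
    """Average token-overlap F1 over a set of question/answer pairs."""
    scores = [token_f1(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical help desk answers, before and after fine-tuning.
references = ["reset the password from the account settings page"]
base_answers = ["contact support"]
tuned_answers = ["reset the password from the settings page"]

print("base F1: ", mean_f1(base_answers, references))
print("tuned F1:", mean_f1(tuned_answers, references))
```

Running the same held-out question set through the base model and the fine-tuned model and comparing the two averages is one way to check whether fine-tuning helped. A higher mean F1 for the fine-tuned model would suggest its answers overlap more closely with the reference answers.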