Exam AWS Certified AI Practitioner AIF-C01 topic 1 question 96 discussion

Exam question from Amazon's AWS Certified AI Practitioner AIF-C01

Question #: 96
Topic #: 1

A company has fine-tuned a large language model (LLM) to answer questions for a help desk. The company wants to determine if the fine-tuning has enhanced the model's accuracy.

Which metric should the company use for the evaluation?

A. Precision
B. Time to first token
C. F1 score
D. Word error rate

Suggested Answer: C

by ap6491 at Dec. 28, 2024, 2:20 a.m.

Disclaimers:

- ExamTopics website is not related to, affiliated with, endorsed or authorized by Amazon.
- Trademarks, certification & product names are used for reference only and belong to Amazon.

Comments

Jessiii

2 months, 2 weeks ago

Selected Answer: C

The F1 score is a widely used metric for evaluating model quality, especially in classification tasks where the classes are imbalanced and plain accuracy can be misleading. It is the harmonic mean of precision and recall, so it is high only when the model keeps both false positives and false negatives low. In the context of a help desk model, you want to measure both precision (how much of what the model answers is correct) and recall (how much of the relevant information the model actually returns). The F1 score combines these two metrics into a single number, which makes it a good choice for checking whether fine-tuning improved a large language model (LLM) that answers questions.
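To make the harmonic-mean definition concrete, here is a minimal Python sketch. The function name `f1_from_counts` and the example counts are made up for illustration; they are not part of the exam question.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, f1) from true-positive, false-positive
    and false-negative counts. Hypothetical helper for illustration."""
    # Precision: share of the model's positive outputs that were correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of the truly relevant items that the model found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up counts: 8 correct answers, 2 wrong answers, 4 missed answers.
precision, recall, f1 = f1_from_counts(tp=8, fp=2, fn=4)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, a model cannot reach a high F1 by doing well on only one of precision or recall.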

upvoted 2 times

Moon

3 months, 3 weeks ago

Selected Answer: C

C: F1 score. Explanation: The F1 score is a balanced metric that combines precision and recall to evaluate the accuracy of a model, particularly in scenarios like question answering, where both correctness (precision) and completeness (recall) matter. It is especially useful when the classes are unevenly distributed or when you are assessing the model's ability to retrieve relevant and accurate answers.

upvoted 2 times

may2021_r

4 months ago

Selected Answer: C

The correct answer is C. F1 score combines precision and recall, making it ideal for question-answering evaluation.

upvoted 1 time

aws_Tamilan

4 months ago

Selected Answer: C

The F1 score provides a balanced evaluation of the model's ability to give both relevant and accurate answers, making it the most suitable metric for assessing the fine-tuned model’s performance in answering help desk questions.

upvoted 1 time

ap6491

4 months ago

Selected Answer: C

F1 score is a metric that combines precision and recall to evaluate the balance between correctly identified outputs and missed or irrelevant outputs. It is particularly useful for tasks like question answering, where both accuracy and completeness are critical. In this help desk scenario, the F1 score helps assess whether the model consistently provides correct and relevant answers to user queries, reflecting the effectiveness of fine-tuning.
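One common way to apply F1 to free-text answers is token-overlap F1, the approach used in SQuAD-style question-answering evaluation. The sketch below is only an illustration under assumptions: it uses plain lowercase whitespace tokenization, the names `token_f1` and `mean_f1` are hypothetical, and the sample answers are invented. It is not necessarily how any AWS service computes the metric.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a model answer and a reference answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)  # correct share of the answer
    recall = overlap / len(ref_tokens)      # covered share of the reference
    return 2 * precision * recall / (precision + recall)


def mean_f1(predictions: list[str], references: list[str]) -> float:
    """Average token F1 over a test set of answer/reference pairs."""
    scores = [token_f1(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores)


# Invented help desk examples: compare a base model with a fine-tuned one.
references = ["reset your password", "restart the router"]
base_answers = ["contact support", "restart the device"]
tuned_answers = ["reset password", "restart the router"]

print("base mean F1: ", mean_f1(base_answers, references))
print("tuned mean F1:", mean_f1(tuned_answers, references))
```

Running both models on the same held-out questions and comparing the mean scores gives a direct before-and-after view of the fine-tuning.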

upvoted 1 time
