Exam Certified Generative AI Engineer Associate topic 1 question 35 discussion

Actual exam question from Databricks's Certified Generative AI Engineer Associate
Question #: 35
Topic #: 1

A Generative AI Engineer has created a RAG application that helps employees retrieve answers from an internal knowledge base, such as Confluence pages or Google Drive. The prototype application is now working and has received some positive feedback from internal company testers. The Generative AI Engineer now wants to formally evaluate the system’s performance and understand where to focus their efforts to further improve the system.
How should the Generative AI Engineer evaluate the system?

  • A. Use cosine similarity score to comprehensively evaluate the quality of the final generated answers.
  • B. Curate a dataset that can test the retrieval and generation components of the system separately. Use MLflow’s built-in evaluation metrics to perform the evaluation on the retrieval and generation components.
  • C. Benchmark multiple LLMs with the same data and pick the best LLM for the job.
  • D. Use an LLM-as-a-judge to evaluate the quality of the final answers generated.
Suggested Answer: B

Comments

fa2bede
2 months, 2 weeks ago
Selected Answer: D
According to the Databricks blog, using an LLM-as-a-judge to evaluate a document-based chatbot is as effective as using human judges. Option B references MLflow model evaluation metrics, which do not work well for RAG/GenAI model evaluation.
upvoted 1 times
fa2bede
2 months, 2 weeks ago
I stand corrected: MLflow supports LLM-as-a-judge and other evaluation metrics suited to LLM evaluation. Hence option B is correct.
upvoted 1 times
trendy01
3 months, 2 weeks ago
Selected Answer: B
B appears to be the most appropriate choice. By evaluating the search and generation components separately, you can gain a clearer understanding of your system's performance and effectively identify areas for improvement.
upvoted 2 times
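
To make the idea in option B concrete, here is a minimal sketch of scoring the retrieval and generation components separately on a small curated dataset. It does not call MLflow: the functions `precision_recall_at_k`, `token_f1`, and `evaluate_rag`, the record fields, and the sample data are all hypothetical stand-ins written in plain Python. In practice MLflow's own evaluation metrics (including LLM-as-a-judge metrics) would take the place of these hand-rolled ones.

```python
from collections import Counter


def precision_recall_at_k(retrieved_ids, relevant_ids, k):
    """Retrieval metric: how many of the top-k retrieved docs are relevant."""
    top_k = retrieved_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    precision = hits / k if k else 0.0  # denominator is k, not len(top_k)
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    return precision, recall


def token_f1(predicted, reference):
    """Generation metric: token-overlap F1 between answer and reference.

    A crude stand-in for an answer-quality metric such as an LLM judge.
    """
    pred_tokens = predicted.lower().split()
    ref_tokens = reference.lower().split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def evaluate_rag(records, k=3):
    """Average the retrieval and generation scores separately."""
    n = len(records)
    totals = {"precision_at_k": 0.0, "recall_at_k": 0.0, "answer_f1": 0.0}
    for rec in records:
        p, r = precision_recall_at_k(rec["retrieved"], rec["relevant"], k)
        totals["precision_at_k"] += p
        totals["recall_at_k"] += r
        totals["answer_f1"] += token_f1(rec["answer"], rec["reference"])
    return {name: value / n for name, value in totals.items()}


# Hypothetical curated evaluation set: each record carries the ground truth
# for both components (relevant doc ids and a reference answer).
eval_set = [
    {
        "question": "How do I request VPN access?",
        "retrieved": ["doc3", "doc7", "doc1"],
        "relevant": {"doc3", "doc1"},
        "answer": "Submit a ticket to the IT helpdesk",
        "reference": "Submit a ticket to the IT helpdesk portal",
    },
    {
        "question": "Where is the travel policy?",
        "retrieved": ["doc9", "doc2", "doc4"],
        "relevant": {"doc5"},
        "answer": "It is on the HR Confluence space",
        "reference": "It is on the HR Confluence space",
    },
]

if __name__ == "__main__":
    print(evaluate_rag(eval_set, k=3))
```

Reading the two groups of scores side by side shows where to focus: low retrieval scores with good answers on the correctly retrieved cases point to the retriever, while good retrieval with poor answer scores points to the prompt or the LLM.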