Exam Professional Machine Learning Engineer topic 1 question 150 discussion

Actual exam question from Google's Professional Machine Learning Engineer
Question #: 150
Topic #: 1

You are working on a binary classification ML algorithm that detects whether an image of a classified scanned document contains a company’s logo. In the dataset, 96% of examples don’t have the logo, so the dataset is very skewed. Which metric would give you the most confidence in your model?

  • A. Precision
  • B. Recall
  • C. RMSE
  • D. F1 score
Suggested Answer: D

Comments

PST21
Highly Voted 1 year ago
B. Recall. In a highly imbalanced dataset like the one described (96% of examples are in the negative class), the metric that gives the most confidence in the model's performance is recall. Recall (also known as sensitivity or true positive rate) is the proportion of actual positive cases the model correctly identifies; in this context, it is the percentage of images containing the company's logo that the model classifies as positive. Since the dataset is heavily skewed, a high recall value indicates that the model is effectively capturing the positive cases (images with the logo) despite the class imbalance. The F1 score (D) balances precision and recall and is a useful metric for imbalanced datasets, but in this specific case recall is more important: we want to be confident in detecting the logo images, even at the cost of some false positives (lower precision). (A sketch illustrating this follows the thread.)
upvoted 8 times
vale_76_na_xxx
7 months, 3 weeks ago
I go for B as well
upvoted 1 times
...
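To make the recall argument above concrete, here is a minimal sketch, assuming scikit-learn is available; the labels and the model's predictions are invented for illustration. It mirrors the question's 96/4 skew and shows how recall isolates performance on the rare positive class while accuracy stays flattering:

```python
from sklearn.metrics import accuracy_score, recall_score

# 100 examples mirroring the question's skew: 96 negatives (no logo), 4 positives.
y_true = [0] * 96 + [1] * 4
# Hypothetical model: finds 3 of the 4 logos, but raises 6 false alarms.
y_pred = [0] * 90 + [1] * 6 + [1, 1, 1, 0]

print(accuracy_score(y_true, y_pred))  # 0.93 -- inflated by the many easy negatives
print(recall_score(y_true, y_pred))    # 0.75 -- 3 of the 4 logos actually caught
```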
fitri001
Highly Voted 3 months, 2 weeks ago
Selected Answer: D
Precision vs. Recall: precision focuses on the percentage of predicted positive cases (logo present) that are actually correct, while recall emphasizes the model's ability to identify all actual positive cases (correctly finding every logo). In a highly imbalanced dataset, a naive model could simply predict "no logo" for every image and achieve very high accuracy (96%!), yet it would be useless because it misses all the actual logos (zero recall); see the sketch after this thread. F1 Score: the F1 score strikes a balance between precision and recall by taking their harmonic mean, giving a more comprehensive picture of the model's performance in both identifying logos (recall) and avoiding false positives (precision).
upvoted 7 times
AzureDP900
2 months, 2 weeks ago
very well explained!
upvoted 1 times
...
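Here is a quick sketch of the naive-baseline point from the comment above, again assuming scikit-learn and invented data: a model that predicts "no logo" everywhere scores 96% accuracy but zero recall and zero F1, which is exactly why accuracy is the wrong metric here:

```python
from sklearn.metrics import accuracy_score, f1_score, recall_score

y_true = [0] * 96 + [1] * 4
y_naive = [0] * 100  # always predicts "no logo"

print(accuracy_score(y_true, y_naive))             # 0.96 -- looks impressive
print(recall_score(y_true, y_naive))               # 0.0  -- misses every logo
print(f1_score(y_true, y_naive, zero_division=0))  # 0.0  -- F1 exposes the failure
```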
8619d79
Most Recent 3 days, 3 hours ago
Selected Answer: B
The focus here is on detecting images with the logo (the minority class), so recall is the right metric. If the question had highlighted that detecting images without the logo is also important, I would have voted for D, but that is not the case here. Of course, F1 guards against a model that simply marks everything as "with logo", but that is more about how the metric is used and interpreted. Otherwise, when would recall ever be the useful metric?
upvoted 1 times
...
gscharly
3 months, 3 weeks ago
Selected Answer: D
Went with D
upvoted 1 times
...
Yan_X
4 months, 1 week ago
Selected Answer: B
B. See #90; ideally this would be an F score that weights recall more heavily than precision (see the F-beta sketch after this thread).
upvoted 3 times
...
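The recall-weighted F score this comment alludes to is what scikit-learn exposes as fbeta_score: with beta > 1, recall counts more than precision. A hedged sketch with the same invented predictions as above:

```python
from sklearn.metrics import f1_score, fbeta_score

y_true = [0] * 96 + [1] * 4
y_pred = [0] * 90 + [1] * 6 + [1, 1, 1, 0]  # recall 0.75, precision 3/9

print(f1_score(y_true, y_pred))               # ~0.46 -- precision and recall weighted equally
print(fbeta_score(y_true, y_pred, beta=2.0))  # ~0.60 -- recall weighted more heavily
```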
CHARLIE2108
4 months, 3 weeks ago
Selected Answer: B
I went with B.
upvoted 1 times
...
vaibavi
5 months, 4 weeks ago
Selected Answer: D
The F1 score provides a comprehensive evaluation by penalizing models that excel in one aspect at the expense of the other. By considering both precision and recall, it identifies models that balance true-positive detection with minimal false positives, making it a suitable metric for imbalanced data like this logo-detection problem (worked numbers follow this thread).
upvoted 2 times
...
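A worked illustration of that penalty, in plain Python with hypothetical precision/recall pairs: the harmonic mean drags F1 down whenever one component is weak, which is the property the comment above relies on:

```python
def f1(precision: float, recall: float) -> float:
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

print(f1(0.95, 0.10))  # ~0.18 -- excellent precision cannot rescue poor recall
print(f1(0.60, 0.60))  # 0.60  -- a balanced model keeps its common value
```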
M25
1 year, 2 months ago
Selected Answer: D
See #90!
upvoted 2 times
...
FherRO
1 year, 5 months ago
Selected Answer: D
F1 score works well for imbalanced data sets
upvoted 1 times
...
TNT87
1 year, 5 months ago
Selected Answer: D
https://stephenallwright.com/imbalanced-data-metric/
upvoted 2 times
...
Community vote distribution: A (35%), C (25%), B (20%), Other