Exam Professional Machine Learning Engineer topic 1 question 150 discussion

Actual exam question from Google's Professional Machine Learning Engineer
Question #: 150
Topic #: 1

You are working on a binary classification ML algorithm that detects whether an image of a classified scanned document contains a company’s logo. In the dataset, 96% of examples don’t have the logo, so the dataset is very skewed. Which metric would give you the most confidence in your model?

  • A. Precision
  • B. Recall
  • C. RMSE
  • D. F1 score
Suggested Answer: D

Comments

PST21
Highly Voted 1 year ago
B. Recall. In a highly imbalanced dataset like the one described (96% of examples are in the negative class), the metric that gives the most confidence in the model's performance is recall. Recall (also known as sensitivity or true positive rate) is the proportion of actual positive cases the model correctly identifies; in this context, it is the percentage of images containing the company's logo that the model classifies as positive. Since the dataset is heavily skewed, a high recall value indicates that the model is effectively capturing the positive cases (images with the logo) despite the class imbalance. The F1 score (D) balances precision and recall and is a useful metric for imbalanced datasets, but in this specific case recall is more important: we want to be confident in detecting the logo images, even at the cost of some false positives (lower precision). (A sketch illustrating this follows the thread.)
upvoted 8 times
vale_76_na_xxx
7 months, 3 weeks ago
I go for B as well
upvoted 1 times
...
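To make the recall argument above concrete, here is a minimal sketch, assuming scikit-learn is available; the labels and the model's predictions are invented for illustration. It mirrors the question's 96/4 skew and shows how recall isolates performance on the rare positive class while accuracy stays flattering:

```python
from sklearn.metrics import accuracy_score, recall_score

# 100 examples mirroring the question's skew: 96 negatives (no logo), 4 positives.
y_true = [0] * 96 + [1] * 4
# Hypothetical model: finds 3 of the 4 logos, but raises 6 false alarms.
y_pred = [0] * 90 + [1] * 6 + [1, 1, 1, 0]

print(accuracy_score(y_true, y_pred))  # 0.93 -- inflated by the many easy negatives
print(recall_score(y_true, y_pred))    # 0.75 -- 3 of the 4 logos actually caught
```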
fitri001
Highly Voted 3 months, 2 weeks ago
Selected Answer: D
Precision vs. Recall: precision focuses on the percentage of predicted positive cases (logo present) that are actually correct, while recall emphasizes the model's ability to identify all actual positive cases (correctly finding every logo). In a highly imbalanced dataset, a naive model could simply predict "no logo" for every image and achieve very high accuracy (96%!), yet it would be useless because it misses all the actual logos (zero recall); see the sketch after this thread. F1 Score: the F1 score strikes a balance between precision and recall by taking their harmonic mean, giving a more comprehensive picture of the model's performance in both identifying logos (recall) and avoiding false positives (precision).
upvoted 7 times
AzureDP900
2 months, 2 weeks ago
very well explained!
upvoted 1 times
...
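Here is a quick sketch of the naive-baseline point from the comment above, again assuming scikit-learn and invented data: a model that predicts "no logo" everywhere scores 96% accuracy but zero recall and zero F1, which is exactly why accuracy is the wrong metric here:

```python
from sklearn.metrics import accuracy_score, f1_score, recall_score

y_true = [0] * 96 + [1] * 4
y_naive = [0] * 100  # always predicts "no logo"

print(accuracy_score(y_true, y_naive))             # 0.96 -- looks impressive
print(recall_score(y_true, y_naive))               # 0.0  -- misses every logo
print(f1_score(y_true, y_naive, zero_division=0))  # 0.0  -- F1 exposes the failure
```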
8619d79
Most Recent 3 days, 3 hours ago
Selected Answer: B
The focus here is on detecting images with the logo (the minority class), so recall is the right metric. If the question had highlighted that detecting images without the logo is also important, I would have voted for D, but that is not the case here. Of course, F1 guards against a model that simply marks everything as "with logo", but that is more about how the metric is used and interpreted. Otherwise, when would recall ever be the useful metric?
upvoted 1 times
...
gscharly
3 months, 3 weeks ago
Selected Answer: D
Went with D
upvoted 1 times
...
Yan_X
4 months, 1 week ago
Selected Answer: B
B. See #90; ideally this would be an F score that weights recall more heavily than precision (see the F-beta sketch after this thread).
upvoted 3 times
...
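The recall-weighted F score this comment alludes to is what scikit-learn exposes as fbeta_score: with beta > 1, recall counts more than precision. A hedged sketch with the same invented predictions as above:

```python
from sklearn.metrics import f1_score, fbeta_score

y_true = [0] * 96 + [1] * 4
y_pred = [0] * 90 + [1] * 6 + [1, 1, 1, 0]  # recall 0.75, precision 3/9

print(f1_score(y_true, y_pred))               # ~0.46 -- precision and recall weighted equally
print(fbeta_score(y_true, y_pred, beta=2.0))  # ~0.60 -- recall weighted more heavily
```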
CHARLIE2108
4 months, 3 weeks ago
Selected Answer: B
I went with B.
upvoted 1 times
...
vaibavi
5 months, 4 weeks ago
Selected Answer: D
The F1 score provides a comprehensive evaluation by penalizing models that excel in one aspect at the expense of the other. By considering both precision and recall, it identifies models that balance true-positive detection with minimal false positives, making it a suitable metric for imbalanced data like this logo-detection problem (worked numbers follow this thread).
upvoted 2 times
...
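A worked illustration of that penalty, in plain Python with hypothetical precision/recall pairs: the harmonic mean drags F1 down whenever one component is weak, which is the property the comment above relies on:

```python
def f1(precision: float, recall: float) -> float:
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

print(f1(0.95, 0.10))  # ~0.18 -- excellent precision cannot rescue poor recall
print(f1(0.60, 0.60))  # 0.60  -- a balanced model keeps its common value
```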
M25
1 year, 2 months ago
Selected Answer: D
See #90!
upvoted 2 times
...
FherRO
1 year, 5 months ago
Selected Answer: D
F1 score works well for imbalanced data sets
upvoted 1 times
...
TNT87
1 year, 5 months ago
Selected Answer: D
https://stephenallwright.com/imbalanced-data-metric/
upvoted 2 times
...
Community vote distribution: A (35%), C (25%), B (20%), Other