Exam Certified Machine Learning Associate topic 1 question 8 discussion

Actual exam question from Databricks's Certified Machine Learning Associate
Question #: 8
Topic #: 1

A data scientist has created two linear regression models. The first model uses price as a label variable and the second model uses log(price) as a label variable. When evaluating the RMSE of each model by comparing the label predictions to the actual price values, the data scientist notices that the RMSE for the second model is much larger than the RMSE of the first model.
Which of the following possible explanations for this difference is invalid?

  • A. The second model is much more accurate than the first model
  • B. The data scientist failed to exponentiate the predictions in the second model prior to computing the RMSE
  • C. The data scientist failed to take the log of the predictions in the first model prior to computing the RMSE
  • D. The first model is much more accurate than the second model
  • E. The RMSE is an invalid evaluation metric for regression problems
Suggested Answer: B
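
To make the scale issue in the question concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption for illustration: the data-generating process, the single feature `x`, and the helper `rmse` are not part of the exam question. It fits one least-squares model on price and one on log(price), then computes the second model's RMSE against the actual prices with and without exponentiating its predictions.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic data (assumption): price grows exponentially with one feature.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, size=500)
price = np.exp(0.3 * x + 4.0 + rng.normal(0.0, 0.1, size=500))

# Design matrix with an intercept column.
X = np.column_stack([np.ones_like(x), x])

# Model 1: linear regression with price as the label.
coef_price, *_ = np.linalg.lstsq(X, price, rcond=None)
pred_price = X @ coef_price

# Model 2: linear regression with log(price) as the label.
coef_log, *_ = np.linalg.lstsq(X, np.log(price), rcond=None)
pred_log = X @ coef_log

# Model 1 predictions are already on the price scale.
rmse_model1 = rmse(price, pred_price)

# Model 2 predictions are on the log scale. Comparing them directly to
# actual prices mixes scales and inflates the error.
rmse_model2_not_exponentiated = rmse(price, pred_log)

# Exponentiating first puts the predictions back on the price scale.
rmse_model2_exponentiated = rmse(price, np.exp(pred_log))

print("model 1:", rmse_model1)
print("model 2, raw log predictions:", rmse_model2_not_exponentiated)
print("model 2, exponentiated predictions:", rmse_model2_exponentiated)
```

On this synthetic data, the RMSE of the raw log predictions against actual prices comes out far larger than either of the other two values, which is the situation the question describes. The exact numbers depend on the assumed data and should not be read as general.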


1 week, 6 days ago
Selected Answer: D
I think the correct answer is D. Reading the prompt, I assumed the first model's RMSE was computed between price (the target variable) and its predictions, but the second model's RMSE was computed between log(price) (the target variable) and its predictions. In that case, nothing needs to be exponentiated.
upvoted 1 times
1 week, 6 days ago
After rethinking it, I was wrong: in the case I imagined, there would not be such a large difference between the RMSEs.
upvoted 1 times
2 months ago
Selected Answer: B
B is the right answer.
upvoted 1 times
3 months ago
A is the correct answer, A and D
upvoted 2 times
3 months ago
A and D cannot both be true at the same time, and the lower the RMSE, the more accurate the model. Since model 1 has a lower RMSE than model 2, the invalid proposition is A.
upvoted 1 times
3 months, 2 weeks ago
"The second model is much more accurate than the first model": this explanation seems contradictory, because if the second model were more accurate, its RMSE would be expected to be smaller, not larger.
upvoted 1 times
3 months, 3 weeks ago
Selected Answer: B
The predictions need to be exponentiated because of the earlier logarithmic transformation.
upvoted 1 times
4 months ago
The question is: which of the following possible explanations for this difference is INVALID? It would have to be E, since RMSE is frequently used for regression. That cannot be the explanation.
upvoted 3 times
8 months, 1 week ago
Selected Answer: B
The second model uses log(price) as the label variable, so its predictions are on the log scale. If the data scientist compares those predicted log values directly to the actual prices without exponentiating them back to the original price scale, the errors will be much larger, leading to a higher RMSE.
upvoted 2 times
8 months, 3 weeks ago
Selected Answer: B
The current answer is wrong. RMSE is used for regression all the time.
upvoted 3 times
Community vote distribution
A (35%)
C (25%)
B (20%)