Exam Certified Machine Learning Associate topic 1 question 5 discussion

Actual exam question from Databricks's Certified Machine Learning Associate
Question #: 5
Topic #: 1

A data scientist has replaced missing values in their feature set with each respective feature variable’s median value. A colleague suggests that the data scientist is throwing away valuable information by doing this.
Which of the following approaches can they take to include as much information as possible in the feature set?

  • A. Impute the missing values using each respective feature variable’s mean value instead of the median value
  • B. Refrain from imputing the missing values in favor of letting the machine learning algorithm determine how to handle them
  • C. Remove all feature variables that originally contained missing values from the feature set
  • D. Create a binary feature variable for each feature that contained missing values indicating whether each row’s value has been imputed
  • E. Create a constant feature variable for each feature that contained missing values indicating the percentage of rows from the feature that was originally missing
Suggested Answer: D

Comments

Deuterium44
2 weeks, 4 days ago
Selected Answer: D
correct answer is D
upvoted 1 times
8605246
5 months, 1 week ago
The answer is D. Creating a binary feature variable (also known as a missing indicator) for each feature that contained missing values is a common technique for retaining information about the missingness itself. It lets the model learn patterns related to which values were missing, which can be informative.

Benefits of creating binary indicators:
  • Retains information: the indicator preserves which values were originally missing. This is useful if the fact that a value is missing carries predictive power.
  • Can improve model performance: in some cases the pattern of missing data is correlated with the target variable, and including it can help the model.
  • Flexibility: you can still impute the missing values (e.g., with the median) while giving the model additional context about the data.
upvoted 2 times
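
To make option D concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the function name `impute_with_indicators`, the `_was_missing` suffix, and the toy `age`/`income` rows are all made up for this example, and missing values are assumed to be represented as `None`. Libraries offer built-in equivalents (for instance, scikit-learn's `SimpleImputer` accepts `add_indicator=True`), which would likely be preferable in practice.

```python
from statistics import median


def impute_with_indicators(rows, features):
    """Median-impute missing values (None) and add a 0/1 '<feature>_was_missing' flag.

    Hypothetical helper for illustration; returns new rows and leaves the input untouched.
    """
    # Compute each feature's median from the observed (non-missing) values only.
    medians = {
        f: median(r[f] for r in rows if r[f] is not None)
        for f in features
    }
    # Only features that actually contained missing values get an indicator.
    had_missing = [f for f in features if any(r[f] is None for r in rows)]

    result = []
    for r in rows:
        new_row = dict(r)  # copy so the original row is not mutated
        for f in features:
            if f in had_missing:
                new_row[f + "_was_missing"] = 1 if r[f] is None else 0
            if r[f] is None:
                new_row[f] = medians[f]
        result.append(new_row)
    return result


# Toy data (made up): one missing age, one missing income.
rows = [
    {"age": 25, "income": 40000},
    {"age": None, "income": 52000},
    {"age": 41, "income": None},
    {"age": 33, "income": 61000},
]
imputed = impute_with_indicators(rows, ["age", "income"])
```

After this step every row has a numeric value for each feature, and the extra indicator columns record where imputation happened, so the model can use the missingness pattern as a signal instead of losing it.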