Exam Certified Machine Learning Associate topic 1 question 46 discussion

Actual exam question from Databricks's Certified Machine Learning Associate
Question #: 46
Topic #: 1

A data scientist wants to use Spark ML to impute missing values in their PySpark DataFrame features_df. They want to replace missing values in all numeric columns in features_df with each respective numeric column’s median value.
They have developed the following code block to accomplish this task:

The code block is not accomplishing the task.
Which reason describes why the code block is not accomplishing the imputation task?

  • A. It does not impute both the training and test data sets.
  • B. The inputCols and outputCols need to be exactly the same.
  • C. The fit method needs to be called instead of transform.
  • D. It does not fit the imputer on the data to create an ImputerModel.
Suggested Answer: D

Comments

oliver29
1 month, 3 weeks ago
Selected Answer: C
The transform method cannot be used directly on the Imputer. The fit method must first be called to compute the median values for the columns. Only the resulting ImputerModel can apply the transformation.
upvoted 2 times
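
The fit-then-transform sequence described in the comment above can be illustrated with a small sketch. In Spark ML, `pyspark.ml.feature.Imputer` is an estimator: calling `fit` on a DataFrame returns an `ImputerModel`, and it is that model's `transform` that fills in the missing values (with `strategy="median"` for this question). The code below is not Spark code. It is a plain-Python analogue of the same two-step pattern, and the class names `MedianImputer` and `MedianImputerModel`, the sample rows, and the use of `None` as the missing-value marker are all assumptions made for illustration.

```python
from statistics import median


class MedianImputer:
    """Toy estimator (hypothetical): holds configuration only, no learned state."""

    def __init__(self, input_cols):
        self.input_cols = list(input_cols)

    def fit(self, rows):
        # Learn one median per column from the non-missing values.
        medians = {}
        for col in self.input_cols:
            observed = [row[col] for row in rows if row[col] is not None]
            medians[col] = median(observed)
        return MedianImputerModel(medians)


class MedianImputerModel:
    """Toy fitted model (hypothetical): stores learned medians and applies them."""

    def __init__(self, medians):
        self.medians = medians

    def transform(self, rows):
        # Build new rows; a missing entry is replaced by its column's median.
        return [
            {
                col: (self.medians[col] if col in self.medians and value is None else value)
                for col, value in row.items()
            }
            for row in rows
        ]


# Illustrative data only; None marks a missing value.
features = [
    {"age": 20.0, "income": 100.0},
    {"age": None, "income": 300.0},
    {"age": 40.0, "income": None},
]

imputer = MedianImputer(input_cols=["age", "income"])
imputer_model = imputer.fit(features)        # step 1: learn the medians
imputed = imputer_model.transform(features)  # step 2: fill the gaps
```

In this sketch the estimator deliberately has no `transform` method, so skipping `fit` fails immediately. That mirrors the idea behind answer D: the medians do not exist until the imputer has been fitted on the data.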