While building a predictive model, median imputations are performed while preparing the training data. How should the imputations be addressed in the validation data?
A. The imputed values are irrelevant to the validation data, and are not used.
B. The imputed values must be applied directly to the validation data without recalculation.
C. The imputed values must be recalculated using the validation data.
D. The imputed values must be recalculated using both the training and the validation data.
The correct answer is B. The medians should come from the training data set. This is addressed in the SAS course video Predictive Modeling Using Logistic Regression (15.1), Lesson 4.2: "...So in the validation data set, missing values should be replaced with the medians from the training data set."
Furthermore, the quiz at the end of that SAS lesson explains why one of its own answer choices is wrong: "Answer b is incorrect because the missing values in the validation data set need to be replaced with the medians from the training data set."
Please fix this for others.
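For anyone who wants to see the idea behind answer B in code, here is a minimal sketch in plain Python (not SAS, and not taken from the course). The helper names `fit_medians` and `apply_medians`, the `income` column, and the toy values are all made up for illustration. The point is only that the medians are computed once from the training rows and then reused unchanged on the validation rows.

```python
from statistics import median

def fit_medians(rows, columns):
    """Compute per-column medians from the given rows, skipping missing (None) values."""
    return {c: median(r[c] for r in rows if r[c] is not None) for c in columns}

def apply_medians(rows, medians):
    """Return copies of the rows with missing values replaced by the stored medians."""
    return [
        {c: (medians[c] if v is None and c in medians else v) for c, v in r.items()}
        for r in rows
    ]

# Hypothetical toy data; None marks a missing value.
train = [{"income": 40.0}, {"income": None}, {"income": 60.0}, {"income": 50.0}]
valid = [{"income": None}, {"income": 90.0}]

# Medians are learned from the training data only.
medians = fit_medians(train, ["income"])

# The same stored medians are applied to both sets, with no recalculation on validation.
train_imputed = apply_medians(train, medians)
valid_imputed = apply_medians(valid, medians)
```

Here the missing validation value is filled with the training median, not with anything computed from the validation rows, so the validation set is scored the same way new data would be.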