Exam DP-100 All Questions

View all questions & answers for the DP-100 exam

Exam DP-100 topic 1 question 21 discussion

Actual exam question from Microsoft's DP-100

Question #: 21
Topic #: 1

HOTSPOT -
Complete the sentence by selecting the correct option in the answer area.
Hot Area:

Suggested Answer:

Replace using Probabilistic PCA: Compared to other options, such as Multiple Imputation using Chained Equations (MICE), this option has the advantage of not requiring the application of predictors for each column. Instead, it approximates the covariance for the full dataset. Therefore, it might offer better performance for datasets that have missing values in many columns.
Reference:
https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/clean-missing-data
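
To make the "approximates the covariance for the full dataset" idea concrete, here is a minimal sketch of iterative low-rank imputation in plain Python. It is a simplified stand-in, not probabilistic PCA itself and not the Studio module's implementation: it assumes a single principal component, marks missing cells with None, and uses a hypothetical helper name (impute_rank1). The point it illustrates is that one shared linear model is fitted over all columns, rather than one predictor per column as in MICE.

```python
from math import sqrt

def impute_rank1(rows, n_iter=50, power_steps=100):
    """Fill None cells using a rank-1 PCA reconstruction (illustrative only)."""
    n, d = len(rows), len(rows[0])
    missing = [[v is None for v in r] for r in rows]

    # Start by filling each missing cell with the mean of its column.
    col_means = []
    for j in range(d):
        observed = [r[j] for r in rows if r[j] is not None]
        col_means.append(sum(observed) / len(observed))
    x = [[col_means[j] if missing[i][j] else float(rows[i][j])
          for j in range(d)] for i in range(n)]

    for _ in range(n_iter):
        # Center the current completed matrix.
        mu = [sum(x[i][j] for i in range(n)) / n for j in range(d)]
        c = [[x[i][j] - mu[j] for j in range(d)] for i in range(n)]

        # Leading principal direction via power iteration on C^T C.
        # Assumption: the uniform start vector is not orthogonal to it.
        w = [1.0 / sqrt(d)] * d
        for _step in range(power_steps):
            scores = [sum(c[i][j] * w[j] for j in range(d)) for i in range(n)]
            w_new = [sum(c[i][j] * scores[i] for i in range(n)) for j in range(d)]
            norm = sqrt(sum(v * v for v in w_new))
            if norm == 0.0:
                break
            w = [v / norm for v in w_new]

        # Overwrite only the missing cells with their low-rank reconstruction.
        for i in range(n):
            score = sum(c[i][j] * w[j] for j in range(d))
            for j in range(d):
                if missing[i][j]:
                    x[i][j] = mu[j] + score * w[j]
    return x

# Second column is twice the first; the last row is missing its second value.
pca_filled = impute_rank1([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, None]])
```

The imputed cell starts at the column mean and is pulled toward the line the other rows lie on, because the estimate comes from the correlation between columns rather than from the missing column alone.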

by ranjsi01 at Jan. 30, 2022, 4 p.m.

Comments

pancman

Highly Voted 3 years ago

I don't think that this is a real exam question. Median and custom substitution techniques don't require a predictor either.

upvoted 19 times
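
For comparison, the point about median substitution can be sketched in a few lines of Python. This is an illustration only, not the Clean Missing Data module's code; the function name impute_median and the use of None for missing cells are assumptions. Each column is filled from its own observed values, so no other column acts as a predictor.

```python
from statistics import median

def impute_median(rows):
    """Fill None cells with the median of the observed values in the same column."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        medians.append(median(observed))
    # Each cell depends only on its own column's median.
    return [[medians[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]

median_filled = impute_median([[1, None], [3, 10], [None, 20], [5, 30]])
# median_filled == [[1, 20], [3, 10], [3, 20], [5, 30]]
```

A custom substitution value works the same way, except that the fill value is supplied by the user instead of being computed from the column.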

rishi_ram

Highly Voted 1 year, 11 months ago

Replace using Probabilistic PCA: Replaces the missing values by using a linear model that analyzes the correlations between the columns and estimates a low-dimensional approximation of the data, from which the full data is reconstructed. The underlying dimensionality reduction is a probabilistic form of Principal Component Analysis (PCA), and it implements a variant of the model proposed in the Journal of the Royal Statistical Society, Series B 21(3), 611–622 by Tipping and Bishop. Compared to other options, such as Multiple Imputation using Chained Equations (MICE), this option has the advantage of not requiring the application of predictors for each column. Instead, it approximates the covariance for the full dataset. Therefore, it might offer better performance for datasets that have missing values in many columns. https://learn.microsoft.com/en-us/previous-versions/azure/machine-learning/studio-module-reference/clean-missing-data

upvoted 5 times

geethavkr

Most Recent 8 months, 2 weeks ago

Correct. The question is outdated, but the previous-version documentation says: "Replace using Probabilistic PCA: Replaces the missing values by using a linear model that analyzes the correlations between the columns and estimates a low-dimensional approximation of the data, from which the full data is reconstructed. The underlying dimensionality reduction is a probabilistic form of Principal Component Analysis (PCA), and it implements a variant of the model proposed in the Journal of the Royal Statistical Society, Series B 21(3), 611–622 by Tipping and Bishop. Compared to other options, such as Multiple Imputation using Chained Equations (MICE), this option has the advantage of not requiring the application of predictors for each column. Instead, it approximates the covariance for the full dataset. Therefore, it might offer better performance for datasets that have missing values in many columns."

upvoted 1 times

kay1101

11 months ago

I think this is an outdated question. As of May 2024, PCA is no longer in the Clean Missing Data module. Reference: https://learn.microsoft.com/en-us/azure/machine-learning/component-reference/clean-missing-data?view=azureml-api-2 However, in the past, PCA was in the Clean Missing Data module. Reference: https://learn.microsoft.com/en-us/previous-versions/azure/machine-learning/studio-module-reference/clean-missing-data At the time the question was created, PCA may have been correct, but now I think it is either median or custom substitution value.

upvoted 1 times

InversaRadice

1 year, 4 months ago

answer is 100% correct ... Replace using Probabilistic PCA: ... Compared to other options, such as Multiple Imputation using Chained Equations (MICE), this option has the advantage of not requiring the application of predictors for each column. https://learn.microsoft.com/en-us/previous-versions/azure/machine-learning/studio-module-reference/clean-missing-data

upvoted 2 times

eternaleclipse

1 year, 6 months ago

What pancman said. Outdated question.

upvoted 1 times

IvanTT

1 year, 6 months ago

It can't be "A. Probabilistic PCA" because it isn't an option for the Clean Missing Data module. Here is the reference: https://learn.microsoft.com/en-us/azure/machine-learning/component-reference/clean-missing-data?view=azureml-api-2 It could be "D. Custom Substitution Value". The option "B. Median" isn't the exact name of the option in the module, which is "Replace with median".

upvoted 1 times

james2033

1 year, 6 months ago

Quote: "Replace using Probabilistic PCA: Replaces the missing values by using a linear model that analyzes the correlations between the columns and estimates a low-dimensional approximation of the data, from which the full data is reconstructed. The underlying dimensionality reduction is a probabilistic form of Principal Component Analysis (PCA), and it implements a variant of the model proposed in the Journal of the Royal Statistical Society, Series B 21(3), 611–622 by Tipping and Bishop. Compared to other options, such as Multiple Imputation using Chained Equations (MICE), this option has the advantage of not requiring the application of predictors for each column." Reference: https://learn.microsoft.com/en-us/previous-versions/azure/machine-learning/studio-module-reference/clean-missing-data#:~:text=this%20option%20has%20the%20advantage%20of%20not%20requiring%20the%20application%20of%20predictors%20for%20each%20column.

upvoted 1 times

rakeshmk

1 year, 7 months ago

PCA is a dimensionality reduction technique. Median can be the answer.

upvoted 3 times

PradhanManva

1 year, 7 months ago

PCA. This is the answer.

upvoted 1 times

MarinaMijailovic

2 years ago

Correct answer is median: it only calculates the median from the given column, no other columns required.
PCA: needs predictors to calculate the probabilities.
SMOTE: needs predictors to generate synthetic samples for the minority class.
Custom substitution value: doesn't really need predictors per se, but still requires some knowledge about the data to pick the right value.

upvoted 3 times

Truman

2 years ago

One data cleaning option that does not require predictors for each column in the Clean Missing Data module is the "Replace with mean" option. This option replaces missing values in a column with the mean of the available values in that column. All of the listed options are wrong.

upvoted 1 times

Vic9

2 years ago

A https://learn.microsoft.com/en-us/previous-versions/azure/machine-learning/studio-module-reference/clean-missing-data "Replace using Probabilistic PCA: Replaces the missing values by using a linear model that analyzes the correlations between the columns and estimates a low-dimensional approximation of the data, from which the full data is reconstructed. The underlying dimensionality reduction is a probabilistic form of Principal Component Analysis (PCA), and it implements a variant of the model proposed in the Journal of the Royal Statistical Society, Series B 21(3), 611–622 by Tipping and Bishop. Compared to other options, such as Multiple Imputation using Chained Equations (MICE), this option has the advantage of not requiring the application of predictors for each column. Instead, it approximates the covariance for the full dataset. Therefore, it might offer better performance for datasets that have missing values in many columns."

upvoted 2 times

phdykd

2 years, 2 months ago

A) Probabilistic PCA and C) SMOTE are not data cleaning options in the clean missing data module. Probabilistic PCA is a technique used for dimensionality reduction and feature extraction in machine learning, and it is not specifically designed to handle missing data. SMOTE (Synthetic Minority Over-sampling Technique) is a technique used for dealing with imbalanced datasets in machine learning, and it is not designed to handle missing data. Therefore, the correct answer to the question "..... is a data cleaning option of the clean missing data module that does not require predictors for each column" is either B) Median or D) Custom substitution value.

upvoted 2 times

Peeking

2 years, 2 months ago

PCA is wrong.

upvoted 2 times

ranjsi01

3 years, 2 months ago

correct

upvoted 3 times
