Exam AWS Certified Machine Learning - Specialty topic 1 question 152 discussion

A machine learning (ML) specialist must develop a classification model for a financial services company. A domain expert provides the dataset, which is tabular with 10,000 rows and 1,020 features. During exploratory data analysis, the specialist finds no missing values and a small percentage of duplicate rows. There are correlation scores of > 0.9 for 200 feature pairs. The mean value of each feature is similar to its 50th percentile.
Which feature engineering strategy should the ML specialist use with Amazon SageMaker?

  • A. Apply dimensionality reduction by using the principal component analysis (PCA) algorithm.
  • B. Drop the features with low correlation scores by using a Jupyter notebook.
  • C. Apply anomaly detection by using the Random Cut Forest (RCF) algorithm.
  • D. Concatenate the features with high correlation scores by using a Jupyter notebook.
Suggested Answer: A

Comments

ovokpus
Highly Voted 1 year, 10 months ago
Selected Answer: A
Dimensions are too high. Use PCA
upvoted 10 times
...
LydiaGom
Highly Voted 1 year, 11 months ago
A should be the answer to avoid the curse of dimensionality
upvoted 7 times
...
chet100
Most Recent 8 months ago
Easy choice. Always choose PCA for dim reduction
upvoted 2 times
...
Mickey321
9 months ago
Selected Answer: A
The best feature engineering strategy for the ML specialist to use with Amazon SageMaker is to apply dimensionality reduction by using the PCA algorithm.
upvoted 3 times
...
Gaby999
1 year ago
Selected Answer: A
Given that the dataset has 1,020 features and 200 feature pairs have correlation scores above 0.9, it is likely that the dataset suffers from multicollinearity. In such cases, dimensionality reduction techniques like principal component analysis (PCA) can be used to transform the data into a lower-dimensional space without losing much information. Therefore, option A, "Apply dimensionality reduction by using the principal component analysis (PCA) algorithm", is the most appropriate feature engineering strategy for the ML specialist to use with Amazon SageMaker. This would help reduce the computational complexity of the model, improve model performance, and help to avoid overfitting.
upvoted 1 times
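The multicollinearity check described in this comment can be sketched as below. This is a minimal illustration using NumPy on synthetic stand-in data, not the dataset from the question; the helper name `count_correlated_pairs` and the 0.9 default threshold are assumptions chosen to mirror the question's wording.

```python
import numpy as np

def count_correlated_pairs(X, threshold=0.9):
    """Count unordered feature pairs with |Pearson correlation| above threshold."""
    corr = np.corrcoef(X, rowvar=False)        # shape: (n_features, n_features)
    upper = np.triu_indices_from(corr, k=1)    # visit each unordered pair once
    return int(np.sum(np.abs(corr[upper]) > threshold))

# Synthetic stand-in data (hypothetical): 6 independent features
# plus 2 noisy near-copies of the first two.
rng = np.random.default_rng(0)
base = rng.normal(size=(1000, 6))
copies = base[:, :2] + 0.05 * rng.normal(size=(1000, 2))
X = np.hstack([base, copies])

n_pairs = count_correlated_pairs(X)
print(n_pairs)
```

A large count relative to the number of features suggests redundant information that a technique such as PCA can compress.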
...
AjoseO
1 year, 2 months ago
Selected Answer: A
A. Apply dimensionality reduction by using the principal component analysis (PCA) algorithm. Since the dataset has many features, and a significant number of them have high correlation scores, the model may suffer from the curse of dimensionality. To reduce the dimensionality of the dataset, the specialist can use a technique like PCA, which reduces the number of features while still retaining the maximum amount of information. PCA can help remove redundant features and improve the model's performance by reducing the chances of overfitting. Additionally, since there are no missing values and a small percentage of duplicate rows, no data cleaning techniques like anomaly detection or dropping the features are required. Concatenating features with high correlation scores is not an appropriate strategy since it may lead to collinearity issues.
upvoted 1 times
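One common way to decide how many components to keep is to retain enough of them to reach a target share of the total variance. The sketch below assumes NumPy and a 95% target; both the helper name `components_for_variance` and the target are illustrative choices, and the data is synthetic. It is not the SageMaker built-in PCA algorithm, only the underlying idea.

```python
import numpy as np

def components_for_variance(X, target=0.95):
    """Smallest number of principal components whose cumulative
    explained-variance ratio reaches `target`."""
    Xc = X - X.mean(axis=0)                       # PCA requires centered data
    s = np.linalg.svd(Xc, compute_uv=False)       # singular values, descending
    ratio = s**2 / np.sum(s**2)                   # explained-variance ratios
    cumulative = np.cumsum(ratio)
    k = int(np.searchsorted(cumulative, target)) + 1
    return min(k, len(s))

# Synthetic stand-in data (hypothetical): 18 features driven by 3 latent
# factors, 6 noisy copies of each, so the features are highly redundant.
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 3))
X = np.repeat(latent, 6, axis=1) + 0.1 * rng.normal(size=(1000, 18))

k = components_for_variance(X)
print(k, "components out of", X.shape[1], "features")
```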
...
drcok87
1 year, 2 months ago
A. PCA: PCA is a linear dimensionality reduction technique (algorithm) that transforms a set of p correlated variables into a smaller number k (k < p) of uncorrelated variables called principal components, while retaining as much of the variation in the original dataset as possible.
upvoted 1 times
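The definition above can be illustrated with a small NumPy sketch that projects p = 3 correlated variables onto k = 2 principal components and checks that the resulting components are uncorrelated. The function name `pca_project` and the toy data are assumptions made for illustration only.

```python
import numpy as np

def pca_project(X, k):
    """Project X onto its first k principal components (via SVD)."""
    Xc = X - X.mean(axis=0)                              # center each feature
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)    # rows of vt: principal axes
    return Xc @ vt[:k].T                                 # component scores, shape (n, k)

# Toy data (hypothetical): the second column is almost a copy of the first.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 2))
X = np.column_stack([z[:, 0], z[:, 0] + 0.1 * z[:, 1], z[:, 1]])

scores = pca_project(X, k=2)

input_corr = np.corrcoef(X, rowvar=False)[0, 1]      # correlation of columns 0 and 1
off_diag = abs(np.cov(scores, rowvar=False)[0, 1])   # covariance between components
print(input_corr, off_diag)
```

The input columns are strongly correlated, while the covariance between the two component scores is zero up to floating-point error.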
...
Peeking
1 year, 4 months ago
Selected Answer: A
ExamTopics choosing C as the answer is completely laughable.
upvoted 1 times
...
DJiang
1 year, 11 months ago
Selected Answer: A
I think it's A.
upvoted 4 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other