Exam AWS Certified Machine Learning Engineer - Associate MLA-C01 topic 1 question 13 discussion

Exam question from Amazon's AWS Certified Machine Learning Engineer - Associate MLA-C01

Question #: 13
Topic #: 1

Case study -
An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3.
The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally, many of the features have interdependencies. The algorithm is not capturing all the desired underlying patterns in the data.
Before the ML engineer trains the model, the ML engineer must resolve the issue of the imbalanced data.
Which solution will meet this requirement with the LEAST operational effort?

A. Use Amazon Athena to identify patterns that contribute to the imbalance. Adjust the dataset accordingly.
B. Use Amazon SageMaker Studio Classic built-in algorithms to process the imbalanced dataset.
C. Use AWS Glue DataBrew built-in features to oversample the minority class.
D. Use the Amazon SageMaker Data Wrangler balance data operation to oversample the minority class.

Suggested Answer: D

by GiorgioGss at Nov. 27, 2024, 3:44 p.m.

Disclaimers:

- ExamTopics website is not related to, affiliated with, endorsed or authorized by Amazon.
- Trademarks, certification & product names are used for reference only and belong to Amazon.

Comments

GiorgioGss

Highly Voted 4 months, 3 weeks ago

Selected Answer: D

LEAST effort https://aws.amazon.com/blogs/machine-learning/balance-your-data-for-machine-learning-with-amazon-sagemaker-data-wrangler/

upvoted 6 times

Sadrik

Most Recent 3 weeks, 1 day ago

Selected Answer: D

SageMaker Data Wrangler provides a "balance data" operation designed to handle class imbalance.
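For intuition only, here is a minimal sketch of what random oversampling of the minority class does. It is plain Python, not Data Wrangler's implementation or API; the `random_oversample` helper, the `is_fraud` label, and the toy transaction rows are all assumptions made up for illustration.

```python
import random
from collections import Counter


def random_oversample(rows, label_key, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest one."""
    rng = random.Random(seed)

    # Group rows by their class label.
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_class.values())

    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap (k is 0 for the majority class).
        balanced.extend(rng.choices(group, k=target - len(group)))

    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: 95 legitimate transactions, 5 fraudulent ones.
transactions = [{"amount": 10 * i, "is_fraud": 0} for i in range(95)]
transactions += [{"amount": 900 + i, "is_fraud": 1} for i in range(5)]

balanced = random_oversample(transactions, "is_fraud")
counts = Counter(row["is_fraud"] for row in balanced)
print(counts)
```

In Data Wrangler the same effect is configured as a transform step in the flow, so no code like this has to be written or maintained, which is the "least operational effort" point.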

upvoted 1 times

ninomfr64

3 months, 2 weeks ago

Selected Answer: D

Both Glue DataBrew and Data Wrangler allow no-code/low-code data preparation for ML (i.e., low operational effort). However, Data Wrangler provides a built-in transform for balancing a dataset (random oversampling, random undersampling, and SMOTE): https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangler-transform.html#data-wrangler-transform-balance-data. DataBrew does not provide a built-in recipe step for balancing a dataset; its data science recipe steps are limited to binarization, bucketization, categorical mapping, one-hot encoding, scaling, skewness, and tokenization: https://docs.aws.amazon.com/databrew/latest/dg/recipe-actions.data-science.html
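To illustrate how SMOTE differs from simply duplicating rows, here is a simplified SMOTE-style sketch in plain Python. It is not Data Wrangler's code: the `smote_like` function, the `k` default, and the sample points are assumptions for illustration. Each synthetic point is interpolated between a minority sample and one of its nearest minority neighbors.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority-class neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest other minority points by Euclidean distance.
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        # Pick a random position on the segment from base to neighbor.
        t = rng.random()
        synthetic.append(tuple(b + t * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Hypothetical minority-class (fraud) samples with two numeric features.
fraud_points = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0), (1.2, 3.0)]
new_points = smote_like(fraud_points, n_new=6)
print(new_points)
```

Because the new points are interpolations rather than copies, they stay inside the region spanned by the existing minority samples while adding some variety, which is the usual motivation for choosing SMOTE over random oversampling.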

upvoted 1 times
