Exam AWS Certified Machine Learning - Specialty topic 1 question 9 discussion

A Machine Learning Specialist is developing a custom video recommendation model for an application. The dataset used to train this model is very large with millions of data points and is hosted in an Amazon S3 bucket. The Specialist wants to avoid loading all of this data onto an Amazon SageMaker notebook instance because it would take hours to move and will exceed the attached 5 GB Amazon EBS volume on the notebook instance.
Which approach allows the Specialist to use all the data to train the model?

  • A. Load a smaller subset of the data into the SageMaker notebook and train locally. Confirm that the training code is executing and the model parameters seem reasonable. Initiate a SageMaker training job using the full dataset from the S3 bucket using Pipe input mode.
  • B. Launch an Amazon EC2 instance with an AWS Deep Learning AMI and attach the S3 bucket to the instance. Train on a small amount of the data to verify the training code and hyperparameters. Go back to Amazon SageMaker and train using the full dataset.
  • C. Use AWS Glue to train a model using a small subset of the data to confirm that the data will be compatible with Amazon SageMaker. Initiate a SageMaker training job using the full dataset from the S3 bucket using Pipe input mode.
  • D. Load a smaller subset of the data into the SageMaker notebook and train locally. Confirm that the training code is executing and the model parameters seem reasonable. Launch an Amazon EC2 instance with an AWS Deep Learning AMI and attach the S3 bucket to train the full dataset.
Suggested Answer: A
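As a rough illustration of the second half of option A, the sketch below builds the request for a SageMaker training job that reads its training channel from S3 in Pipe input mode. It only assembles the keyword arguments for the `CreateTrainingJob` API and does not call AWS. The job name, image URI, role ARN, bucket paths, instance type, and limits are all hypothetical placeholders, and the training image is assumed to support Pipe mode.

```python
def build_pipe_mode_training_request(job_name, image_uri, role_arn,
                                     train_s3_uri, output_s3_uri):
    """Return keyword arguments for SageMaker CreateTrainingJob using Pipe mode."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            # Default input mode for the job: stream from S3 instead of copying.
            "TrainingInputMode": "Pipe",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
                # Per-channel input mode; matches the job-level setting here.
                "InputMode": "Pipe",
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            # Hypothetical sizing; the full dataset never has to fit on this volume.
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


# All values below are placeholders, not taken from the question.
request = build_pipe_mode_training_request(
    job_name="video-recs-pipe-demo",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-training-image:latest",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    train_s3_uri="s3://example-bucket/train/",
    output_s3_uri="s3://example-bucket/output/",
)
```

With credentials configured, the request could then be submitted with `boto3.client("sagemaker").create_training_job(**request)`. The notebook instance only submits the job, so the 5 GB EBS volume is never asked to hold the dataset.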

Comments

JayK
Highly Voted 3 years, 6 months ago
Answer is A. This question is about Pipe mode from S3, so the only candidates are A and C. Option C relies on AWS Glue to train a model, and Glue cannot be used to create models, so the correct answer is A.
upvoted 31 times
...
liangfb
Highly Voted 3 years, 6 months ago
Answer is A.
upvoted 13 times
...
JonSno
Most Recent 2 months, 1 week ago
Selected Answer: A
Training locally on a small dataset ensures the training script and model parameters are working correctly. Amazon SageMaker training jobs allow direct access to S3 data without downloading everything. Pipe input mode efficiently streams data from S3 to the training instance, reducing disk space requirements and speeding up training.
upvoted 3 times
...
reginav
4 months, 1 week ago
Selected Answer: A
Of the options listed, only Pipe mode streams data from S3.
upvoted 1 times
...
Mickey321
7 months ago
Selected Answer: A
The reason for this choice is that Pipe input mode is a feature of Amazon SageMaker that allows you to stream data directly from an Amazon S3 bucket to your training instances without downloading it first. This way, you can avoid the time and space limitations of loading a large dataset onto your notebook instance. Pipe input mode also offers faster start times and better throughput than File input mode, which downloads the entire dataset before training.
upvoted 3 times
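To make the File versus Pipe distinction above concrete, here is a minimal sketch of how training code might consume a stream in fixed-size chunks instead of loading a whole file. The helper names are hypothetical. The FIFO path layout (`/opt/ml/input/data/<channel>_<epoch>`) is my understanding of how SageMaker exposes Pipe mode channels inside the container, so treat it as an assumption. The demo uses an in-memory stream so it runs anywhere.

```python
import io


def pipe_path(channel, epoch, base_dir="/opt/ml/input/data"):
    """Build the assumed FIFO path for a Pipe mode channel and epoch."""
    return f"{base_dir}/{channel}_{epoch}"


def stream_chunks(stream, chunk_size=1024 * 1024):
    """Yield successive chunks from a file-like object until it is exhausted.

    Only one chunk is held in memory at a time, which is the property that
    lets streaming input avoid a full local copy of the dataset.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Stand-in for a FIFO: 2500 bytes read in 1000-byte chunks.
fake_pipe = io.BytesIO(b"x" * 2500)
chunk_sizes = [len(chunk) for chunk in stream_chunks(fake_pipe, chunk_size=1000)]
print(chunk_sizes)            # [1000, 1000, 500]
print(pipe_path("train", 0))  # /opt/ml/input/data/train_0
```

Inside a real training container the same loop would wrap `open(pipe_path("train", 0), "rb")` rather than a `BytesIO` object, and a real script would parse records out of the chunks instead of just measuring them.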
...
loict
7 months ago
Selected Answer: A
A. YES - Pipe mode lets training start before the entire dataset is transferred; the only drawback is that if multiple training jobs run in sequence (e.g., with different hyperparameters), the data is streamed from S3 again for each job.
B. NO - we want to use SageMaker first for initial training.
C. NO - we first want to test things in SageMaker.
D. NO - the SageMaker notebook will not use the AMI, so the testing done is useless.
upvoted 1 times
...
kyuhuck
1 year, 2 months ago
Selected Answer: B
B. Generate daily precision-recall data in Amazon QuickSight, and publish the results in a dashboard shared with the Business team. This solution leverages QuickSight's managed service capabilities for both data processing and visualization, which should minimize the coding effort required to provide the Business team with the necessary insights. However, it's important to note that QuickSight's ability to calculate the precision-recall data depends on its support for the necessary statistical functions or the availability of such calculations in the dataset. If QuickSight cannot perform these calculations directly, option C might be necessary, despite the increased effort.
upvoted 1 times
...
Venkatesh_Babu
1 year, 9 months ago
Selected Answer: A
I think it should be A.
upvoted 1 times
...
Valcilio
2 years, 1 month ago
Selected Answer: A
It's A, pipe mode is for dealing with very big data.
upvoted 2 times
...
yemauricio
2 years, 4 months ago
Selected Answer: A
A. Pipe mode is meant for this sort of training.
upvoted 2 times
...
Shailendraa
2 years, 7 months ago
The data is already in S3 and next needs to be fed to SageMaker, so option A is suitable.
upvoted 1 times
...
Huy
3 years, 5 months ago
Answer is A. B, C, and D can be dropped because they do not integrate with a SageMaker training job (model).
upvoted 1 times
...
cloud_trail
3 years, 5 months ago
Gotta be A. You need to use Pipe mode but Glue cannot train a model.
upvoted 2 times
...
bobdylan1
3 years, 6 months ago
AAAAAAAAAAa
upvoted 1 times
...
Willnguyen22
3 years, 6 months ago
ans is A
upvoted 1 times
...
GeeBeeEl
3 years, 6 months ago
Would you run an AWS Deep Learning AMI in every case where the data in S3 is very large? And what role is Glue playing here? Is there a transformation? These are the two issues with options B, C, and D. I do not believe they satisfy the requirements in the question. The answer definitely requires Pipe mode, but not with Glue. I go with A. https://aws.amazon.com/blogs/machine-learning/using-pipe-input-mode-for-amazon-sagemaker-algorithms/
upvoted 3 times
...
roytruong
3 years, 6 months ago
go for A
upvoted 2 times
...