Exam AWS Certified Machine Learning - Specialty topic 1 question 10 discussion

Exam question from Amazon's AWS Certified Machine Learning - Specialty

Question #: 10
Topic #: 1

A Machine Learning Specialist has completed a proof of concept for a company using a small data sample, and now the Specialist is ready to implement an end-to-end solution in AWS using Amazon SageMaker. The historical training data is stored in Amazon RDS.
Which approach should the Specialist use for training a model using that data?

A. Write a direct connection to the SQL database within the notebook and pull data in.
B. Push the data from Microsoft SQL Server to Amazon S3 using an AWS Data Pipeline and provide the S3 location within the notebook.
C. Move the data to Amazon DynamoDB and set up a connection to DynamoDB within the notebook to pull data in.
D. Move the data to Amazon ElastiCache using AWS DMS and set up a connection within the notebook to pull data in for fast access.

Suggested Answer: B

by JayK at Jan. 4, 2020, 1:27 p.m.

Disclaimers:

- ExamTopics website is not related to, affiliated with, endorsed or authorized by Amazon.
- Trademarks, certification & product names are used for reference only and belong to Amazon.

Comments

JayK

Highly Voted 2 years, 7 months ago

Answer is B, as the data for a SageMaker notebook needs to come from S3, and option B is the only option that says so. The only issue with option B is that it talks about moving data from MS SQL Server, not RDS.

upvoted 30 times

mlyu

2 years, 7 months ago

https://www.slideshare.net/AmazonWebServices/train-models-on-amazon-sagemaker-using-data-not-from-amazon-s3-aim419-aws-reinvent-2018

upvoted 2 times

HaiHN

2 years, 6 months ago

Please look at slide 14 of that link: although the data source is DynamoDB or RDS, AWS Glue is still needed to move the data to S3 for SageMaker to use. So the right answer should be B.

upvoted 2 times

...

jasonsunbao

2 years, 7 months ago

I agree. According to the ML developer guide I just read, it is MySQL on RDS that can be used as a SQL datasource.

upvoted 2 times

...

Denise123

Most Recent 1 month, 1 week ago

Selected Answer: A

For Amazon S3, you can import data from an Amazon S3 bucket as long as you have permissions to access the bucket. For Amazon Athena, you can access databases in your AWS Glue Data Catalog as long as you have permissions through your Amazon Athena workgroup. For Amazon RDS, if you have the AmazonSageMakerCanvasFullAccess policy attached to your user’s role, then you’ll be able to import data from your Amazon RDS databases into Canvas. https://docs.aws.amazon.com/sagemaker/latest/dg/canvas-connecting-external.html

upvoted 3 times

Aja1

2 weeks, 3 days ago

https://aws.amazon.com/about-aws/whats-new/2024/04/amazon-sagemaker-studio-notebooks-data-sql-query/

upvoted 1 times

...

loict

8 months ago

Selected Answer: B

A. NO - SageMaker can only read from S3
B. YES - AWS Data Pipeline can move data from SQL Server to S3
C. NO - SageMaker can only read from S3 and not DynamoDB
D. NO - SageMaker can only read from S3 and not ElastiCache

upvoted 2 times

...

Mickey321

9 months, 2 weeks ago

Selected Answer: B

This approach is the most scalable and reliable way to train a model using data stored in Amazon RDS. Amazon S3 is a highly scalable and durable object storage service, and AWS Data Pipeline is a managed service that makes it easy to move data between different AWS services. By pushing the data to Amazon S3, the Specialist can ensure that the data is available for training the model even if the Amazon RDS instance is unavailable.

upvoted 1 times

...

Venkatesh_Babu

9 months, 3 weeks ago

Selected Answer: B

I think it should be B.

upvoted 1 times

...

Valcilio

1 year, 2 months ago

Selected Answer: B

It's B. Even if Microsoft SQL Server is a strange name for RDS, it's a possible database engine to use there, and the data for SageMaker needs to be in S3!

upvoted 1 times

...

AjoseO

1 year, 3 months ago

Selected Answer: B

With option B, the Specialist can use AWS Data Pipeline to automate the movement of data from Amazon RDS to Amazon S3. This allows for the creation of a reliable and scalable data pipeline that can handle large amounts of data and ensure the data is available for training. In the Amazon SageMaker notebook, the Specialist can then access the data stored in Amazon S3 and use it for training the model. Using Amazon S3 as the source of training data is a common and scalable approach, and it also provides durability and high availability of the data.
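The notebook side of this flow might look roughly like the sketch below, assuming the export has already landed in S3 as a CSV object. The helper names and the example URI are hypothetical, not part of the question; `boto3` is only imported when no client is passed in, and `get_object` is the standard S3 client call.

```python
import csv
import io
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def read_csv_from_s3(uri, s3_client=None):
    """Download a CSV object from S3 and return its rows as a list of dicts."""
    if s3_client is None:
        # Imported lazily so split_s3_uri stays usable without AWS libraries.
        import boto3
        s3_client = boto3.client("s3")
    bucket, key = split_s3_uri(uri)
    body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
    return list(csv.DictReader(io.StringIO(body.decode("utf-8"))))
```

In a notebook this would be called as `read_csv_from_s3("s3://my-bucket/exports/train.csv")`, where the bucket and key are placeholders for wherever the pipeline writes its output.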

upvoted 2 times

...

SophieSu

2 years, 6 months ago

B is the correct answer. Official AWS Documentation: "Amazon ML allows you to create a datasource object from data stored in a MySQL database in Amazon Relational Database Service (Amazon RDS). When you perform this action, Amazon ML creates an AWS Data Pipeline object that executes the SQL query that you specify, and places the output into an S3 bucket of your choice. Amazon ML uses that data to create the datasource."
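The "run a SQL query and place the output in an S3 bucket" step quoted above can be approximated by hand, which may help make the data flow concrete. This is a hand-rolled sketch, not what AWS Data Pipeline does internally; the function names, bucket, and key are hypothetical, and `put_object` is the standard `boto3` S3 client call.

```python
import csv
import io


def rows_to_csv_bytes(columns, rows):
    """Serialize query results (header plus rows) as UTF-8 CSV bytes."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue().encode("utf-8")


def upload_query_result(s3_client, bucket, key, columns, rows):
    """Write the query output to S3 and return its s3:// location."""
    s3_client.put_object(Bucket=bucket, Key=key, Body=rows_to_csv_bytes(columns, rows))
    return f"s3://{bucket}/{key}"
```

The returned location is what option B describes providing to the notebook.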

upvoted 2 times

...

cnethers

2 years, 6 months ago

While B is a valid answer, it is also possible to make a SQL connection in a notebook and create a data object, so A could be a valid answer too.
https://stackoverflow.com/questions/36021385/connecting-from-python-to-sql-server
https://www.mssqltips.com/sqlservertip/6120/data-exploration-with-python-and-sql-server-using-jupyter-notebooks/
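For illustration, a direct pull along the lines of option A might look like the sketch below. It uses an in-memory `sqlite3` database purely as a stand-in so the example runs anywhere; against RDS the connection would instead come from a database driver for the engine in use. The table and column names are made up.

```python
import sqlite3


def fetch_training_rows(conn, query):
    """Run a query over a DB-API connection and return (column_names, rows)."""
    cur = conn.cursor()
    cur.execute(query)
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    cur.close()
    return columns, rows


# Stand-in database with a hypothetical 'history' table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (feature REAL, label INTEGER)")
conn.executemany("INSERT INTO history VALUES (?, ?)", [(0.5, 1), (1.5, 0)])

columns, rows = fetch_training_rows(
    conn, "SELECT feature, label FROM history ORDER BY feature"
)
```

This works for exploration, but every row passes through the notebook's memory and the result stays in the notebook rather than at an S3 location, which is the trade-off the other comments point to.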

upvoted 2 times

gcpwhiz

2 years, 6 months ago

You need to choose the best answer, not just any valid answer. Often, many of the answers are valid solutions but are not best practice.

upvoted 2 times

...

scuzzy2010

2 years, 6 months ago

B is correct. MS SQL Server is also under RDS.

upvoted 2 times

...

roytruong

2 years, 6 months ago

B is right

upvoted 2 times

...

bhavesh0124

2 years, 7 months ago

B it is

upvoted 1 times

...

cybe001

2 years, 7 months ago

I'll go with B

upvoted 2 times

...
