Exam AWS Certified Machine Learning - Specialty topic 1 question 248 discussion

A data scientist is working on a forecast problem by using a dataset that consists of .csv files that are stored in Amazon S3. The files contain a timestamp variable in the following format:


March 1st, 2020, 08:14pm

There is a hypothesis about seasonal differences in the dependent variable. This number could be higher or lower for weekdays because some days and hours present varying values, so the day of the week, month, or hour could be an important factor. As a result, the data scientist needs to transform the timestamp into weekdays, month, and day as three separate variables to conduct an analysis.

Which solution requires the LEAST operational overhead to create a new dataset with the added features?

  • A. Create an Amazon EMR cluster. Develop PySpark code that can read the timestamp variable as a string, transform and create the new variables, and save the dataset as a new file in Amazon S3.
  • B. Create a processing job in Amazon SageMaker. Develop Python code that can read the timestamp variable as a string, transform and create the new variables, and save the dataset as a new file in Amazon S3.
  • C. Create a new flow in Amazon SageMaker Data Wrangler. Import the S3 file, use the Featurize date/time transform to generate the new variables, and save the dataset as a new file in Amazon S3.
  • D. Create an AWS Glue job. Develop code that can read the timestamp variable as a string, transform and create the new variables, and save the dataset as a new file in Amazon S3.
Suggested Answer: C
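
For comparison, the code-based options (A, B, and D) would each need custom parsing logic roughly like the following minimal Python sketch. It is only an illustration using the standard library: the function name `timestamp_features` and the output field names are hypothetical, and it assumes every timestamp follows exactly the format shown in the question. Reading the .csv files from Amazon S3 and writing the result back are left out.

```python
import re
from datetime import datetime

# Day names indexed by datetime.weekday() (Monday == 0), hard-coded so the
# result does not depend on the machine's locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def timestamp_features(raw: str) -> dict:
    """Split a timestamp such as 'March 1st, 2020, 08:14pm' into
    weekday, month, and day (hypothetical helper, assumed input format)."""
    # Drop the ordinal suffix (1st -> 1, 22nd -> 22) so strptime can parse the day.
    cleaned = re.sub(r"(\d+)(st|nd|rd|th)", r"\1", raw.strip())
    parsed = datetime.strptime(cleaned, "%B %d, %Y, %I:%M%p")
    return {
        "weekday": WEEKDAYS[parsed.weekday()],
        "month": parsed.month,
        "day": parsed.day,
    }


print(timestamp_features("March 1st, 2020, 08:14pm"))
```

Even this small function has to be written, tested, and deployed on infrastructure that someone must configure, which is the overhead that option C avoids by relying on a built-in transform.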

Comments

vkbajoria
7 months ago
Selected Answer: C
Data Wrangler can transform the date/time into the desired features.
upvoted 1 times
...
Mickey321
1 year, 2 months ago
Selected Answer: C
Amazon SageMaker Data Wrangler is a visual data preparation tool that makes it easy to clean, transform, and featurize data for machine learning. It provides a variety of built-in transformations, including the Featurize date/time transform, which can be used to generate the new variables from the timestamp variable. The other options require the data scientist to develop code, which can be more time-consuming and error-prone. Amazon EMR and AWS Glue can both run batch jobs written in Python or PySpark. EMR requires the data scientist to create and manage a cluster, which can be a significant operational overhead; Glue is serverless, but the job script still has to be written and maintained. Amazon SageMaker Processing also runs Python code on managed infrastructure, but it likewise requires custom code and does not provide the same level of visual tooling as Data Wrangler.
upvoted 3 times
...
kaike_reis
1 year, 2 months ago
Selected Answer: C
Option C is correct because Data Wrangler offers a low-code way to perform this task, and since we want the least operational overhead, this is the solution. Option D is also possible, but it involves developing code, which makes it more complex than option C. Option A requires standing up a new service, and option B falls into the same scenario as option D (developing code).
upvoted 2 times
...
asdfzxc
1 year, 4 months ago
Selected Answer: C
https://aws.amazon.com/blogs/machine-learning/prepare-time-series-data-with-amazon-sagemaker-data-wrangler/ "Featurize datetime time series transformation to add the month, day of the month, day of the year, week of the year, and quarter features to our dataset."
upvoted 3 times
...