Exam AWS Certified Machine Learning Engineer - Associate MLA-C01 topic 1 question 87 discussion

A company has an Amazon S3 bucket that contains 1 TB of files from different sources. The S3 bucket contains the following file types in the same S3 folder: CSV, JSON, XLSX, and Apache Parquet.

An ML engineer must implement a solution that uses AWS Glue DataBrew to process the data. The ML engineer also must store the final output in Amazon S3 so that AWS Glue can consume the output in the future.

Which solution will meet these requirements?

  • A. Use DataBrew to process the existing S3 folder. Store the output in Apache Parquet format.
  • B. Use DataBrew to process the existing S3 folder. Store the output in AWS Glue Parquet format.
  • C. Separate the data into a different folder for each file type. Use DataBrew to process each folder individually. Store the output in Apache Parquet format.
  • D. Separate the data into a different folder for each file type. Use DataBrew to process each folder individually. Store the output in AWS Glue Parquet format.
Suggested Answer: C 🗳️
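
The folder separation step that options C and D describe could be sketched roughly as follows. This is only an illustration of grouping object keys by file extension. The `by-format/...` prefixes, the `FORMAT_PREFIXES` table, and the `plan_moves` helper are hypothetical names, not part of the question or of any AWS API, and the sketch only plans the moves without touching S3.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to a per-format destination prefix.
FORMAT_PREFIXES = {
    ".csv": "by-format/csv/",
    ".json": "by-format/json/",
    ".xlsx": "by-format/xlsx/",
    ".parquet": "by-format/parquet/",
}


def plan_moves(keys):
    """Group S3 object keys by extension.

    Returns {destination_prefix: [(source_key, destination_key), ...]}.
    Keys with an unrecognized extension are skipped.
    """
    moves = defaultdict(list)
    for key in keys:
        path = PurePosixPath(key)
        prefix = FORMAT_PREFIXES.get(path.suffix.lower())
        if prefix is None:
            continue  # Unknown type: leave the object where it is.
        moves[prefix].append((key, prefix + path.name))
    return dict(moves)


if __name__ == "__main__":
    # Sample keys are made up for illustration.
    sample = ["raw/a.csv", "raw/b.json", "raw/c.xlsx", "raw/d.parquet"]
    for prefix, pairs in plan_moves(sample).items():
        print(prefix, pairs)
```

In practice the planned pairs would be applied with S3 copy operations (for example the boto3 S3 client's `copy_object`), and each resulting prefix would then be registered as its own DataBrew dataset.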

Comments

ryuhei
3 days, 14 hours ago
Selected Answer: C
AWS Glue DataBrew can process various file formats (CSV, JSON, XLSX, Parquet).
Since DataBrew can handle datasets with multiple file formats, there is no need to separate files into different folders by type.

Apache Parquet is an optimal format for AWS Glue.
Parquet is a columnar format, which is well-suited for AWS Glue and is efficient for later analysis and ML model training.

"AWS Glue Parquet format" does not exist.
Options B and D mention "AWS Glue Parquet format," which is incorrect. Parquet is a standard data format and is not exclusive to AWS Glue.

✅ Conclusion: Option A is the best solution because it allows DataBrew to process all files in the existing S3 folder and store the output in Apache Parquet format, which is efficient and compatible with AWS Glue. 🚀
upvoted 2 times
Community vote distribution
A (35%)
C (25%)
B (20%)
Other