Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 28 discussion

A company uses an Amazon QuickSight dashboard to monitor usage of one of the company's applications. The company uses AWS Glue jobs to process data for the dashboard. The company stores the data in a single Amazon S3 bucket. The company adds new data every day.
A data engineer discovers that dashboard queries are becoming slower over time. The data engineer determines that the root cause of the slowing queries is long-running AWS Glue jobs.
Which actions should the data engineer take to improve the performance of the AWS Glue jobs? (Choose two.)

  • A. Partition the data that is in the S3 bucket. Organize the data by year, month, and day.
  • B. Increase the AWS Glue instance size by scaling up the worker type.
  • C. Convert the AWS Glue schema to the DynamicFrame schema class.
  • D. Adjust AWS Glue job scheduling frequency so the jobs run half as many times each day.
  • E. Modify the IAM role that grants access to AWS Glue to grant access to all S3 features.
Suggested Answer: AB

Comments

rralucard_
Highly Voted 8 months, 4 weeks ago
Selected Answer: AB
A. Partition the data that is in the S3 bucket. Organize the data by year, month, and day.
  • Partitioning data in Amazon S3 can significantly improve query performance. By organizing the data by year, month, and day, AWS Glue and Amazon QuickSight can scan only the relevant partitions of data, which reduces the amount of data read and processed. This approach is particularly effective for time-series data, where queries often target specific time ranges.

B. Increase the AWS Glue instance size by scaling up the worker type.
  • Scaling up the worker type can provide more computational resources to the AWS Glue jobs, enabling them to process data faster. This can be especially beneficial when dealing with large datasets or complex transformations. It’s important to monitor the performance improvements and cost implications of scaling up.
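To make the year/month/day layout from option A concrete, here is a minimal sketch of how Hive-style partition prefixes could be generated for daily data. The bucket name and base prefix are hypothetical, and the `year=/month=/day=` key naming is an assumption about how the data would be organized, not something the question specifies.

```python
from datetime import date, timedelta


def partition_prefix(base: str, day: date) -> str:
    """Return a Hive-style year/month/day prefix for one day of data."""
    return f"{base.rstrip('/')}/year={day:%Y}/month={day:%m}/day={day:%d}/"


def prefixes_for_range(base: str, start: date, end: date) -> list[str]:
    """List the prefixes covering an inclusive date range."""
    days = (end - start).days
    return [partition_prefix(base, start + timedelta(days=i)) for i in range(days + 1)]


# Hypothetical bucket and prefix, used only for illustration.
for prefix in prefixes_for_range("s3://example-bucket/usage", date(2024, 1, 30), date(2024, 2, 2)):
    print(prefix)
```

With a layout like this, a job that only needs a few days of data can be pointed at a handful of prefixes instead of the whole bucket.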
upvoted 10 times
MLOPS_eng
4 months ago
How does partitioning data in S3 improve the performance of AWS Glue jobs? Partitioning data in S3 improves query performance, but the question asks which actions the data engineer should take to improve the performance of the AWS Glue jobs!
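One possible answer is partition pruning inside the Glue job itself: when the table is partitioned, a job can pass a push-down predicate so that only the matching partitions are listed and read. The sketch below only builds the predicate string; it assumes string-typed partition columns named `year`, `month`, and `day`, and the database and table names in the commented-out call are hypothetical.

```python
from datetime import date


def push_down_predicate(day: date) -> str:
    """Build a partition filter for one day.

    Assumes string-typed partition columns named year, month, and day.
    """
    return f"year=='{day:%Y}' and month=='{day:%m}' and day=='{day:%d}'"


predicate = push_down_predicate(date(2024, 1, 5))

# Inside a Glue job, the predicate would be passed when reading the table.
# The database and table names below are hypothetical:
#
# frame = glueContext.create_dynamic_frame.from_catalog(
#     database="usage_db",
#     table_name="usage_events",
#     push_down_predicate=predicate,
# )
```

Without partitions there is nothing for such a predicate to prune, so the job would have to read every object in the bucket on each run, which would explain run times that grow as data is added daily.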
upvoted 1 times
Leo87656789
6 months, 2 weeks ago
I would also go for A, B. But there are no worker types in AWS Glue. You can only increase the DPU.
upvoted 1 times
DevoteamAnalytix
5 months, 3 weeks ago
Here you can find 5 different Worker types: https://docs.aws.amazon.com/glue/latest/dg/add-job.html
upvoted 2 times
tgv
5 months ago
It looks like there are various worker types in AWS Glue actually. I'll go with AB as well. "With AWS Glue, you only pay for the time your ETL job takes to run. There are no resources to manage, no upfront costs, and you are not charged for startup or shutdown time. You are charged an hourly rate based on the number of Data Processing Units (or DPUs) used to run your ETL job. A single Data Processing Unit (DPU) is also referred to as a worker. AWS Glue comes with three worker types to help you select the configuration that meets your job latency and cost requirements. Workers come in Standard, G.1X, G.2X, and G.025X configurations." https://docs.aws.amazon.com/glue/latest/dg/components-key-concepts.html
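As a rough illustration of the DPU-based billing described in that quote, the sketch below estimates DPU-hours for a job run. The DPU weights for G.025X, G.1X, and G.2X reflect the AWS Glue documentation as I understand it and should be checked against the current docs; the hourly rate is a caller-supplied assumption, not a quoted price.

```python
# DPU weight per worker type (assumption: verify against current AWS Glue docs).
DPU_PER_WORKER = {"G.025X": 0.25, "G.1X": 1.0, "G.2X": 2.0}


def dpu_hours(worker_type: str, num_workers: int, runtime_minutes: float) -> float:
    """Estimate DPU-hours consumed by one job run."""
    return DPU_PER_WORKER[worker_type] * num_workers * runtime_minutes / 60.0


def estimated_cost(dpu_hrs: float, rate_per_dpu_hour: float) -> float:
    """Multiply DPU-hours by a caller-supplied hourly rate."""
    return dpu_hrs * rate_per_dpu_hour


# Scaling up from G.1X to G.2X doubles the DPU weight per worker. If the
# run time halves as a result, the DPU-hours stay the same.
before = dpu_hours("G.1X", num_workers=10, runtime_minutes=60)
after = dpu_hours("G.2X", num_workers=10, runtime_minutes=30)
print(before, after)
```

Whether a larger worker type actually halves the run time depends on the job, so this is a way to compare scenarios rather than a prediction.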
upvoted 2 times
certplan
Most Recent 7 months, 1 week ago
1. **Partition the Data in Amazon S3**:
- AWS documentation on optimizing Amazon S3 performance: https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html
- AWS Glue documentation on partitioning data for AWS Glue jobs: https://docs.aws.amazon.com/glue/latest/dg/how-it-works.html#how-partitioning-works
- Best practices for partitioning in Amazon S3: https://docs.aws.amazon.com/AmazonS3/latest/userguide/best-practices-partitioning.html

2. **Optimizing AWS Glue Job Settings**:
- AWS Glue documentation on optimizing job performance: https://docs.aws.amazon.com/glue/latest/dg/best-practices.html
- AWS Glue documentation on scaling AWS Glue job resources: https://docs.aws.amazon.com/glue/latest/dg/monitor-profile-glue-job-cloudwatch-metrics.html

By referring to these documentation resources, the data engineer can gain insights into best practices and recommendations provided by AWS for optimizing AWS Glue jobs, thereby justifying the suggested actions to address the issue of slowing job performance.
upvoted 2 times
Community vote distribution: A (35%), C (25%), B (20%), Other