Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 108 discussion

A data engineer needs to debug an AWS Glue job that reads from Amazon S3 and writes to Amazon Redshift. The data engineer enabled the bookmark feature for the AWS Glue job.
The data engineer has set the maximum concurrency for the AWS Glue job to 1.

The AWS Glue job is successfully writing the output to Amazon Redshift. However, the Amazon S3 files that were loaded during previous runs of the AWS Glue job are being reprocessed by subsequent runs.

What is the likely reason the AWS Glue job is reprocessing the files?

  • A. The AWS Glue job does not have the s3:GetObjectAcl permission that is required for bookmarks to work correctly.
  • B. The maximum concurrency for the AWS Glue job is set to 1.
  • C. The data engineer incorrectly specified an older version of AWS Glue for the Glue job.
  • D. The AWS Glue job does not have a required commit statement.
Suggested Answer: A

Comments

lool
Highly Voted 2 months, 2 weeks ago
Selected Answer: D
https://docs.aws.amazon.com/glue/latest/dg/glue-troubleshooting-errors.html#error-job-bookmarks-reprocess-data
upvoted 5 times
...
azure_bimonster
Most Recent 1 day, 3 hours ago
Selected Answer: A
I would go with option A.
upvoted 1 times
...
EJGisME
1 week, 5 days ago
Selected Answer: A
A. The AWS Glue job does not have the s3:GetObjectAcl permission that is required for bookmarks to work correctly.
upvoted 1 times
...
mzansikiller
1 month ago
Selected Answer: A
Answer A. This is a job bookmark permissions issue.
upvoted 1 times
...
antun3ra
1 month, 1 week ago
Selected Answer: A
For AWS Glue bookmarks to function correctly, the job needs the necessary permissions to read and write bookmark data, including the s3:GetObjectAcl permission. If these permissions are not correctly set, the job may not be able to track which files have already been processed, leading to reprocessing of previously processed files.
upvoted 4 times
...
andrologin
2 months ago
Selected Answer: D
An AWS Glue job requires the commit statement to save the state of the last successful run.
upvoted 1 times
...
HunkyBunky
2 months, 2 weeks ago
Selected Answer: D
To me, D looks correct.
upvoted 2 times
...
Alagong
2 months, 2 weeks ago
Selected Answer: A
The commit statement (Option D) is not required for AWS Glue jobs. AWS Glue commits any open transactions to the database when all the script statements finish running.
upvoted 3 times
andrologin
2 months ago
It is the commit statement that ensures AWS Glue saves the state of the last successful run.
upvoted 1 times
...
HunkyBunky
2 months, 2 weeks ago
I haven't found any information saying that s3:GetObjectAcl is necessary for Glue bookmarks, so I'm pretty sure that A is wrong.
upvoted 1 times
...
...
Bmaster
2 months, 3 weeks ago
D is good: https://docs.aws.amazon.com/glue/latest/dg/glue-troubleshooting-errors.html#error-job-bookmarks-reprocess-data
upvoted 4 times
...
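The troubleshooting page linked in the comments above lists a missing commit among the causes of reprocessing: a Glue script is expected to end with job.commit() so that the bookmark state is saved. The following is a minimal sketch of why that matters, written as a toy model in plain Python. It does not use the awsglue library; BookmarkStore, ToyJob, and the example-bucket paths are hypothetical names invented for illustration, and the real service tracks more than a set of file names.

```python
from dataclasses import dataclass, field


@dataclass
class BookmarkStore:
    """Hypothetical stand-in for the persisted bookmark state of a job."""
    processed: set = field(default_factory=set)


class ToyJob:
    """Toy model of one bookmarked job run (not the awsglue API)."""

    def __init__(self, store):
        self.store = store
        self.pending = set()

    def run(self, available_files):
        # Pick up only the files the persisted bookmark has not recorded yet.
        new_files = sorted(set(available_files) - self.store.processed)
        self.pending = set(new_files)
        return new_files

    def commit(self):
        # Persist this run's progress so that later runs can skip these files.
        self.store.processed |= self.pending
        self.pending = set()


# Hypothetical input paths, for illustration only.
files = ["s3://example-bucket/a.csv", "s3://example-bucket/b.csv"]

# Case 1: the run never commits, so the next run sees the same files again.
store = BookmarkStore()
ToyJob(store).run(files)
second_without_commit = ToyJob(store).run(files)

# Case 2: the run commits, so the next run finds nothing new to process.
store = BookmarkStore()
job = ToyJob(store)
job.run(files)
job.commit()
second_with_commit = ToyJob(store).run(files)

print("second run without commit:", second_without_commit)
print("second run with commit:", second_with_commit)
```

In this model the output is still written on every run, which matches the symptom in the question: the load into Amazon Redshift succeeds, but progress is never recorded, so earlier files are picked up again.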
Community vote distribution
A (35%)
C (25%)
B (20%)
Other