Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 148 discussion

Exam question from Amazon's AWS Certified Data Engineer - Associate DEA-C01

Question #: 148
Topic #: 1

An investment company needs to manage and extract insights from a volume of semi-structured data that grows continuously.

A data engineer needs to deduplicate the semi-structured data, remove records that are duplicates, and remove common misspellings of duplicates.

Which solution will meet these requirements with the LEAST operational overhead?

A. Use the FindMatches feature of AWS Glue to remove duplicate records.
B. Use non-Windows functions in Amazon Athena to remove duplicate records.
C. Use Amazon Neptune ML and an Apache Gremlin script to remove duplicate records.
D. Use the global tables feature of Amazon DynamoDB to prevent duplicate data.

Suggested Answer: A

by Fawk at Sept. 19, 2024, 2:04 a.m.

Comments

italiancloud2025

2 months, 1 week ago

Selected Answer: A

A: Yes, because AWS Glue FindMatches uses machine learning to deduplicate data, including records that differ only by misspellings, with minimal operational overhead. B: No, using Athena requires writing queries by hand, and it does not handle spelling variations well. C: No, Neptune ML is aimed at graph analytics, not at deduplicating semi-structured data. D: No, DynamoDB global tables are used for replication, not for removing duplicates.
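To make option A concrete, here is a minimal sketch of how a FindMatches transform might be defined through boto3's `create_ml_transform` call. The database, table, role ARN, and primary key column are hypothetical placeholders, and the tradeoff values of 0.5 are arbitrary midpoints rather than recommendations. A transform created this way would still need to be trained on labeled examples before a Glue job can use it to flag duplicates.

```python
def build_find_matches_request(database, table, role_arn, primary_key):
    """Build keyword arguments for Glue's CreateMLTransform API (FindMatches)."""
    return {
        "Name": f"dedupe-{table}",
        "Role": role_arn,
        # Source table registered in the Glue Data Catalog.
        "InputRecordTables": [{"DatabaseName": database, "TableName": table}],
        "Parameters": {
            "TransformType": "FIND_MATCHES",
            "FindMatchesParameters": {
                # Column that uniquely identifies each input record.
                "PrimaryKeyColumnName": primary_key,
                # 0.0 favors recall, 1.0 favors precision (assumed midpoint).
                "PrecisionRecallTradeoff": 0.5,
                # 0.0 favors lower cost, 1.0 favors accuracy (assumed midpoint).
                "AccuracyCostTradeoff": 0.5,
                "EnforceProvidedLabels": False,
            },
        },
    }


def create_transform(request):
    """Send the request to AWS Glue; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    glue = boto3.client("glue")
    return glue.create_ml_transform(**request)["TransformId"]


# Hypothetical names, for illustration only.
request = build_find_matches_request(
    database="investments_db",
    table="client_records",
    role_arn="arn:aws:iam::123456789012:role/GlueFindMatchesRole",
    primary_key="record_id",
)
# transform_id = create_transform(request)  # uncomment to call AWS
```

An exact-match approach such as `SELECT DISTINCT` in Athena would treat "Jon Smith" and "John Smith" as different records, which is why the thread favors the ML-based matcher.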

upvoted 1 time

Fawk

7 months, 1 week ago

Selected Answer: A

A - The other options are dumb and hardly make sense

upvoted 2 times
