A company is migrating a legacy application to an Amazon S3-based data lake. A data engineer reviewed the data associated with the legacy application and found that it contained some duplicate information.
The data engineer must identify and remove duplicate information from the legacy application data.
Which solution will meet these requirements with the LEAST operational overhead?
rralucard_ (Highly Voted, 1 year, 2 months ago)
_JP_ (Most Recent, 4 months, 1 week ago)
V0811 (8 months, 3 weeks ago)
GiorgioGss (1 year, 1 month ago)
Aesthet (1 year, 2 months ago)