Exam Professional Data Engineer topic 1 question 303 discussion

Actual exam question from Google's Professional Data Engineer
Question #: 303
Topic #: 1

You are managing a Dataplex environment with raw and curated zones. A data engineering team is uploading JSON and CSV files to a bucket asset in the curated zone but the files are not being automatically discovered by Dataplex. What should you do to ensure that the files are discovered by Dataplex?

  • A. Move the JSON and CSV files to the raw zone.
  • B. Enable auto-discovery of files for the curated zone.
  • C. Use the bq command-line tool to load the JSON and CSV files into BigQuery tables.
  • D. Grant object level access to the CSV and JSON files in Cloud Storage.
Suggested Answer: A 🗳️

Comments

GCP001
Highly Voted 10 months, 1 week ago
Selected Answer: A
Should be A. Curated zones need Parquet, Avro, or ORC format, not CSV or JSON. See the reference: https://cloud.google.com/dataplex/docs/add-zone#curated-zones
upvoted 22 times
...
raaad
Highly Voted 10 months, 3 weeks ago
Selected Answer: B
- Auto-Discovery Feature: Dataplex has an auto-discovery feature that, when enabled, automatically discovers and catalogs data assets within a zone.
- Appropriate for Both Raw and Curated Zones: This feature is applicable to both raw and curated zones, and it should be tailored to the specific data governance and cataloging needs of the organization.
upvoted 6 times
...
SamuelTsch
Most Recent 3 weeks ago
Selected Answer: A
Raw zones store structured data, semi-structured data such as CSV files and JSON files, and unstructured data in any format from external sources. Curated zones store structured data. Data can be stored in Cloud Storage buckets or BigQuery datasets. Supported formats for Cloud Storage buckets include Parquet, Avro, and ORC.
upvoted 1 times
...
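The format rule cited in the comments above can be sketched as a simple check on file extensions. This is only an illustration: the set of curated-zone formats is taken from the quoted documentation text, `zone_for` and the sample file names are hypothetical, and nothing here calls a Dataplex API.

```python
from pathlib import PurePosixPath

# Assumption: formats listed in the comments above for curated-zone
# Cloud Storage assets. This is not read from any Dataplex API.
CURATED_FORMATS = {".parquet", ".avro", ".orc"}


def zone_for(object_name: str) -> str:
    """Return the zone type whose format rule this object satisfies."""
    suffix = PurePosixPath(object_name).suffix.lower()
    return "curated" if suffix in CURATED_FORMATS else "raw"


# Hypothetical object names, for illustration only.
files = ["sales/2024/data.csv", "events/log.json", "orders/part-0001.parquet"]
placement = {name: zone_for(name) for name in files}
```

Under this reading, the CSV and JSON files in the question map to the raw zone, which is the reasoning behind answer A.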
rajnairds
3 months ago
Selected Answer: B
Discovery configuration

Discovery is enabled by default when you create a new zone or asset. You can disable Discovery at the zone or asset level.

For each Dataplex asset with Discovery enabled, Dataplex does the following:
- Scans the data associated with the asset.
- Groups structured and semi-structured files into tables.
- Collects technical metadata, such as table name, schema, and partition definition.

For unstructured data, such as images and videos, Dataplex Discovery automatically detects and registers groups of files sharing media type as filesets. For example, if gs://images/group1 contains GIF images, and gs://images/group2 contains JPEG images, Dataplex Discovery detects and registers two filesets.

For structured data, such as Avro, Discovery detects files only if they are located in folders that contain the same data format and schema.

Reference: https://cloud.google.com/dataplex/docs/discover-data#exclude-files-from-Discovery
upvoted 1 times
...
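The quoted rule that Discovery only detects structured files in folders holding a single data format can be checked locally before uploading. The following is a rough sketch under stated assumptions: `mixed_format_folders` and the object names are hypothetical, the file extension stands in for the data format, and the schema half of the rule is not checked.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def mixed_format_folders(object_names):
    """Return folders that contain more than one file extension."""
    formats = defaultdict(set)
    for name in object_names:
        path = PurePosixPath(name)
        # Assumption: the extension is a good enough proxy for the format.
        formats[str(path.parent)].add(path.suffix.lower())
    return {folder: exts for folder, exts in formats.items() if len(exts) > 1}


# Hypothetical bucket listing, for illustration only.
objects = [
    "orders/part-0.avro",
    "orders/part-1.avro",
    "customers/a.csv",
    "customers/b.json",
]
problems = mixed_format_folders(objects)
```

Here only the `customers` folder is reported, because it mixes CSV and JSON files under one prefix.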
hussain.sain
4 months, 3 weeks ago
Selected Answer: B
While JSON and CSV can technically be stored in curated zones, it is not a common practice due to the reasons mentioned above. Nowhere in the linked page does it say that there is a restriction.
upvoted 2 times
...
Anudeep58
5 months, 1 week ago
Selected Answer: A
While none of the original options (A, B, C, or D) directly addresses the issue, the closest solution is to move the JSON and CSV files to a raw zone. (This was previously marked as the most voted option, but it's not ideal because it disrupts data organization.) Here's why this approach might be necessary: Dataplex curated zones currently don't support native processing of JSON and CSV formats. They are designed for structured data formats like Parquet, Avro, or ORC.
upvoted 4 times
...
chrissamharris
6 months, 3 weeks ago
Selected Answer: A
Option A. Raw zones are the only zones that support CSV and JSON: https://cloud.google.com/dataplex/docs/add-zone#raw-zones
upvoted 1 times
...
joao_01
7 months, 2 weeks ago
It's B, guys. I encountered this at my job, and I had to do B to make it work.
upvoted 1 times
joao_01
7 months, 2 weeks ago
Actually I did this in a Raw zone, not Curated.
upvoted 1 times
joao_01
7 months, 2 weeks ago
It's A :)
upvoted 3 times
...
...
...
demoro86
8 months, 3 weeks ago
Selected Answer: A
I agree with GCP001.
upvoted 2 times
...
Moss2011
8 months, 3 weeks ago
Selected Answer: A
The answer can be found by reading about a common Dataplex configuration at this URL: https://medium.com/google-cloud/google-cloud-dataplex-part-1-lakes-zones-assets-and-discovery-5f288486cb2f
upvoted 2 times
...
kck6ra4214wm
8 months, 4 weeks ago
Selected Answer: A
Dataplex does not allow users to create CSV files within a “curated zone”
upvoted 1 times
...
daidai75
9 months ago
Selected Answer: B
According to this URL: https://cloud.google.com/dataplex/docs/discover-data, auto-discovery can support CSV and JSON in both raw and curated zones. I also opened the console to verify it: both raw and curated zones can set up CSV and JSON auto-discovery.
upvoted 2 times
...
dungct
9 months, 2 weeks ago
Selected Answer: B
Discovery raises the following administrator actions whenever data-related issues are detected during scans: Inconsistent data format in a table. For example, files of different formats exist with the same table prefix.
upvoted 3 times
dungct
9 months, 2 weeks ago
https://cloud.google.com/dataplex/docs/discover-data#invalid_data_format
upvoted 3 times
...
...
Matt_108
10 months, 2 weeks ago
Selected Answer: B
I'd go for option B. Auto-discovery is enabled by default for any zone, including curated ones, so if a file is not automatically discovered, it's because auto-discovery has been disabled.
upvoted 4 times
ML6
9 months, 1 week ago
In this case, it would be because of invalid data format in curated zones (data not in Avro, Parquet, or ORC formats).
upvoted 1 times
...
...
Sofiia98
10 months, 2 weeks ago
Selected Answer: A
I will go with A; check the reference. Curated zones only store Parquet, Avro, and ORC in Cloud Storage, and tables with a well-defined schema and Hive-style partitions in BigQuery: https://cloud.google.com/dataplex/docs/add-zone#curated-zones
upvoted 3 times
...
scaenruy
10 months, 3 weeks ago
Selected Answer: A
A. Move the JSON and CSV files to the raw zone.
upvoted 2 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted