Exam Certified Data Engineer Professional topic 1 question 77 discussion

Actual exam question from Databricks' Certified Data Engineer Professional
Question #: 77
Topic #: 1

In order to facilitate near real-time workloads, a data engineer is creating a helper function to leverage the schema detection and evolution functionality of Databricks Auto Loader. The desired function will automatically detect the schema of the source directory, incrementally process JSON files as they arrive in that directory, and automatically evolve the schema of the table when new fields are detected.

The function is displayed below with a blank:



Which response correctly fills in the blank to meet the specified requirements?

  • A.
  • B.
  • C.
  • D.
  • E.
Suggested Answer: E 🗳️
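For orientation, a helper in the spirit of the consensus answer E (a cloudFiles streaming read, a shared schema/checkpoint location, and a writeStream with mergeSchema) might look roughly like the sketch below. The function name, parameters, and the use of toTable are illustrative assumptions, not the exact text of any option.

# Hedged sketch only; names and paths are placeholders.
# `spark` is the SparkSession already available in a Databricks notebook.
def ingest_json_autoloader(source_dir, checkpoint_path, target_table):
    return (spark.readStream
            .format("cloudFiles")                                  # Auto Loader source
            .option("cloudFiles.format", "json")                   # incremental JSON ingestion
            .option("cloudFiles.schemaLocation", checkpoint_path)  # tracks the inferred schema
            .load(source_dir)
            .writeStream                                           # streaming write, not batch .write
            .option("checkpointLocation", checkpoint_path)         # fault tolerance / progress tracking
            .option("mergeSchema", "true")                         # add newly detected columns to the table
            .toTable(target_table))

Because the default micro-batch trigger keeps the query running, calling this once is enough to keep picking up new JSON files as they arrive, which is what makes it suitable for near real-time workloads.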

Comments

35fd6dd
3 months, 2 weeks ago
Selected Answer: E
write is for batch output, not Spark Structured Streaming; a streaming query needs writeStream.
upvoted 2 times
Freyr
6 months ago
Selected Answer: E
Reference: https://docs.databricks.com/en/ingestion/auto-loader/schema.html
writeStream provides the streaming write capability that near real-time workloads require. checkpointLocation is necessary for fault tolerance and progress tracking. mergeSchema enables automatic schema evolution, so newly detected columns are added to the target table.
Why option C is incorrect: it uses write instead of writeStream; write is for batch processing, which makes it inappropriate for real-time streaming.
Why option B is incorrect: although it includes checkpointLocation and mergeSchema, trigger(once=True) is unnecessary in this context and is better suited to batch-like processing.
upvoted 2 times
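To make the batch-like vs. near real-time distinction concrete, here is a hedged comparison; the paths, table names, and source directory below are made-up placeholders, and `spark` is assumed to be the notebook's SparkSession:

# Assumed Auto Loader read over a placeholder source directory.
stream_df = (spark.readStream
             .format("cloudFiles")
             .option("cloudFiles.format", "json")
             .option("cloudFiles.schemaLocation", "/tmp/schemas/events")
             .load("/tmp/raw/events"))

# Batch-like (option B in spirit): process the files available now, then stop.
(stream_df.writeStream
 .option("checkpointLocation", "/tmp/checkpoints/events_once")
 .trigger(once=True)
 .toTable("events_once"))

# Near real-time (option E in spirit): the default micro-batch trigger keeps the query
# running, picking up new JSON files as they land and evolving the table schema.
(stream_df.writeStream
 .option("checkpointLocation", "/tmp/checkpoints/events")
 .option("mergeSchema", "true")
 .toTable("events"))

In practice only one of the two writes would be started; they are shown together purely to contrast the triggers.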
vikram12apr
8 months, 3 weeks ago
Selected Answer: E
readStream and writeStream share the inferred schema through the checkpoint location, so cloudFiles.schemaLocation is typically set to the same path as checkpointLocation and the schema does not have to be specified manually. With mergeSchema set to true, any newly detected column is added to the target table. https://docs.databricks.com/en/ingestion/auto-loader/schema.html
upvoted 2 times
hal2401me
8 months, 3 weeks ago
Selected Answer: E
https://notebooks.databricks.com/demos/auto-loader/01-Auto-loader-schema-evolution-Ingestion.html
upvoted 2 times
aragorn_brego
1 year ago
Selected Answer: E
This response correctly fills in the blank to meet the specified requirements of using Databricks Auto Loader for automatic schema detection and evolution in a near real-time streaming context.
upvoted 1 times
AzureDE2522
1 year ago
Selected Answer: E
Please refer to: https://docs.databricks.com/en/ingestion/auto-loader/schema.html
upvoted 3 times
Dileepvikram
1 year ago
The question does not say to write as a stream; it says to process files incrementally, so option C looks correct to me.
upvoted 1 times
mouad_attaqi
1 year, 1 month ago
Selected Answer: E
The correct answer is E: it is a streaming write, and the default outputMode is append (so it is optional in this case).
upvoted 2 times
sturcu
1 year, 1 month ago
There is a typo in the statement. Is it schema or checkpoint? The provided answer is not correct. It has to be a writeStream, with append mode.
upvoted 1 times
Community vote distribution: A (35%), C (25%), B (20%), Other