Exam Certified Data Engineer Professional topic 1 question 108 discussion

Actual exam question from Databricks's Certified Data Engineer Professional
Question #: 108
Topic #: 1

Which statement describes the default execution mode for Databricks Auto Loader?

  • A. Cloud vendor-specific queue storage and notification services are configured to track newly arriving files; the target table is materialized by directly querying all valid files in the source directory.
  • B. New files are identified by listing the input directory; the target table is materialized by directly querying all valid files in the source directory.
  • C. Webhooks trigger a Databricks job to run anytime new data arrives in a source directory; new data are automatically merged into target tables using rules inferred from the data.
  • D. New files are identified by listing the input directory; new files are incrementally and idempotently loaded into the target Delta Lake table.
  • E. Cloud vendor-specific queue storage and notification services are configured to track newly arriving files; new files are incrementally and idempotently loaded into the target Delta Lake table.
Suggested Answer: D
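
To make the wording of option D concrete, here is a toy model in plain Python of "identify new files by listing the input directory, then load them incrementally and idempotently". It is only a sketch of the semantics, not Auto Loader's implementation: the function names, the in-memory `seen` set (standing in for checkpointed state), and the list used as a target table are all hypothetical.

```python
import os
import tempfile


def list_new_files(source_dir, seen):
    """Directory listing: return files in source_dir not yet recorded in seen."""
    return sorted(
        name for name in os.listdir(source_dir)
        if name not in seen and os.path.isfile(os.path.join(source_dir, name))
    )


def run_batch(source_dir, seen, target):
    """Load each unseen file once, then record it so a re-run skips it."""
    new_files = list_new_files(source_dir, seen)
    for name in new_files:
        with open(os.path.join(source_dir, name)) as fh:
            target.extend(line.strip() for line in fh if line.strip())
        seen.add(name)  # hypothetical stand-in for persisted checkpoint state
    return new_files


def demo():
    """Hypothetical walk-through: two files, a re-run, then one new arrival."""
    seen, target = set(), []
    with tempfile.TemporaryDirectory() as src:
        for name, body in [("a.csv", "1\n2\n"), ("b.csv", "3\n")]:
            with open(os.path.join(src, name), "w") as fh:
                fh.write(body)
        first = run_batch(src, seen, target)   # picks up both files
        second = run_batch(src, seen, target)  # nothing new, nothing reloaded
        with open(os.path.join(src, "c.csv"), "w") as fh:
            fh.write("4\n")
        third = run_batch(src, seen, target)   # picks up only the new file
    return first, second, third, target
```

Running the same batch twice leaves the target unchanged, which is the idempotent part; only files that appeared since the last listing are read, which is the incremental part.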

Comments

vctrhugo
Highly Voted 9 months, 3 weeks ago
Selected Answer: D
"Auto Loader uses directory listing mode by default. In directory listing mode, Auto Loader identifies new files by listing the input directory." https://learn.microsoft.com/en-us/azure/databricks/ingestion/auto-loader/directory-listing-mode
upvoted 6 times
Rinscy
Most Recent 10 months ago
D, definitely! Auto Loader is an optimized file source that overcomes all the above limitations and provides a seamless way for data teams to load raw data at low cost and latency with minimal DevOps effort. You just need to provide a source directory path and start a streaming job. The new Structured Streaming source, called "cloudFiles", will automatically set up file notification services that subscribe to file events from the input directory and process new files as they arrive, with the option of also processing existing files in that directory.
upvoted 2 times
csrazdan
2 months, 3 weeks ago
The correct answer is D. However, listing the input directory is Auto Loader's default way of identifying new files. Cloud-native notification services can be used, but they are not the default setting for Auto Loader.
upvoted 1 times
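
As a rough illustration of that point, the sketch below builds Auto Loader options in which file notification mode is something you opt into. The option keys `cloudFiles.format`, `cloudFiles.schemaLocation`, and `cloudFiles.useNotifications` are the documented names; the helper functions, their parameters, and any paths or table names passed to them are hypothetical, and `start_ingest` assumes it is given an existing SparkSession on a Databricks cluster.

```python
def auto_loader_options(file_format, schema_location, use_notifications=False):
    """Build cloudFiles options; use_notifications=False keeps directory listing."""
    options = {
        "cloudFiles.format": file_format,
        "cloudFiles.schemaLocation": schema_location,
    }
    if use_notifications:
        # Opt in to file notification mode (cloud queue and notification services).
        options["cloudFiles.useNotifications"] = "true"
    return options


def start_ingest(spark, source_path, target_table, checkpoint_path, options):
    """Start an Auto Loader stream; `spark` is an existing Databricks SparkSession."""
    reader = spark.readStream.format("cloudFiles")
    for key, value in options.items():
        reader = reader.option(key, value)
    return (
        reader.load(source_path)
        .writeStream
        .option("checkpointLocation", checkpoint_path)  # tracks ingestion progress
        .trigger(availableNow=True)  # process what is available, then stop
        .toTable(target_table)
    )
```

Leaving `use_notifications` at its default means no notification option is set at all, so the stream falls back to directory listing, as described above.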
ranith
10 months ago
https://docs.databricks.com/en/ingestion/auto-loader/options.html#:~:text=By%20default%2C%20Auto%20Loader%20makes,as%20true%20or%20false%20respectively. Selected answer: D
upvoted 1 times
get_certified9
10 months ago
D is the answer. The default execution mode for Databricks Auto Loader is directory listing mode.
upvoted 1 times
spaceexplorer
10 months ago
Selected Answer: E
E is the answer
upvoted 1 times
spaceexplorer
10 months ago
https://www.databricks.com/blog/2020/02/24/introducing-databricks-ingest-easy-data-ingestion-into-delta-lake.html
upvoted 2 times
Isio05
5 months, 2 weeks ago
Surely it's not a vendor-specific solution.
upvoted 1 times