Databricks Certified Data Engineer Professional Exam Actual Questions

The questions for Certified Data Engineer Professional were last updated on Nov. 11, 2024.

Topic 1 - Exam A

Question #1 Topic 1

An upstream system has been configured to pass the date for a given batch of data to the Databricks Jobs API as a parameter. The notebook to be scheduled will use this parameter to load data with the following code:
    df = spark.read.format("parquet").load(f"/mnt/source/{date}")
Which code block should be used to create the date Python variable used in the above code block?

  • A. date = spark.conf.get("date")
  • B. input_dict = input()
    date = input_dict["date"]
  • C. import sys
    date = sys.argv[1]
  • D. date = dbutils.notebooks.getParam("date")
  • E. dbutils.widgets.text("date", "null")
    date = dbutils.widgets.get("date")

Correct Answer: E

Question #2 Topic 1

The Databricks workspace administrator has configured interactive clusters for each of the data engineering groups. To control costs, clusters are set to terminate after 30 minutes of inactivity. Each user should be able to execute workloads against their assigned clusters at any time of the day.
Assuming users have been added to a workspace but not granted any permissions, which of the following describes the minimal permissions a user would need to start and attach to an already configured cluster?

  • A. "Can Manage" privileges on the required cluster
  • B. Workspace Admin privileges, cluster creation allowed, "Can Attach To" privileges on the required cluster
  • C. Cluster creation allowed, "Can Attach To" privileges on the required cluster
  • D. "Can Restart" privileges on the required cluster
  • E. Cluster creation allowed, "Can Restart" privileges on the required cluster

Correct Answer: D
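
The reasoning behind this answer can be sketched as a lookup over the three cluster permission levels. The table below is a simplification that tracks only the two abilities the question asks about (attaching to a cluster and starting it); the names `CLUSTER_PERMISSIONS` and `minimal_level` are hypothetical helpers, not part of any Databricks API.

```python
# Simplified view of cluster permission levels, lowest to highest.
# Only the two abilities relevant to the question are modeled.
LEVELS_LOW_TO_HIGH = ["Can Attach To", "Can Restart", "Can Manage"]

CLUSTER_PERMISSIONS = {
    "Can Attach To": {"attach"},
    "Can Restart": {"attach", "start"},
    "Can Manage": {"attach", "start"},
}


def minimal_level(required):
    """Return the lowest permission level that covers every required ability."""
    for level in LEVELS_LOW_TO_HIGH:
        if required <= CLUSTER_PERMISSIONS[level]:
            return level
    return None


needed = minimal_level({"start", "attach"})
print(needed)
```

"Can Attach To" alone does not let a user start a terminated cluster, which matters here because the clusters shut down after 30 minutes of inactivity.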

Question #3 Topic 1

When scheduling Structured Streaming jobs for production, which configuration automatically recovers from query failures and keeps costs low?

  • A. Cluster: New Job Cluster;
    Retries: Unlimited;
    Maximum Concurrent Runs: Unlimited
  • B. Cluster: New Job Cluster;
    Retries: None;
    Maximum Concurrent Runs: 1
  • C. Cluster: Existing All-Purpose Cluster;
    Retries: Unlimited;
    Maximum Concurrent Runs: 1
  • D. Cluster: New Job Cluster;
    Retries: Unlimited;
    Maximum Concurrent Runs: 1
  • E. Cluster: Existing All-Purpose Cluster;
    Retries: None;
    Maximum Concurrent Runs: 1

Correct Answer: D
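
As a rough illustration, the configuration in option D might be expressed as a Jobs API settings payload like the one below. The field names follow the Jobs API 2.1 as I understand it, where `max_retries` set to -1 is taken to mean "retry indefinitely"; the job name, notebook path, runtime version, node type, and worker count are all made-up placeholder values.

```python
import json

# Sketch of job settings matching option D. All concrete values are
# hypothetical placeholders; only the shape of the payload is the point.
job_settings = {
    "name": "streaming-ingest",            # hypothetical job name
    "max_concurrent_runs": 1,              # never run two copies of the stream
    "tasks": [
        {
            "task_key": "run_stream",
            "notebook_task": {"notebook_path": "/Jobs/stream_ingest"},
            # A new job cluster is created per run and billed at job rates.
            "new_cluster": {
                "spark_version": "14.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2,
            },
            "max_retries": -1,             # assumed: -1 retries indefinitely
        }
    ],
}

print(json.dumps(job_settings, indent=2))
```

Limiting concurrent runs to 1 keeps a retried run from overlapping with a still-active stream, and a job cluster avoids the higher cost of an always-on all-purpose cluster.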

Question #4 Topic 1

The data engineering team has configured a Databricks SQL query and alert to monitor the values in a Delta Lake table. The recent_sensor_recordings table contains an identifying sensor_id alongside the timestamp and temperature for the most recent 5 minutes of recordings.
The query below is used to create the alert:

The query is set to refresh each minute and always completes in less than 10 seconds. The alert is set to trigger when mean(temperature) > 120. Notifications are sent at most once per minute.
If this alert raises notifications for 3 consecutive minutes and then stops, which statement must be true?

  • A. The total average temperature across all sensors exceeded 120 on three consecutive executions of the query
  • B. The recent_sensor_recordings table was unresponsive for three consecutive runs of the query
  • C. The source query failed to update properly for three consecutive minutes and then restarted
  • D. The maximum temperature recording for at least one sensor exceeded 120 on three consecutive executions of the query
  • E. The average temperature recordings for at least one sensor exceeded 120 on three consecutive executions of the query

Correct Answer: E
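
The alert query itself is not reproduced on this page, so the sketch below rests on an assumption: that the query computes mean(temperature) grouped by sensor_id, which is what answer E implies. Under that assumption the alert fires when any single sensor's mean exceeds the threshold, even if the average across all sensors does not. The sample rows and helper names are invented for illustration.

```python
from statistics import mean

THRESHOLD = 120


def per_sensor_means(rows):
    """Group (sensor_id, temperature) rows and average each group."""
    by_sensor = {}
    for sensor_id, temperature in rows:
        by_sensor.setdefault(sensor_id, []).append(temperature)
    return {sensor: mean(temps) for sensor, temps in by_sensor.items()}


def alert_fires(rows):
    """Assumed alert rule: at least one per-sensor mean is above the threshold."""
    return any(m > THRESHOLD for m in per_sensor_means(rows).values())


# Invented sample: sensor "a" runs hot, sensor "b" does not.
rows = [("a", 130), ("a", 125), ("b", 60), ("b", 70)]

overall_mean = mean(temperature for _, temperature in rows)
print(per_sensor_means(rows), overall_mean, alert_fires(rows))
```

In this sample the overall mean stays below 120 while sensor "a" is above it, so the alert fires without statement A being true.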
