Exam Certified Data Engineer Professional topic 1 question 61 discussion

Actual exam question from Databricks's Certified Data Engineer Professional
Question #: 61
Topic #: 1

A member of the data engineering team has submitted a short notebook that they wish to schedule as part of a larger data pipeline. Assume that the commands provided below produce the logically correct results when run as presented.



Which command should be removed from the notebook before scheduling it as a job?

  • A. Cmd 2
  • B. Cmd 3
  • C. Cmd 4
  • D. Cmd 5
  • E. Cmd 6
Suggested Answer: E

Comments

petrv
Highly Voted 12 months ago
Selected Answer: E
When scheduling a Databricks notebook as a job, it's generally recommended to remove or modify commands that involve displaying output, such as using the display() function. Displaying data using display() is an interactive feature designed for exploration and visualization within the notebook interface and may not work well in a production job context. The finalDF.explain() command, which provides the execution plan of the DataFrame transformations and actions, is often useful for debugging and optimizing queries. While it doesn't display interactive visualizations like display(), it can still be informative for understanding how Spark is executing the operations on your DataFrame.
upvoted 7 times
...
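One way to follow this advice without deleting the command outright is to gate interactive-only output behind an optional hook. The following is only a minimal sketch in plain Python: `run_pipeline` and `preview` are hypothetical names standing in for the notebook logic and for display(), and nothing here is Databricks or Spark API.

```python
def run_pipeline(rows, preview=None):
    """Toy pipeline: drop rows without an id, optionally preview the result.

    `preview` is a hypothetical callback playing the role of display();
    a scheduled job would simply leave it as None.
    """
    final_rows = [row for row in rows if row.get("id") is not None]
    if preview is not None:
        # Interactive-only step: skipped entirely when running as a job.
        preview(final_rows)
    return final_rows


sample = [{"id": 1}, {"id": None}, {"id": 3}]

# Interactive run: the preview callback receives the final rows.
seen = []
interactive_result = run_pipeline(sample, preview=seen.extend)

# Scheduled run: same result, no preview side effect.
job_result = run_pipeline(sample)
```

The pipeline result is the same in both runs; only the preview side effect differs, which is the property that makes the display step safe to drop from a job.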
Carkeys
Most Recent 4 weeks, 1 day ago
Selected Answer: D
Cmd 5 (finalDF.explain()) is used for debugging and understanding the logical and physical plans of a DataFrame. It provides insights into how Spark plans to execute the query but does not produce output that is necessary for the scheduled job. Including this command in a scheduled job is unnecessary and could clutter the job logs without adding value to the final output.
upvoted 1 times
...
benni_ale
1 month, 1 week ago
Selected Answer: E
If this were a multiple-answer question, I would also have picked the .explain method and print schema, since neither contributes to any ETL operation. As a rule of thumb, though, display should always be omitted first, so -> E
upvoted 1 times
...
71dfab9
3 months, 1 week ago
Selected Answer: E
I agree with petrv and KhoaLe, but I will add that not displaying finalDF would be wise, as it could display and log PII data; that, to me, is why I choose E. Like hal2401 said, commands 2, 5 & 6 can be removed as they don't manipulate the data.
upvoted 1 times
...
hal2401me
9 months ago
Selected Answer: E
Perhaps it's a multiple-choice question in the exam. In that case I'll select E and D; if single choice, then E.
upvoted 1 times
...
KhoaLe
9 months, 2 weeks ago
Selected Answer: E
Looking through all the steps, Cmd 2, 5, and 6 can be eliminated without impacting the whole process. However, in terms of run-time cost, Cmd 2 and 5 do not have much impact, as they only show the current logical query plan. In contrast, display() in Cmd 6 actually triggers an action, which will take much more time to run.
upvoted 2 times
...
alexvno
11 months, 1 week ago
Selected Answer: E
No display()
upvoted 3 times
...
60ties
1 year ago
Selected Answer: D
No actions in production scripts. D is best.
upvoted 1 times
ofed
1 year ago
In order to display a DataFrame, you also need to compute it, so display also acts as an action.
upvoted 1 times
...
...
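The distinction raised in this exchange (inspecting a plan versus forcing computation) can be illustrated with a toy model of lazy evaluation. This is only a sketch under stated assumptions: `LazyFrame` and `toy_display` are hypothetical stand-ins written in plain Python, not the Spark or Databricks API, and the run counter exists purely to make the difference observable.

```python
class LazyFrame:
    """Toy stand-in for a lazily evaluated DataFrame (not the Spark API)."""

    def __init__(self, source, plan=(), stats=None):
        self._source = list(source)
        self._plan = tuple(plan)
        # Shared counter of how many times the plan was actually executed.
        self.stats = stats if stats is not None else {"runs": 0}

    def filter(self, predicate, label):
        # Transformation: only extends the plan, touches no data.
        return LazyFrame(self._source, self._plan + ((label, predicate),), self.stats)

    def explain(self):
        # Describes the plan as text; nothing is executed.
        return " -> ".join(["scan"] + [f"filter({label})" for label, _ in self._plan])

    def collect(self):
        # Action: executes every step of the plan over the data.
        self.stats["runs"] += 1
        rows = self._source
        for _, predicate in self._plan:
            rows = [row for row in rows if predicate(row)]
        return rows


def toy_display(frame):
    # Rendering rows requires computing them first, so this triggers an action.
    for row in frame.collect():
        print(row)


df = (
    LazyFrame(range(10))
    .filter(lambda x: x % 2 == 0, "even")
    .filter(lambda x: x > 4, "gt4")
)

plan_text = df.explain()
runs_after_explain = df.stats["runs"]

toy_display(df)
runs_after_display = df.stats["runs"]
```

In this toy model, asking for the plan leaves the run counter untouched, while displaying the rows executes the whole plan once.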
Karen1232123
1 year ago
Why not D?
upvoted 2 times
...