Exam Certified Data Engineer Professional topic 1 question 19 discussion

Actual exam question from Databricks's Certified Data Engineer Professional
Question #: 19
Topic #: 1

A junior data engineer has been asked to develop a streaming data pipeline with a grouped aggregation using DataFrame df. The pipeline needs to calculate the average humidity and average temperature for each non-overlapping five-minute interval. Events are recorded once per minute per device.
Streaming DataFrame df has the following schema:
"device_id INT, event_time TIMESTAMP, temp FLOAT, humidity FLOAT"
Code block:

Choose the response that correctly fills in the blank within the code block to complete this task.

  • A. to_interval("event_time", "5 minutes").alias("time")
  • B. window("event_time", "5 minutes").alias("time")
  • C. "event_time"
  • D. window("event_time", "10 minutes").alias("time")
  • E. lag("event_time", "10 minutes").alias("time")
Suggested Answer: B

Comments

imatheushenrique
5 months, 4 weeks ago
B. window("event_time", "5 minutes").alias("time"). In Structured Streaming, a window on event time is expressed as a special grouping using the window() function. For example, counts over 5-minute tumbling (non-overlapping) windows are computed by grouping on window() applied to the eventTime column of the event.
upvoted 1 time
...
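The "special grouping" described in the comment above can be pictured without Spark at all. The plain-Python sketch below is an illustration of tumbling-window bucketing, not Spark code: each event time is rounded down to the start of its fixed-size window, and values are averaged per window. The helper names tumbling_window_start and average_temp_by_window and the sample events are hypothetical, and timestamps are assumed to be naive datetimes.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def tumbling_window_start(event_time: datetime, minutes: int = 5) -> datetime:
    """Return the start of the fixed-size window that contains event_time."""
    size = timedelta(minutes=minutes)
    # Align windows to the Unix epoch, so boundaries fall on :00, :05, :10, ...
    offset = (event_time - datetime(1970, 1, 1)) % size
    return event_time - offset


def average_temp_by_window(events, minutes: int = 5):
    """Group (event_time, temp) pairs by window start and average temp."""
    buckets = defaultdict(list)
    for event_time, temp in events:
        buckets[tumbling_window_start(event_time, minutes)].append(temp)
    return {start: sum(temps) / len(temps) for start, temps in buckets.items()}


# Hypothetical sample readings: (event_time, temp)
events = [
    (datetime(2024, 1, 1, 12, 0), 20.0),
    (datetime(2024, 1, 1, 12, 4), 22.0),
    (datetime(2024, 1, 1, 12, 5), 30.0),
]
result = average_temp_by_window(events)
```

With five-minute windows the 12:00 and 12:04 readings share one bucket and the 12:05 reading starts the next, since each window includes its start and excludes its end. Passing minutes=10 instead puts all three readings in a single bucket, which is why option D would not produce five-minute intervals.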
Jay_98_11
10 months, 2 weeks ago
Selected Answer: B
correct B
upvoted 1 time
...
kz_data
10 months, 2 weeks ago
Selected Answer: B
B is correct
upvoted 1 time
...
BIKRAM063
1 year ago
Selected Answer: B
Window of 5 mins
upvoted 2 times
...
sturcu
1 year, 1 month ago
Selected Answer: B
B is correct: https://www.databricks.com/blog/2017/05/08/event-time-aggregation-watermarking-apache-sparks-structured-streaming.html
upvoted 2 times
...
Eertyy
1 year, 2 months ago
answer is B
upvoted 2 times
...
thxsgod
1 year, 2 months ago
Selected Answer: B
Correct, B.
upvoted 4 times
...