All answers are wrong:
A - checkpoint directory to track changes to Delta table?
B - microbatch uses the state of the table at the time the query is executed, not at initialization
C - unique keys? - stream-static joins are not stateful, so we are only looking at the current batch of records
D - you can totally have stream-static joins, see: https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#support-matrix-for-joins-in-streaming-queries
I believe they made a typo in B; that seems to be the only logical explanation.
If you look at question 18, you find that the correct solution should be "Each microbatch of a stream-static join will use the most recent version of the static Delta table as of each microbatch." This is not listed here, meaning that B cannot be correct, which leaves A as the only possible solution. The wrong part about B is that the latest version of the static Delta table is returned at each micro-batch rather than as of job initialisation.
When Databricks processes a micro-batch of data in a stream-static join, the latest valid version of data from the static Delta table joins with the records present in the current micro-batch. Because the join is stateless, you do not need to configure watermarking and can process results with low latency. The data in the static Delta table used in the join should be slowly-changing.
https://docs.databricks.com/en/transform/join.html#stream-static
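To make the quoted behaviour concrete, here is a minimal plain-Python sketch of it. This is a toy model, not Spark or Delta code: `VersionedTable` and `join_microbatch` are hypothetical names invented for illustration, and the inner-join semantics are an assumption. It only shows the two points discussed above: the static side is resolved at each micro-batch rather than once at job initialisation, and nothing is carried over between micro-batches.

```python
from typing import Dict, List, Tuple


class VersionedTable:
    """Toy stand-in for a Delta table: every write creates a new snapshot."""

    def __init__(self) -> None:
        self._versions: List[Dict[int, str]] = [{}]

    def upsert(self, key: int, value: str) -> None:
        snapshot = dict(self._versions[-1])  # copy, so old versions stay intact
        snapshot[key] = value
        self._versions.append(snapshot)

    def latest(self) -> Dict[int, str]:
        return self._versions[-1]


def join_microbatch(
    batch: List[Tuple[int, str]], static: VersionedTable
) -> List[Tuple[int, str, str]]:
    """Inner-join one micro-batch against the latest static snapshot.

    No state is kept between calls: unmatched stream rows are dropped,
    not buffered for a later micro-batch.
    """
    snapshot = static.latest()  # resolved per micro-batch, not at stream start
    return [(key, event, snapshot[key]) for key, event in batch if key in snapshot]


static = VersionedTable()
static.upsert(1, "alice")

# Key 2 is not in the static table yet, so (2, "view") is dropped.
batch_1 = join_microbatch([(1, "click"), (2, "view")], static)

static.upsert(2, "bob")  # the static table changes between micro-batches

# The second micro-batch sees the new row, but the earlier (2, "view") is gone.
batch_2 = join_microbatch([(2, "purchase")], static)

print(batch_1)  # [(1, 'click', 'alice')]
print(batch_2)  # [(2, 'purchase', 'bob')]
```

In this model the second micro-batch picks up the row for key 2 even though it was written after the "stream" started, while the unmatched `(2, "view")` from the first micro-batch is never joined retroactively.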
Community vote distribution: A (35%), C (25%), B (20%), Other
arekm, 1 month ago
benni_ale, 3 months ago
MDWPartners, 8 months, 1 week ago