Welcome to ExamTopics
ExamTopics Logo
- Expert Verified, Online, Free.
exam questions

Exam Professional Data Engineer All Questions

View all questions & answers for the Professional Data Engineer exam

Exam Professional Data Engineer topic 1 question 277 discussion

Actual exam question from Google's Professional Data Engineer
Question #: 277
Topic #: 1
[All Professional Data Engineer Questions]

You are designing a real-time system for a ride hailing app that identifies areas with high demand for rides to effectively reroute available drivers to meet the demand. The system ingests data from multiple sources to Pub/Sub, processes the data, and stores the results for visualization and analysis in real-time dashboards. The data sources include driver location updates every 5 seconds and app-based booking events from riders. The data processing involves real-time aggregation of supply and demand data for the last 30 seconds, every 2 seconds, and storing the results in a low-latency system for visualization. What should you do?

  • A. Group the data by using a tumbling window in a Dataflow pipeline, and write the aggregated data to Memorystore.
  • B. Group the data by using a hopping window in a Dataflow pipeline, and write the aggregated data to Memorystore.
  • C. Group the data by using a session window in a Dataflow pipeline, and write the aggregated data to BigQuery.
  • D. Group the data by using a hopping window in a Dataflow pipeline, and write the aggregated data to BigQuery.
Show Suggested Answer Hide Answer
Suggested Answer: B 🗳️

Comments

Chosen Answer:
This is a voting comment (?). It is better to Upvote an existing comment if you don't have anything to add.
Switch to a voting comment New
raaad
Highly Voted 10 months, 3 weeks ago
Selected Answer: B
- Hopping Window: Hopping windows are fixed-sized, overlapping intervals. - Aggregate data over the last 30 seconds, every 2 seconds, as hopping windows allow for overlapping data analysis. - Memorystore: Ideal for low-latency access required for real-time visualization and analysis.
upvoted 8 times
anushree09
7 months, 2 weeks ago
Hopping windows are sliding windows. It makes sense to use that over tumbling (fixed) window because the ask is to collect last 30 seconds of data every 5 second
upvoted 2 times
...
...
Jeyaraj
Most Recent 4 months ago
OPTION A. (IGNORE MY Previous Comment) Tumbling windows are the best choice for this ride-hailing app because they provide accurate 2-second aggregations without the complexities of overlapping data. This is crucial for real-time decision-making and ensuring accurate visualization of supply and demand. Hopping windows introduce potential inaccuracies and complexity, making them less suitable for this scenario. While they can be useful in other situations, they are not the optimal choice for real-time aggregation with strict accuracy requirements.
upvoted 1 times
...
Jeyaraj
4 months ago
Option B. Tumbling windows are the best choice for this ride-hailing app because they provide accurate 2-second aggregations without the complexities of overlapping data. This is crucial for real-time decision-making and ensuring accurate visualization of supply and demand. Hopping windows introduce potential inaccuracies and complexity, making them less suitable for this scenario. While they can be useful in other situations, they are not the optimal choice for real-time aggregation with strict accuracy requirements.
upvoted 1 times
...
JyoGCP
9 months, 1 week ago
Selected Answer: B
Option B
upvoted 1 times
...
ashdam
9 months, 1 week ago
hopping window is clear but memorystore vs bigquery?? Why memorystore and not bigquery?
upvoted 1 times
ML6
9 months, 1 week ago
Memory store is an in-memory key-value database for use cases such as real-time application.
upvoted 1 times
ea2023
7 months, 2 weeks ago
Let me complete your answer MS vs BQ in this case is a matter of low latence where MS is the winner but if precision were stated about a large amount of data BQ then would"ve been the best choice.
upvoted 1 times
...
...
...
Jordan18
10 months ago
why not D?
upvoted 1 times
RenePetersen
9 months ago
Because BigQuery is not a low latency system...
upvoted 1 times
...
...
scaenruy
10 months, 3 weeks ago
Selected Answer: B
B. Group the data by using a hopping window in a Dataflow pipeline, and write the aggregated data to Memorystore.
upvoted 2 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted
A voting comment increases the vote count for the chosen answer by one.

Upvoting a comment with a selected answer will also increase the vote count towards that answer by one. So if you see a comment that you already agree with, you can upvote it instead of posting a new comment.

SaveCancel
Loading ...