Exam Professional Data Engineer topic 1 question 282 discussion

Actual exam question from Google's Professional Data Engineer
Question #: 282
Topic #: 1

You are using a Dataflow streaming job to read messages from a message bus that does not support exactly-once delivery. Your job then applies some transformations, and loads the result into BigQuery. You want to ensure that your data is being streamed into BigQuery with exactly-once delivery semantics. You expect your ingestion throughput into BigQuery to be about 1.5 GB per second. What should you do?

  • A. Use the BigQuery Storage Write API and ensure that your target BigQuery table is regional.
  • B. Use the BigQuery Storage Write API and ensure that your target BigQuery table is multiregional.
  • C. Use the BigQuery Streaming API and ensure that your target BigQuery table is regional.
  • D. Use the BigQuery Streaming API and ensure that your target BigQuery table is multiregional.
Suggested Answer: B
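For context, here is a minimal Apache Beam (Python) sketch of the pipeline described in the question. It assumes Pub/Sub stands in for the unnamed message bus, and the project, subscription, table, and schema names are hypothetical; the point is that writing with the Storage Write API method is what enables exactly-once delivery into BigQuery.

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_message(payload: bytes) -> dict:
    """Decode a raw bus message into a BigQuery row dict (schema is hypothetical)."""
    record = json.loads(payload.decode("utf-8"))
    return {"user_id": record["user_id"], "event": record["event"], "ts": record["ts"]}


options = PipelineOptions(streaming=True)  # runner/project/region flags omitted for brevity

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        # Pub/Sub stands in for the message bus; the subscription name is hypothetical.
        | "ReadFromBus" >> beam.io.ReadFromPubSub(
            subscription="projects/my-project/subscriptions/events-sub")
        | "Parse" >> beam.Map(parse_message)
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            table="my-project:my_dataset.events",  # hypothetical target table
            schema="user_id:INTEGER,event:STRING,ts:TIMESTAMP",
            # The Storage Write API path defaults to exactly-once semantics,
            # unlike the legacy streaming inserts (STREAMING_INSERTS).
            method=beam.io.WriteToBigQuery.Method.STORAGE_WRITE_API,
            create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```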

Comments

AlizCert
Highly Voted 7 months ago
Selected Answer: B
It should be B. The Storage Write API has "3 GB per second throughput in multi-regions; 300 MB per second in regions" (a quick check against the 1.5 GB/s requirement is sketched below).
upvoted 15 times
...
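A quick back-of-envelope check, using only the quota figures quoted above, shows why the 1.5 GB/s requirement rules out a regional table:

```python
# Back-of-envelope check of the quoted Storage Write API throughput quotas
# against the question's 1.5 GB/s requirement (figures from the quotas page).
required_mb_per_s = 1.5 * 1000        # 1.5 GB/s expressed in MB/s
regional_quota_mb_per_s = 300         # default per-region quota
multi_regional_quota_mb_per_s = 3000  # default multi-region quota

print(required_mb_per_s <= regional_quota_mb_per_s)        # False: a regional table is not enough
print(required_mb_per_s <= multi_regional_quota_mb_per_s)  # True: a multi-regional table fits
```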
raaad
Highly Voted 11 months, 4 weeks ago
Selected Answer: A
- BigQuery Storage Write API: this API is designed for high-throughput, low-latency writing of data into BigQuery. It also provides tools to prevent data duplication, which is essential for exactly-once delivery semantics.
- Regional table: choosing a regional location for the BigQuery table could potentially provide better performance and lower latency, as it would be closer to the Dataflow job if they are in the same region.
upvoted 11 times
AllenChen123
11 months, 1 week ago
Agree. https://cloud.google.com/bigquery/docs/write-api#advantages
upvoted 4 times
...
...
hussain.sain
Most Recent 6 days, 10 hours ago
Selected Answer: B
B is correct. When aiming for exactly-once delivery in a Dataflow streaming job, the key is to use the BigQuery Storage Write API, which supports exactly-once semantics for large-scale ingestion. The 1.5 GB/s requirement also exceeds the default 300 MB/s regional Write API throughput quota but fits within the 3 GB/s multi-regional quota, so the target table should be multi-regional.
upvoted 1 times
...
himadri1983
2 weeks, 5 days ago
Selected Answer: B
3 GB per second throughput in multi-regions; 300 MB per second in regions https://cloud.google.com/bigquery/quotas#write-api-limits
upvoted 1 times
...
m_a_p_s
3 weeks ago
Selected Answer: B
Streamed into BigQuery with exactly-once delivery semantics >>> Storage Write API. Ingestion throughput into BigQuery of about 1.5 GB per second >>> multiregional (check the throughput limits here: https://cloud.google.com/bigquery/quotas#write-api-limits).
upvoted 1 times
...
NatyNogas
1 month ago
Selected Answer: A
- Choosing a regional target BigQuery table ensures that data is stored redundantly in a single region, providing high availability and durability.
upvoted 1 times
...
CloudAdrMX
1 month ago
Selected Answer: B
According to this documentation, it's B: https://cloud.google.com/bigquery/quotas#write-api-limits
upvoted 2 times
...
imazy
1 month, 3 weeks ago
Selected Answer: A
The Write API supports 2.5 GB/sec throughput and supports exactly-once delivery semantics (https://cloud.google.com/bigquery/docs/write-api#connections), whereas with streaming inserts duplicates can appear and need to be removed manually (https://cloud.google.com/bigquery/docs/streaming-data-into-bigquery#dataavailability); see the sketch below.
upvoted 1 times
...
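For contrast with the comment above, here is a minimal sketch of the legacy streaming-insert path using the google-cloud-bigquery client (table and row values are hypothetical). The insertId-based deduplication it offers is best effort only, which is why it cannot guarantee exactly-once delivery:

```python
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.my_dataset.events"  # hypothetical table

rows = [{"user_id": 1, "event": "click", "ts": "2024-01-01T00:00:00Z"}]

# row_ids become insertId values. BigQuery uses them only for best-effort,
# time-limited deduplication, so duplicates can still land and would have to
# be cleaned up downstream, e.g. with a periodic dedup query.
errors = client.insert_rows_json(table_id, rows, row_ids=["evt-0001"])
if errors:
    print("Insert errors:", errors)
```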
SamuelTsch
2 months ago
Selected Answer: B
Looking at this documentation (https://cloud.google.com/bigquery/quotas#write-api-limits): 3 GB/s in multi-regions; 300 MB/s in regions.
upvoted 4 times
...
HermanTan
3 months ago
To ensure that analysts do not see customer data older than 30 days while minimizing cost and overhead, the best option is B: use a timestamp range filter in the query to fetch the customer's data for a specific range. This approach directly addresses the issue by filtering out data older than 30 days at query time, ensuring that only the relevant data is retrieved. It avoids the overhead and potential delays associated with garbage collection and manual deletion processes.
upvoted 2 times
...
hanoverquay
9 months, 3 weeks ago
Selected Answer: D
option D
upvoted 1 times
BennyXu
9 months ago
You are wrong; the Streaming API does not provide exactly-once semantics.
upvoted 1 times
...
...
Matt_108
11 months, 3 weeks ago
Selected Answer: A
Option A
upvoted 1 times
...
Ed_Kim
1 year ago
Selected Answer: A
Voting on A
upvoted 2 times
Smakyel79
12 months ago
This option leverages the BigQuery Storage Write API's capability for exactly-once delivery semantics and a regional table setting that can meet compliance and data locality needs without impacting the delivery semantics. The BigQuery Storage Write API is more suitable for your high-throughput requirements compared to the BigQuery Streaming API.
upvoted 4 times
...
...
Community vote distribution: A (35%), C (25%), B (20%), Other.