Welcome to ExamTopics


Exam Professional Cloud Developer topic 1 question 13 discussion

Actual exam question from Google's Professional Cloud Developer
Question #: 13
Topic #: 1

Your teammate has asked you to review the code below. Its purpose is to efficiently add a large number of small rows to a BigQuery table.

Which improvement should you suggest your teammate make?

  • A. Include multiple rows with each request.
  • B. Perform the inserts in parallel by creating multiple threads.
  • C. Write each row to a Cloud Storage object, then load into BigQuery.
  • D. Write each row to a Cloud Storage object in parallel, then load into BigQuery.
Suggested Answer: A 🗳️

Comments

fraloca
Highly Voted 3 years, 11 months ago
For me the correct answer is A. In fact, the loop builds a single InsertRequest and sends it, but we can collect all the rows in a list and use InsertAllRequest.newBuilder(tableId).setRows(rows).build() to send them in one request. https://cloud.google.com/bigquery/streaming-data-into-bigquery#streaminginsertexamples
upvoted 24 times
...
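The batching fraloca describes can be sketched as a small, pure-Java helper. This is only an illustration: the BigQuery client calls are deliberately omitted so the sketch stays runnable, and the class and method names are made up for this example. Each batch returned here would become one InsertAllRequest (BigQuery's streaming quota caps rows per request, and Google's docs recommend modest batch sizes, on the order of 500 rows).

```java
import java.util.ArrayList;
import java.util.List;

public class BatchHelper {
    // Split rows into batches of at most batchSize rows each, so each
    // batch can be sent as a single InsertAllRequest instead of issuing
    // one HTTP request per row.
    static <T> List<List<T>> toBatches(List<T> rows, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += batchSize) {
            // Copy the sublist so each batch is independent of the source list.
            batches.add(new ArrayList<>(rows.subList(i, Math.min(i + batchSize, rows.size()))));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) rows.add(i);
        // 10 rows with batchSize 4 -> batches of 4, 4, and 2 rows.
        System.out.println(toBatches(rows, 4).size()); // prints 3
    }
}
```

The point of the sketch is the request count: 10 rows go from 10 requests down to 3, and for "a large number of small rows" that reduction dominates the cost.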
TrueCurry
Highly Voted 2 years, 11 months ago
Selected Answer: B
The answer should be A, because the original code pushes one row at a time, which is more time-consuming than batch processing. Answer C is incorrect because we would still have more overhead sending each row in a separate request than we would with batch processing.
upvoted 6 times
...
d_ella2001
Most Recent 4 months, 2 weeks ago
Selected Answer: A
Correct answer A.
Batching rows: by batching multiple rows into a single request, you minimise the overhead of network communication and API call latency. BigQuery's InsertAllRequest supports inserting multiple rows in a single API call.
Parallel inserts (option B): while parallel processing can improve performance, it introduces complexity and may require handling concurrency issues. It's generally better to batch rows first.
Writing to Cloud Storage (options C & D): writing data to Cloud Storage and then loading it into BigQuery can be efficient for very large datasets, but it adds extra steps and complexity. It's more suitable for bulk load operations than for small, frequent inserts.
upvoted 1 times
...
thewalker
4 months, 2 weeks ago
For smaller datasets or when simplicity is paramount: Including multiple rows with each request is often sufficient. For larger datasets or when performance is critical: Parallel inserts are the way to go.
upvoted 1 times
...
santoshchauhan
8 months, 3 weeks ago
Selected Answer: A
A. Include multiple rows with each request: This would be a very efficient way to batch the insert operations. BigQuery's insertAll method supports batched inserts, so instead of inserting each row in a separate request, you could group multiple rows into a single insertAll request. This approach reduces the number of HTTP requests made to the BigQuery service, which can improve throughput and reduce the risk of hitting rate limits.
upvoted 1 times
...
gingrick
1 year ago
Selected Answer: B
B - I was between A and B. Both options require code changes, and option B also requires changes in how you manage the collection. If you insert multiple rows at a time, you still need to move through the rows in the collection one by one (remember, this is a loop) before inserting in bulk. If you first break the collection into (n) subsets and then run the function in (n) threads, you would be moving through (n) subsets at a time, making (n) insertions at a time, all in parallel. That was my way of viewing it. Option A would barely change performance; you would just be interacting with the database less (and if interacting less is faster, you would see a small decrease in insert latencies).
upvoted 1 times
...
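The subset-per-thread structure gingrick describes can be sketched like this. To keep the sketch runnable, the per-subset InsertAllRequest is replaced by a simple counter; the class name, method name, and thread count are all illustrative, not from the original question's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelInsertSketch {
    // Split rows into roughly nThreads subsets and process each subset on
    // its own thread. In real code each subset would be sent as one
    // InsertAllRequest; here we only count rows so the sketch can run.
    static int insertInParallel(List<Integer> rows, int nThreads) throws InterruptedException {
        int chunk = (rows.size() + nThreads - 1) / nThreads; // ceiling division
        AtomicInteger inserted = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        for (int start = 0; start < rows.size(); start += chunk) {
            List<Integer> subset = rows.subList(start, Math.min(start + chunk, rows.size()));
            pool.submit(() -> inserted.addAndGet(subset.size()));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return inserted.get();
    }

    public static void main(String[] args) throws InterruptedException {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 100; i++) rows.add(i);
        System.out.println(insertInParallel(rows, 4)); // all 100 rows are counted
    }
}
```

Note that this structure still benefits from option A: each thread should send its whole subset in one batched request, otherwise the per-row request overhead remains, just spread across threads.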
__rajan__
1 year, 2 months ago
Selected Answer: A
I would go with A.
upvoted 1 times
...
Pime13
1 year, 9 months ago
Selected Answer: A
I'd choose A. For me it's the same as the recommended batch insert/update.
upvoted 1 times
...
mrvergara
1 year, 9 months ago
Selected Answer: A
A. Include multiple rows with each request. Batch inserts are more efficient than individual inserts and will increase write performance by reducing the overhead of creating and sending individual requests for each row. Parallel inserts could potentially lead to conflicting writes or cause resource exhaustion, and adding a step of writing to Cloud Storage and then loading into BigQuery can add additional overhead and complexity.
upvoted 1 times
...
Foxal
1 year, 9 months ago
Selected Answer: A
A is the correct answer.
upvoted 1 times
...
telp
1 year, 10 months ago
Selected Answer: A
Answer A; BigQuery supports multiple inserts in one request: https://cloud.google.com/bigquery/docs/samples/bigquery-table-insert-rows
upvoted 1 times
...
omermahgoub
1 year, 11 months ago
A. Include multiple rows with each request. It is generally more efficient to insert multiple rows in a single request, rather than making a separate request for each row. This reduces the overhead of making multiple HTTP requests, and can also improve performance by allowing BigQuery to perform more efficient batch operations. You can use the InsertAllRequest.RowToInsert.of(row) method to build each row and add them all to a single request.
upvoted 1 times
omermahgoub
1 year, 11 months ago
For example, you could modify the code to collect the rows in a list and insert them in batches:

    List<InsertAllRequest.RowToInsert> rowsToInsert = new ArrayList<>();
    for (Map<String, String> row : rows) {
        rowsToInsert.add(InsertAllRequest.RowToInsert.of(row));
        if (rowsToInsert.size() == BATCH_SIZE) {
            InsertAllRequest insertRequest = InsertAllRequest.newBuilder(
                "datasetId", "tableId", rowsToInsert).build();
            service.insertAll(insertRequest);
            rowsToInsert.clear();
        }
    }
    if (!rowsToInsert.isEmpty()) {
        InsertAllRequest insertRequest = InsertAllRequest.newBuilder(
            "datasetId", "tableId", rowsToInsert).build();
        service.insertAll(insertRequest);
    }

This will insert the rows in batches of BATCH_SIZE, which you can adjust based on the desired balance between performance and resource usage.
upvoted 1 times
omermahgoub
1 year, 11 months ago
Options B and D, which involve using multiple threads to perform the inserts or write the rows to Cloud Storage, may not necessarily improve the efficiency of the code. These options could potentially increase the complexity of the code and introduce additional overhead, without necessarily improving the performance of the inserts. Option C, writing each row to a Cloud Storage object before loading into BigQuery, would likely be less efficient than simply inserting the rows directly into BigQuery. It would involve additional steps and potentially increase the overall time it takes to write the rows to the table.
upvoted 1 times
...
...
...
test010101
1 year, 11 months ago
Selected Answer: A
vote A
upvoted 1 times
...
jcataluna
1 year, 12 months ago
Selected Answer: A
The original code inserts one row at a time, so there is no point in using parallel requests.
upvoted 1 times
...
thaipad
2 years, 2 months ago
Selected Answer: A
Parallel saving to the database can increase the total insertion time and depends on many system conditions, while batch saving is optimized at the database core level.
upvoted 1 times
...
tomato123
2 years, 3 months ago
Selected Answer: B
B is correct
upvoted 2 times
...
kinoko1330
2 years, 3 months ago
Selected Answer: A
https://cloud.google.com/bigquery/docs/samples/bigquery-table-insert-rows
upvoted 1 times
...
Community vote distribution: A (35%), C (25%), B (20%), Other