For me the correct answer is A.
In fact, the loop builds a single-row InsertAllRequest and sends it on every iteration.
But we can collect all the rows into a list and use InsertAllRequest.newBuilder(tableId).setRows(rows).build() to send them in one request, as sketched below.
https://cloud.google.com/bigquery/streaming-data-into-bigquery#streaminginsertexamples
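A minimal sketch of that batched approach, assuming the rows collection from the question and illustrative dataset/table names:

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.InsertAllRequest;
import com.google.cloud.bigquery.InsertAllResponse;
import com.google.cloud.bigquery.TableId;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
TableId tableId = TableId.of("datasetId", "tableId");

// Convert every source row up front, then send them all in one request.
List<InsertAllRequest.RowToInsert> rowsToInsert = rows.stream()
        .map(InsertAllRequest.RowToInsert::of)
        .collect(Collectors.toList());

InsertAllResponse response = bigquery.insertAll(
        InsertAllRequest.newBuilder(tableId).setRows(rowsToInsert).build());
if (response.hasErrors()) {
    // Streaming inserts report per-row failures in the response, not as exceptions.
    response.getInsertErrors().forEach((index, errors) ->
            System.err.println("Row " + index + " failed: " + errors));
}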
The answer should be A, because the original code pushes one row at a time, which is more time-consuming than batch processing.
Proposed answer C is incorrect because we would still have more overhead sending each row in a separate request than using batch processing.
Correct answer A: Batching Rows: By batching multiple rows into a single request, you minimise the overhead of network communication and API call latency. BigQuery's InsertAllRequest supports inserting multiple rows in a single API call.
Parallel Inserts (Option B): While parallel processing can improve performance, it introduces complexity and may require handling concurrency issues. It's generally better to batch rows first.
Writing to Cloud Storage (Options C & D): Writing data to Cloud Storage and then loading it into BigQuery can be efficient for very large datasets, but it adds extra steps and complexity. It's more suitable for bulk load operations rather than small, frequent inserts.
For smaller datasets or when simplicity is paramount: Including multiple rows with each request is often sufficient.
For larger datasets or when performance is critical: Parallel inserts are the way to go.
A. Include multiple rows with each request:
This would be a very efficient way to batch the insert operations. BigQuery's insertAll method supports batched inserts, so instead of inserting each row in a separate request, you could group multiple rows into a single insertAll request. This approach reduces the number of HTTP requests made to the BigQuery service, which can improve throughput and reduce the risk of hitting rate limits.
B - I was between A and B. Both options require code changes, and Option B also requires changes to how you manage the Collection. If you insert multiple rows at a time, you still need to move through the rows in the collection one by one (remember, this is a loop) before inserting in bulk. If you first break the Collection into (n) subsets and then run the function in (n) threads, you would be moving through (n) subsets at a time, making (n) insertions at a time, all in parallel. That was my way of viewing it.
Option A would not actually change the insert performance itself (sort of); you would just be interacting with the database less. (If interacting less is faster, then you would see a small decrease in insert latencies.)
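A rough sketch of the Option B partitioning idea this commenter describes, assuming the same service client and rows collection from the question (the thread count and dataset/table names are illustrative):

import com.google.cloud.bigquery.InsertAllRequest;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final int n = 4; // illustrative thread count
List<Map<String, String>> rowList = new ArrayList<>(rows);
int chunk = (rowList.size() + n - 1) / n; // ceiling division into n subsets
ExecutorService pool = Executors.newFixedThreadPool(n);

for (int i = 0; i < rowList.size(); i += chunk) {
    List<Map<String, String>> subset =
            rowList.subList(i, Math.min(i + chunk, rowList.size()));
    pool.submit(() -> {
        // Each thread batches its own subset into a single insertAll call.
        InsertAllRequest.Builder builder =
                InsertAllRequest.newBuilder("datasetId", "tableId");
        subset.forEach(row -> builder.addRow(InsertAllRequest.RowToInsert.of(row)));
        service.insertAll(builder.build());
    });
}
pool.shutdown();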
A. Include multiple rows with each request.
Batch inserts are more efficient than individual inserts and will increase write performance by reducing the overhead of creating and sending individual requests for each row. Parallel inserts could potentially lead to conflicting writes or cause resource exhaustion, and adding a step of writing to Cloud Storage and then loading into BigQuery can add additional overhead and complexity.
A. Include multiple rows with each request.
It is generally more efficient to insert multiple rows in a single request rather than making a separate request for each row. This reduces the overhead of making multiple HTTP requests and can also improve performance by allowing BigQuery to perform more efficient batch operations. You can use the InsertAllRequest.RowToInsert.of(row) method to add multiple rows to a single request.
For example, you could modify the code to collect the rows in a list and insert them in batches:
import com.google.cloud.bigquery.InsertAllRequest;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final int BATCH_SIZE = 500; // tune for the balance between throughput and payload size

List<InsertAllRequest.RowToInsert> rowsToInsert = new ArrayList<>();
for (Map<String, String> row : rows) {
    rowsToInsert.add(InsertAllRequest.RowToInsert.of(row));
    if (rowsToInsert.size() == BATCH_SIZE) {
        // Send one request per full batch instead of one per row.
        InsertAllRequest insertRequest = InsertAllRequest.newBuilder(
                "datasetId", "tableId", rowsToInsert).build();
        service.insertAll(insertRequest);
        rowsToInsert.clear();
    }
}
// Flush any remaining rows that did not fill a complete batch.
if (!rowsToInsert.isEmpty()) {
    InsertAllRequest insertRequest = InsertAllRequest.newBuilder(
            "datasetId", "tableId", rowsToInsert).build();
    service.insertAll(insertRequest);
}
This will insert the rows in batches of BATCH_SIZE, which you can adjust based on the desired balance between performance and resource usage.
Options B and D, which involve using multiple threads to perform the inserts or write the rows to Cloud Storage, may not necessarily improve the efficiency of the code. These options could potentially increase the complexity of the code and introduce additional overhead, without necessarily improving the performance of the inserts.
Option C, writing each row to a Cloud Storage object before loading into BigQuery, would likely be less efficient than simply inserting the rows directly into BigQuery. It would involve additional steps and potentially increase the overall time it takes to write the rows to the table.
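For reference, the Cloud-Storage-then-load route from Options C and D would look roughly like this sketch; the bucket, object name, and JSON format are illustrative assumptions, and the rows would first have to be written to that file, which is exactly the extra step described above:

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.FormatOptions;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.LoadJobConfiguration;
import com.google.cloud.bigquery.TableId;

BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
TableId tableId = TableId.of("datasetId", "tableId");

// Step 1 (not shown): write the rows as newline-delimited JSON to Cloud Storage.
// Step 2: run a load job that reads that file into the table.
LoadJobConfiguration loadConfig = LoadJobConfiguration
        .newBuilder(tableId, "gs://my-bucket/rows.json") // hypothetical object
        .setFormatOptions(FormatOptions.json())
        .build();
Job job = bigquery.create(JobInfo.of(loadConfig));
job.waitFor(); // blocks until the load job completes (throws InterruptedException)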
Parallel writes to the database can actually increase the total insert time, and their benefit depends on many system conditions, while batch inserts are optimized at the database engine level.