Exam Professional Data Engineer topic 1 question 213 discussion

Actual exam question from Google's Professional Data Engineer
Question #: 213
Topic #: 1

Your company's customer_order table in BigQuery stores the order history for 10 million customers, with a table size of 10 PB. You need to create a dashboard for the support team to view the order history. The dashboard has two filters, country_name and username. Both are string data types in the BigQuery table. When a filter is applied, the dashboard fetches the order history from the table and displays the query results. However, the dashboard is slow to show the results when applying the filters to the following query:



How should you redesign the BigQuery table to support faster access?

  • A. Cluster the table by country and username fields.
  • B. Cluster the table by country field, and partition by username field.
  • C. Partition the table by country and username fields.
  • D. Partition the table by _PARTITIONTIME.
Suggested Answer: A 🗳️
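
For illustration, here is a minimal sketch of how the table in option A could be rebuilt with clustering on both filter columns. It only assembles a BigQuery DDL string; the project and dataset names (`my_project.sales`) and the helper `build_clustered_table_ddl` are hypothetical, and the column names are taken from the question text.

```python
from typing import Sequence


def build_clustered_table_ddl(
    source_table: str, target_table: str, cluster_columns: Sequence[str]
) -> str:
    """Return a DDL statement that copies source_table into a clustered target_table."""
    # BigQuery accepts at most four clustering columns per table.
    if not 1 <= len(cluster_columns) <= 4:
        raise ValueError("expected between 1 and 4 clustering columns")
    columns = ", ".join(cluster_columns)
    return (
        f"CREATE TABLE `{target_table}`\n"
        f"CLUSTER BY {columns}\n"
        f"AS SELECT * FROM `{source_table}`"
    )


# Hypothetical table names; only the column names come from the question.
ddl = build_clustered_table_ddl(
    "my_project.sales.customer_order",
    "my_project.sales.customer_order_clustered",
    ["country_name", "username"],
)
print(ddl)
```

The order of the clustering columns matters: rows are sorted by the first column and then by the second, so a filter on `country_name` alone can still benefit, while a filter on `username` alone benefits much less.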

Comments

Ryannn23
1 week ago
Selected Answer: A
Partitioning on a string column is not available in BigQuery, which excludes B and C. Partitioning by ingestion time is not useful because the query filters on two other columns, which excludes D. The correct answer is A: cluster on both fields.
upvoted 1 times
niujo
5 months ago
Why not D? Wouldn't partitioning by date or integer be the best option?
upvoted 1 times
b3e59c2
1 month ago
Because the query filters on country and user, which have nothing to do with time. Partitioning by time will provide no performance increase in this case.
upvoted 1 times
JyoGCP
5 months, 3 weeks ago
Selected Answer: A
If country were represented by an integer code, partitioning by country and clustering by username would be a better solution. Since the country code is a string, the best available solution is "A. Cluster the table by country and username fields."
upvoted 3 times
datapassionate
6 months, 3 weeks ago
Selected Answer: A
Correct answer: A. Cluster the table by country and username fields. Why not B and C? Column-based partitioning requires an integer (or date/timestamp) column, not a string: https://cloud.google.com/bigquery/docs/partitioned-tables#integer_range
upvoted 4 times
Matt_108
6 months, 3 weeks ago
Selected Answer: A
A: the fields are both strings, which are not supported for partitioning. Moreover, the fields are regularly used in filters, which is where clustering really improves performance
upvoted 3 times
SanjeevRoy91
4 months, 2 weeks ago
Isn't it mandatory to have partitioning for clustering?
upvoted 1 times
Takshashila
7 months ago
Selected Answer: B
Can't clustering also be done after partitioning?
upvoted 1 times
chambg
5 months ago
Yes, but in B the partitioning is done on the username field, which has 10 million values. Since a BQ table can only have 4,000 partitions, it is not suitable.
upvoted 2 times
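
The arithmetic behind this reply can be sketched as follows. The limit of 4,000 partitions is the figure quoted in the comment and is treated here as an assumption (current quotas may differ); the customer count comes from the question.

```python
# Assumption: per-table partition limit as quoted in the comment above.
MAX_PARTITIONS = 4_000
# From the question: order history for 10 million customers.
DISTINCT_USERNAMES = 10_000_000


def fits_partition_limit(distinct_values: int, max_partitions: int = MAX_PARTITIONS) -> bool:
    """Return True if one partition per distinct value stays within the limit."""
    return distinct_values <= max_partitions


fits = fits_partition_limit(DISTINCT_USERNAMES)
overshoot = DISTINCT_USERNAMES // MAX_PARTITIONS
print(fits)       # False
print(overshoot)  # 2500, i.e. 2,500 times more values than allowed partitions
```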
raaad
7 months, 1 week ago
Selected Answer: A
- Clustering organizes the data based on the specified columns (in this case, country_name and username).
- When a query filters on these columns, BigQuery can efficiently scan only the relevant parts of the table.
upvoted 4 times
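
The idea in this comment can be sketched with a toy model. This is not how BigQuery stores data internally; it is a simplified illustration that assumes rows are sorted by the clustering columns, split into fixed-size blocks, and that each block records the smallest and largest key it holds. The functions `build_blocks` and `blocks_to_scan` and the sample data are hypothetical.

```python
import random


def build_blocks(rows, block_size):
    """Sort rows by (country_name, username) and split them into blocks,
    each tagged with the min and max key it contains."""
    ordered = sorted(rows)
    blocks = []
    for start in range(0, len(ordered), block_size):
        chunk = ordered[start:start + block_size]
        blocks.append({"min": chunk[0], "max": chunk[-1], "rows": chunk})
    return blocks


def blocks_to_scan(blocks, country_name, username):
    """Keep only the blocks whose key range could contain the filter values."""
    key = (country_name, username)
    return [b for b in blocks if b["min"] <= key <= b["max"]]


random.seed(0)
countries = ["DE", "FR", "JP", "US"]
# 10,000 synthetic rows, one per (hypothetical) username.
rows = [(random.choice(countries), f"user{n:05d}") for n in range(10_000)]

blocks = build_blocks(rows, block_size=100)
target = rows[1234]
hits = blocks_to_scan(blocks, *target)
print(len(blocks), len(hits))
```

Because the keys are unique and the blocks are sorted, the filter on both columns matches a single block out of 100 in this model; without clustering, every block would have to be read.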
e70ea9e
7 months, 1 week ago
Selected Answer: A
country and username --> cluster
upvoted 3 times
Community vote distribution
A (35%)
C (25%)
B (20%)
Other