Exam Professional Data Engineer topic 1 question 183 discussion

Actual exam question from Google's Professional Data Engineer
Question #: 183
Topic #: 1

You are using Bigtable to persist and serve stock market data for each of the major indices. To serve the trading application, you need to access only the most recent stock prices that are streaming in. How should you design your row key and tables to ensure that you can access the data with the simplest query?

  • A. Create one unique table for all of the indices, and then use the index and timestamp as the row key design.
  • B. Create one unique table for all of the indices, and then use a reverse timestamp as the row key design.
  • C. For each index, have a separate table and use a timestamp as the row key design.
  • D. For each index, have a separate table and use a reverse timestamp as the row key design.
Suggested Answer: B

Comments

John_Pongthorn
Highly Voted 2 years, 3 months ago
This is a special case. Please look carefully at the link below and read the last paragraph at the bottom of this comment, and let everyone share their ideas. We will go with B, C. https://cloud.google.com/bigtable/docs/schema-design#time-based "Don't use a timestamp by itself or at the beginning of a row key, because this will cause sequential writes to be pushed onto a single node, creating a hotspot. If you usually retrieve the most recent records first, you can use a reversed timestamp in the row key by subtracting the timestamp from your programming language's maximum value for long integers (in Java, java.lang.Long.MAX_VALUE). With a reversed timestamp, the records will be ordered from most recent to least recent."
upvoted 17 times
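The reversed-timestamp arithmetic quoted above can be sketched in a few lines. This is only an illustration, not Bigtable client code: `LONG_MAX` stands in for Java's `Long.MAX_VALUE`, and the helper name `reversed_timestamp_key` and the 19-digit zero padding are assumptions made here so that the keys compare correctly as strings.

```python
# Stand-in for java.lang.Long.MAX_VALUE (2**63 - 1); Python ints are unbounded.
LONG_MAX = 2**63 - 1


def reversed_timestamp_key(timestamp_ms: int) -> str:
    """Return a fixed-width key that sorts newest-first (hypothetical helper)."""
    reversed_ts = LONG_MAX - timestamp_ms
    # Zero-pad to 19 digits (the width of LONG_MAX) so that
    # lexicographic order matches numeric order.
    return f"{reversed_ts:019d}"


# Three example event times in milliseconds since the epoch, oldest to newest.
timestamps = [1_700_000_000_000, 1_700_000_060_000, 1_700_000_120_000]

# Sorting the keys as strings mimics a lexicographically ordered key space.
keys = sorted(reversed_timestamp_key(ts) for ts in timestamps)

# Recover the original timestamps in key order: the newest comes first.
newest_first = [LONG_MAX - int(key) for key in keys]
print(newest_first)
```

Note that this only shows the ordering; it says nothing about write distribution, which is the hotspot concern raised in the same passage.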
Mcloudgirl
2 years, 1 month ago
I agree, based on the docs, B. Leading with a non-reversed timestamp will lead to hotspotting, reversed is the way to go.
upvoted 2 times
...
...
zellck
Highly Voted 2 years, 1 month ago
Selected Answer: B
B is the answer. https://cloud.google.com/bigtable/docs/schema-design#time-based If you usually retrieve the most recent records first, you can use a reversed timestamp in the row key by subtracting the timestamp from your programming language's maximum value for long integers (in Java, java.lang.Long.MAX_VALUE). With a reversed timestamp, the records will be ordered from most recent to least recent.
upvoted 12 times
...
shangning007
Most Recent 2 weeks ago
Selected Answer: D
I don't think any answer is correct. A lot of people upvoted B, but according to https://cloud.google.com/bigtable/docs/schema-design#time-based, "As with any timestamp, avoid starting a row key with a reversed timestamp so that you don't cause hotspots."
upvoted 2 times
...
ToiToi
2 months ago
Selected Answer: D
Why other options are not as suitable: A and B (One table for all indices): Storing all indices in a single table can lead to performance issues as the table grows larger. It also makes it harder to scale individual indices independently. C (Timestamp as row key): Using a regular timestamp would place the most recent data at the end of the table, making it less efficient to retrieve the latest prices.
upvoted 2 times
...
SamuelTsch
2 months ago
Selected Answer: D
From my point of view, Option B and Option D are both correct. It depends on the situation. If the information needs to be retrieved per stock index, then D is more suitable. Otherwise B.
upvoted 2 times
...
mayankazyour
3 months, 4 weeks ago
Selected Answer: D
1. A reverse timestamp gives the most recent stock prices first. 2. Having a different table for each stock is more efficient and improves query performance, and option B doesn't specify the stock in the row key.
upvoted 2 times
...
iooj
5 months ago
Selected Answer: A
Row keys that start with a timestamp (whether reversed or not) cause sequential writes to be pushed onto a single node, creating a hotspot. If you put a timestamp in a row key, precede it with a high-cardinality value (the index in our case) to avoid hotspots. The ideal option would be: "use the index and reversed timestamp as the row key design".
upvoted 4 times
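The "index and reversed timestamp" design described in this comment can be sketched as follows. This is a simulation under stated assumptions, not the Bigtable API: a sorted Python list stands in for lexicographically ordered rows, and the `index#reversed_timestamp` layout, the `row_key` and `latest_key_for` helpers, and the index names are all hypothetical.

```python
import bisect
from typing import List, Optional

LONG_MAX = 2**63 - 1  # stand-in for Java's Long.MAX_VALUE


def row_key(index: str, timestamp_ms: int) -> str:
    """Build a hypothetical 'index#reversed_timestamp' row key."""
    return f"{index}#{LONG_MAX - timestamp_ms:019d}"


def latest_key_for(sorted_keys: List[str], index: str) -> Optional[str]:
    """Return the first key under the index prefix, which is the newest row."""
    prefix = f"{index}#"
    pos = bisect.bisect_left(sorted_keys, prefix)
    if pos < len(sorted_keys) and sorted_keys[pos].startswith(prefix):
        return sorted_keys[pos]
    return None


# A sorted list stands in for a table whose rows are ordered by key.
table = sorted(
    row_key(index, ts)
    for index in ("DJIA", "NASDAQ", "SP500")
    for ts in (1_700_000_000_000, 1_700_000_060_000, 1_700_000_120_000)
)

latest = latest_key_for(table, "NASDAQ")
print(latest)
```

In this layout the newest row for an index is the first row of that index's prefix range, so a prefix read limited to one row would be enough to fetch it.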
...
datapassionate
11 months, 3 weeks ago
Selected Answer: B
B is a correct answer because "you need to access only the most recent stock prices" "If you usually retrieve the most recent records first, you can use a reversed timestamp in the row key by subtracting the timestamp from your programming language's maximum value for long integers (in Java, java.lang.Long.MAX_VALUE). With a reversed timestamp, the records will be ordered from most recent to least recent." https://cloud.google.com/bigtable/docs/schema-design#time-based
upvoted 4 times
...
Selected Answer: B
B. One unique table for all indices, reverse timestamp as row key: A single table for all indices keeps the structure simple. Using a reverse timestamp as part of the row key ensures that the most recent data comes first in the sorted order. This design is beneficial for quickly accessing the latest data. For example: you can subtract the timestamp from the maximum long value and store the zero-padded result, ensuring newer dates and times are sorted lexicographically before older ones. A plain "yyyyMMddHHmmss" string, by contrast, sorts oldest first.
upvoted 2 times
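The ordering of the two key styles mentioned in this comment can be checked with a short sketch. It assumes Python's `%Y%m%d%H%M%S` as the equivalent of the "yyyyMMddHHmmss" pattern; the `reversed_key` helper and the sample times are hypothetical.

```python
from datetime import datetime, timezone

LONG_MAX = 2**63 - 1  # stand-in for Java's Long.MAX_VALUE

older = datetime(2024, 1, 1, 9, 30, 0, tzinfo=timezone.utc)
newer = datetime(2024, 1, 1, 9, 31, 0, tzinfo=timezone.utc)


def plain_key(dt: datetime) -> str:
    """Format as yyyyMMddHHmmss; sorts oldest-first as a string."""
    return dt.strftime("%Y%m%d%H%M%S")


def reversed_key(dt: datetime) -> str:
    """Subtract epoch seconds from LONG_MAX and zero-pad; sorts newest-first."""
    return f"{LONG_MAX - int(dt.timestamp()):019d}"


plain_sorted = sorted(plain_key(dt) for dt in (newer, older))
reversed_sorted = sorted(reversed_key(dt) for dt in (older, newer))

print(plain_sorted[0] == plain_key(older))        # True: oldest comes first
print(reversed_sorted[0] == reversed_key(newer))  # True: newest comes first
```

Formatting the date alone does not reverse the sort order; the subtraction from the maximum value is what puts newer entries first.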
...
kshehadyx
1 year, 3 months ago
Correct Is B
upvoted 1 times
...
arien_chen
1 year, 4 months ago
Selected Answer: D
Option B uses a reverse timestamp only, so it is not the answer. The right answer should use the index and reverse timestamp as the row key. So Option D is the only answer, because A, B, and C are ruled out.
upvoted 6 times
...
Lanro
1 year, 5 months ago
Selected Answer: B
https://cloud.google.com/bigtable/docs/schema-design#row-keys - "If you usually retrieve the most recent records first, you can use a reversed timestamp." B it is.
upvoted 1 times
...
Chom
1 year, 5 months ago
Selected Answer: A
A is the answer
upvoted 2 times
...
vaga1
1 year, 6 months ago
Selected Answer: B
The answer relies on whether the application needs to access all the indexes at the same time or not. If yes, then it is B; if no, it is A. In my mind the answer is yes, so B makes more sense: I retrieve the whole list at the same time.
upvoted 1 times
...
ajdf
1 year, 6 months ago
Selected Answer: B
https://cloud.google.com/bigtable/docs/schema-design#time-based If you usually retrieve the most recent records first, you can use a reversed timestamp in the row key by subtracting the timestamp from your programming language's maximum value for long integers (in Java, java.lang.Long.MAX_VALUE). With a reversed timestamp, the records will be ordered from most recent to least recent.
upvoted 1 times
...
WillemHendr
1 year, 6 months ago
Selected Answer: B
"access the data with the simplest query"
upvoted 1 times
...
Prudvi3266
1 year, 8 months ago
Selected Answer: A
Yes, a reverse timestamp is recommended to prevent hotspots. But our query pattern needs the most recent record, which is easy when you use a timestamp. Also, option A states that the row key does not start with the timestamp (it is index#timestamp), which is the most efficient design for this scenario.
upvoted 4 times
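For comparison, the `index#timestamp` layout from option A can be simulated the same way. This is a rough sketch, not Bigtable client code: a sorted list stands in for the ordered rows, and the `forward_key` helper, the zero padding, and the index names are assumptions made for illustration.

```python
import bisect


def forward_key(index: str, timestamp_ms: int) -> str:
    """Build a hypothetical 'index#timestamp' row key (timestamp not reversed)."""
    return f"{index}#{timestamp_ms:019d}"


# A sorted list stands in for a table whose rows are ordered by key.
table = sorted(
    forward_key(index, ts)
    for index in ("DJIA", "NASDAQ", "SP500")
    for ts in (1_700_000_000_000, 1_700_000_060_000, 1_700_000_120_000)
)

# Locate the key range for one index. '$' is the character right after '#',
# so "NASDAQ$" is the first possible key past the "NASDAQ#" prefix.
start = bisect.bisect_left(table, "NASDAQ#")
end = bisect.bisect_left(table, "NASDAQ$")
rows = table[start:end]

oldest, newest = rows[0], rows[-1]
print(newest)
```

With a forward timestamp the newest row sits at the end of the index's range, so a forward prefix read reaches it last; with a reversed timestamp it would be the first row returned.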
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other