
Exam Professional Cloud Architect topic 9 question 4 discussion

Actual exam question from Google's Professional Cloud Architect
Question #: 4
Topic #: 9

For this question, refer to the TerramEarth case study. A new architecture that writes all incoming data to BigQuery has been introduced. You notice that the data is dirty, and want to ensure data quality on an automated daily basis while managing cost.
What should you do?

  • A. Set up a streaming Cloud Dataflow job, receiving data by the ingestion process. Clean the data in a Cloud Dataflow pipeline.
  • B. Create a Cloud Function that reads data from BigQuery and cleans it. Trigger the Cloud Function from a Compute Engine instance.
  • C. Create a SQL statement on the data in BigQuery, and save it as a view. Run the view daily, and save the result to a new table.
  • D. Use Cloud Dataprep and configure the BigQuery tables as the source. Schedule a daily job to clean the data.
Suggested Answer: D 🗳️
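For readers who want a concrete picture of what "ensure data quality on an automated daily basis while managing cost" can look like outside the Dataprep UI, here is a minimal sketch of a daily clean-up job over a BigQuery table using the BigQuery Python client (closer in spirit to option C than to option D's Dataprep flow). The project, dataset, table names, and the cleaning SQL are hypothetical; a real setup would trigger this from Cloud Scheduler, a cron job, or a BigQuery scheduled query.

    from google.cloud import bigquery

    # Hypothetical project/dataset/table names; the cleaning rules are only examples.
    client = bigquery.Client(project="terramearth-analytics")

    CLEANING_SQL = """
    CREATE OR REPLACE TABLE telemetry.daily_clean AS
    SELECT DISTINCT *
    FROM telemetry.raw_events
    WHERE vehicle_id IS NOT NULL
      AND event_timestamp IS NOT NULL
    """

    def run_daily_clean():
        job = client.query(CLEANING_SQL)   # start the clean-up query job
        job.result()                       # block until the job finishes
        print(f"Clean-up job {job.job_id} completed")

    if __name__ == "__main__":
        run_daily_clean()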

Comments

Sj10
Highly Voted 4 years, 9 months ago
Option D, as the data needs to be cleaned. Dataprep has the capabilities to clean dirty data.
upvoted 34 times
tartar
4 years, 3 months ago
D is ok
upvoted 13 times
...
melono
2 years, 1 month ago
looks like D https://cloud.google.com/dataprep
upvoted 2 times
...
motty
4 years, 5 months ago
Dataprep is a GUI-driven process for analysing ad hoc data dumped on GCS; it has no place in this use case.
upvoted 5 times
...
...
vindahake
Highly Voted 4 years, 8 months ago
automated daily ... answer is D
upvoted 12 times
...
odacir
Most Recent 1 year ago
Selected Answer: D
Cloud Dataprep is not cheap. Today I would recommend a scheduled Dataform or dbt job for cleaning...
upvoted 1 times
...
red_panda
1 year, 4 months ago
Selected Answer: D
D without any doubt. Dataflow is for data processing; Dataprep is for data preparation (and cleaning).
upvoted 2 times
...
RVivek
1 year, 9 months ago
Selected Answer: D
B & C does not make sense. A is costly and in realtime The question says on daily basis and cost effective hence D
upvoted 4 times
...
surajkrishnamurthy
1 year, 11 months ago
Selected Answer: D
D is the correct answer
upvoted 2 times
...
megumin
2 years ago
Selected Answer: D
D is ok
upvoted 1 times
...
cbarg
2 years, 2 months ago
Selected Answer: D
Answer is D. Please refer to this example: https://medium.com/google-cloud/how-to-schedule-a-bigquery-etl-job-with-dataprep-b1c314883ab9
upvoted 4 times
...
ShadowLord
2 years, 2 months ago
Selected Answer: A
The answer should be A. 1. The cost of D would be higher: you first load the dirty data into BigQuery and then run Dataprep jobs to clean it and load it into a different target, so the overall cost of scanning and loading is roughly double. Identifying already-clean data versus dirty data also becomes a challenge on a daily basis once the data has grown significantly. 2. The data stream can be used to cleanse the data while loading.
upvoted 5 times
dayody
2 years, 2 months ago
You cannot clean data with Dataflow, only with Dataprep.
upvoted 2 times
Begum
2 months ago
Why not? We have done it using both...
upvoted 1 times
...
...
...
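The thread above debates whether a Dataflow pipeline can clean data at all. It can: below is a minimal, hypothetical Apache Beam sketch of option A's approach, cleaning records in a streaming pipeline before they ever reach BigQuery. The Pub/Sub topic, BigQuery table, and field names are assumptions for illustration, not part of the case study, and a real pipeline would also pass --runner=DataflowRunner and related options.

    import json
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    def clean(record):
        # Hypothetical clean-up rules: drop records without a vehicle ID
        # and clamp fuel readings to a sane range.
        if record.get("vehicle_id") and record.get("fuel_level") is not None:
            record["fuel_level"] = max(0.0, min(float(record["fuel_level"]), 100.0))
            yield record

    options = PipelineOptions(streaming=True)

    with beam.Pipeline(options=options) as p:
        (p
         | "ReadTelemetry" >> beam.io.ReadFromPubSub(
               topic="projects/terramearth/topics/telemetry")
         | "Parse" >> beam.Map(json.loads)
         | "Clean" >> beam.FlatMap(clean)
         | "WriteClean" >> beam.io.WriteToBigQuery(
               "terramearth:telemetry.clean_events",
               # assumes the target table already exists
               create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
               write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))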
DrishaS4
2 years, 3 months ago
Selected Answer: D
automated daily ... answer is D
upvoted 3 times
...
AzureDP900
2 years, 4 months ago
D is perfect for cleaning up the data daily!
upvoted 2 times
...
vincy2202
2 years, 11 months ago
Selected Answer: D
D is the correct answer
upvoted 1 times
...
pakilodi
2 years, 11 months ago
Selected Answer: D
Vote D
upvoted 1 times
...
joe2211
2 years, 12 months ago
Selected Answer: D
vote D
upvoted 1 times
...
gonzalopf94
3 years ago
The answer is A. Dataprep uses a UI to perform the cleaning process, and under the hood it uses Dataflow to do the work, so I will go with A.
upvoted 4 times
...
[Removed]
3 years, 1 month ago
A and D will both serve the purpose, but A is more expensive and the ask is a daily clean-up of the data. D is the right choice.
upvoted 2 times
...
ZappsterB
3 years, 1 month ago
Should be A. Dirty data may not be formatted to suit the table structure, and then it won't make it into the table to be 'cleansed' later.
upvoted 2 times
...
Community vote distribution: A (35%), C (25%), B (20%), Other