Exam Professional Data Engineer topic 1 question 65 discussion

Actual exam question from Google's Professional Data Engineer
Question #: 65
Topic #: 1

You are building a data pipeline on Google Cloud. You need to prepare data using a casual method for a machine-learning process. You want to support a logistic regression model. You also need to monitor and adjust for null values, which must remain real-valued and cannot be removed. What should you do?

  • A. Use Cloud Dataprep to find null values in sample source data. Convert all nulls to 'none' using a Cloud Dataproc job.
  • B. Use Cloud Dataprep to find null values in sample source data. Convert all nulls to 0 using a Cloud Dataprep job.
  • C. Use Cloud Dataflow to find null values in sample source data. Convert all nulls to 'none' using a Cloud Dataprep job.
  • D. Use Cloud Dataflow to find null values in sample source data. Convert all nulls to 0 using a custom script.
Suggested Answer: B
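As a rough illustration of what the suggested answer does (find nulls in sample data, then convert them to 0), here is a minimal plain-Python sketch. It is not Cloud Dataprep code; the column names and sample rows are made up for the example, and `None` stands in for a null value.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def count_nulls(rows: List[Row]) -> Dict[str, int]:
    """Count missing (None) values per column, for monitoring."""
    counts: Dict[str, int] = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (1 if value is None else 0)
    return counts


def fill_nulls_with_zero(rows: List[Row]) -> List[Dict[str, float]]:
    """Return copies of the rows with every None replaced by 0.0."""
    return [
        {column: 0.0 if value is None else value for column, value in row.items()}
        for row in rows
    ]


# Hypothetical sample data; "age" and "income" are illustrative column names.
sample = [
    {"age": 34.0, "income": None},
    {"age": None, "income": 52000.0},
    {"age": 41.0, "income": 61000.0},
]

null_counts = count_nulls(sample)      # monitor: how many nulls per column
cleaned = fill_nulls_with_zero(sample)  # adjust: every value is now a real number
```

The rows are kept rather than dropped, which matches the requirement that null values "cannot be removed".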

Comments

jvg637
Highly Voted 4 years, 8 months ago
Real-valued fields cannot be null, N/A, or empty; they have to be “0”, so it has to be B.
upvoted 39 times
...
[Removed]
Highly Voted 4 years, 8 months ago
Should be B
upvoted 16 times
...
Erg_de
Most Recent 2 weeks, 5 days ago
Selected Answer: D
Option D: Using null value conversion to 0 is the most correct practice for this case. Accompanying it with a script allows us to implement the necessary logic to handle null cases properly, adapting to the model while maintaining data integrity.
upvoted 1 times
...
AjoeT
7 months, 3 weeks ago
Selected Answer: B
B. Dataprep has the feature to convert it into 0.
upvoted 1 times
...
niru12376
8 months, 3 weeks ago
0 is still a value, which can add bias to the model, and the model will take that into account when making predictions, so the answer should be 'none'.
upvoted 1 times
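One common way to address the bias concern while still keeping every feature real-valued is to pair the zero-filled value with a "was missing" indicator. This is not one of the exam options; the sketch below is only an illustration, and the function name is hypothetical.

```python
from typing import List, Optional, Tuple


def zero_fill_with_indicator(
    values: List[Optional[float]],
) -> List[Tuple[float, float]]:
    """Map each value to (filled_value, was_missing); both parts are real numbers."""
    return [(0.0, 1.0) if v is None else (v, 0.0) for v in values]


# A genuine 0.0 and a filled-in null stay distinguishable via the second element.
features = zero_fill_with_indicator([3.5, None, 0.0])
```

With the indicator, a model can learn a separate weight for "value was missing" instead of treating a filled null exactly like a measured zero.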
...
Nandababy
11 months, 2 weeks ago
Why not D? The keyword is "monitor". B would replace all empty fields and also cause unintended bias.
upvoted 1 times
Nandababy
11 months, 2 weeks ago
However, sergiomujica is right. If we need to prepare data using a casual method, then it's B, "Dataprep".
upvoted 1 times
...
...
sergiomujica
1 year, 2 months ago
The question says "You need to prepare data using a casual method". That's Dataprep, and the values should be 0, so the right answer is B.
upvoted 1 times
...
Mathew106
1 year, 4 months ago
Selected Answer: B
No brainer. We need a real value and Dataprep is made for this. Dataflow is mainly for pre-processing before BigQuery ingests the data.
upvoted 1 times
...
theseawillclaim
1 year, 4 months ago
Selected Answer: B
Dataprep is made for this kind of stuff, no reason to use a streaming service such as Dataflow.
upvoted 1 times
...
Oleksandr0501
1 year, 7 months ago
Selected Answer: B
gpt: Cloud Dataprep is a data preparation service that can be used to transform, clean and shape data in a visually interactive way. It provides an easy-to-use interface to find and replace null values. Cloud Dataflow is a fully-managed service for executing data processing pipelines, which allows for parallel execution of data processing tasks. However, it requires more expertise to set up and operate than Cloud Dataprep, and is usually used for more complex data processing needs. Therefore, option B is the most suitable approach for the given requirements.
upvoted 1 times
...
samdhimal
1 year, 10 months ago
Seems to me like the answers are both B and D. B is faster to implement while D takes time. That doesn't mean it's wrong, though. I'm not sure why everyone has picked just B. Why not D? D works and does the same job. Also, a custom script provides more flexibility and control over the data processing tasks, and it lets you handle missing values in a more flexible and efficient way.
upvoted 2 times
rajm893
1 year, 6 months ago
The "casual way", or easy way, to convert nulls to 0 is to use a Dataprep job rather than a custom script.
upvoted 1 times
...
AmmarFasih
1 year, 6 months ago
A simple rule: whenever a GCP service is available for a task, always recommend the GCP service over any other option.
upvoted 1 times
...
...
GCPpro
1 year, 10 months ago
B is the correct answer.
upvoted 1 times
...
AzureDP900
1 year, 10 months ago
The answer is B: use Cloud Dataprep to find null values in sample source data, then convert all nulls to 0 using a Cloud Dataprep job. Key phrases are "casual method", "need to replace null with real values", and "logistic regression". Logistic regression works on numbers, so nulls need to be replaced with a number. And Cloud Dataprep is the best casual tool out of the given options.
upvoted 3 times
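To make the "logistic regression works on numbers" point concrete, here is a small sketch of a logistic regression prediction in plain Python. The weights, bias, and feature values are hypothetical; the only point is that a feature filled with 0 can enter the weighted sum, while the string 'none' cannot.

```python
import math
from typing import Sequence


def predict_proba(
    weights: Sequence[float], bias: float, features: Sequence[float]
) -> float:
    """Logistic regression prediction: sigmoid of the weighted sum of features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


weights = [0.8, -0.5]  # hypothetical coefficients, not from a trained model
bias = 0.1

# A null converted to 0 is a valid real-valued input.
p_zero_filled = predict_proba(weights, bias, [1.2, 0.0])

# A null converted to the string 'none' cannot be multiplied by a float weight.
try:
    predict_proba(weights, bias, [1.2, "none"])
    string_fill_works = True
except TypeError:
    string_fill_works = False
```

This is why options A and C, which convert nulls to 'none', do not satisfy the "must remain real-valued" requirement.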
...
DGames
1 year, 11 months ago
Selected Answer: B
real value 0
upvoted 1 times
...
byash1
2 years, 10 months ago
Selected Answer: B
It is B
upvoted 2 times
...
medeis_jar
2 years, 10 months ago
Selected Answer: B
Dataprep + real value (0)
upvoted 1 times
...
MaxNRG
2 years, 11 months ago
Selected Answer: B
Dataprep is the tool, so A or B. Since the values need to be real-valued, they cannot be null, N/A, or empty; they have to be “0”, so it has to be B.
upvoted 3 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted