Exam Professional Cloud Database Engineer topic 1 question 84 discussion

Actual exam question from Google's Professional Cloud Database Engineer
Question #: 84
Topic #: 1

Your company is using Cloud SQL for MySQL with an internal (private) IP address and wants to replicate some tables into BigQuery in near-real time for analytics and machine learning. You need to ensure that replication is fast and reliable and uses Google-managed services. What should you do?

  • A. Develop a custom data replication service to send data into BigQuery.
  • B. Use Cloud SQL federated queries.
  • C. Use Database Migration Service to replicate tables into BigQuery.
  • D. Use Datastream to capture changes, and use Dataflow to write those changes to BigQuery.
Suggested Answer: D
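
To make the idea behind option D concrete, here is a minimal sketch of how a stream of captured changes can be applied to a replica table. It is a conceptual illustration only: the `ChangeEvent` fields and the `apply_changes` helper are hypothetical names, and this is not Datastream's event schema or the Dataflow template's actual logic.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass
class ChangeEvent:
    """Hypothetical change record; not Datastream's real event format."""
    op: str                               # "INSERT", "UPDATE", or "DELETE"
    key: int                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # new row image; None for deletes
    position: int = 0                     # ordering position in the change log


def apply_changes(replica: Dict[int, Dict[str, Any]],
                  events: Iterable[ChangeEvent]) -> Dict[int, Dict[str, Any]]:
    """Apply change events to an in-memory replica in log order."""
    for event in sorted(events, key=lambda e: e.position):
        if event.op in ("INSERT", "UPDATE"):
            # Upsert: the latest row image replaces whatever was there.
            replica[event.key] = dict(event.row or {})
        elif event.op == "DELETE":
            # Deleting a missing key is a no-op, so replays are harmless.
            replica.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")
    return replica


replica = apply_changes({}, [
    ChangeEvent("INSERT", 1, {"name": "a"}, position=1),
    ChangeEvent("UPDATE", 1, {"name": "b"}, position=2),
    ChangeEvent("INSERT", 2, {"name": "c"}, position=3),
    ChangeEvent("DELETE", 2, position=4),
])
print(replica)  # {1: {'name': 'b'}}
```

In the managed setup, the capture step and the apply step are handled by the services themselves; the sketch only shows why ordered change events are enough to keep a copy in sync without re-reading the source tables.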

Comments

dynamic_dba
Highly Voted 1 year, 7 months ago
D. Anytime you see words like “develop” or “manually” be suspicious given this is cloud and everything is supposed to be automated and point-and-click easy. Eliminate A. Federated queries are SQL queries initiated FROM BigQuery to Cloud Spanner or Cloud SQL databases. So B doesn’t make sense. The Database Migration Service does not support BigQuery as a destination database engine. Eliminate C. That leaves D. From Google’s documentation, “Datastream is a serverless and easy-to-use Change Data Capture (CDC) and replication service that allows you to synchronize data across heterogeneous databases, storage systems, and applications reliably and with minimal latency. Datastream supports change data streaming from Oracle and MySQL databases to Google Cloud Storage (GCS). The service offers streamlined integration with Dataflow templates to power up to date materialized views in BigQuery for analytics, replicate their databases into Cloud SQL or Cloud Spanner for database synchronization, or leverage the event stream directly from GCS to realize event-driven architectures.”
upvoted 12 times
dija123
Most Recent 5 months ago
Selected Answer: D
Agree with D
upvoted 1 times
Pime13
11 months, 4 weeks ago
Selected Answer: D
D because we need replication: As a data analyst, you can query data in Cloud SQL from BigQuery using federated queries (...) Alternatively, to replicate data into BigQuery, you can also use Cloud Data Fusion or Datastream. Datastream is a serverless and easy-to-use change data capture (CDC) and replication service that lets you synchronize data reliably, and with minimal latency. Datastream provides seamless replication of data from operational databases into BigQuery. reference: https://cloud.google.com/bigquery/docs/cloud-sql-federated-queries https://cloud.google.com/datastream/docs/overview
upvoted 1 times
pico
1 year, 2 months ago
You cannot connect Datastream directly to a Cloud SQL instance that has only an internal IP, because Datastream cannot reach the instance on its own. You need a compute instance running a SQL proxy to bridge the traffic between Datastream and Cloud SQL. https://github.com/rocketechgroup/mysql-to-bq-datastream
upvoted 2 times
pico
1 year, 2 months ago
B As a data analyst, you can query data in Cloud SQL from BigQuery using federated queries. BigQuery Cloud SQL federation enables BigQuery to query data residing in Cloud SQL in real time, without copying or moving data. Query federation supports both MySQL (2nd generation) and PostgreSQL instances in Cloud SQL. Alternatively, to replicate data into BigQuery, you can also use Cloud Data Fusion or Datastream. For more about using Cloud Data Fusion, see Replicating data from MySQL to BigQuery. https://cloud.google.com/bigquery/docs/cloud-sql-federated-queries
upvoted 1 times
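
For contrast with replication, a federated query reads Cloud SQL in place from BigQuery through the `EXTERNAL_QUERY` function. The sketch below only builds the SQL text; the `build_federated_query` helper, the connection ID, and the `orders` table are hypothetical, and the quoting is deliberately simplistic.

```python
def build_federated_query(connection_id: str, external_sql: str) -> str:
    """Build a BigQuery statement that runs external_sql on Cloud SQL.

    Hypothetical helper for illustration; it refuses inputs that would
    need escaping instead of trying to escape them.
    """
    for value in (connection_id, external_sql):
        if '"' in value or "\\" in value or "\n" in value:
            raise ValueError(
                "double quotes, backslashes, and newlines are not handled here"
            )
    return f'SELECT * FROM EXTERNAL_QUERY("{connection_id}", "{external_sql}");'


# Assumed connection ID and table name, for illustration only.
print(build_federated_query(
    "my-project.us.my-connection",
    "SELECT id, total FROM orders",
))
```

Nothing is copied into BigQuery by such a query: each run goes back to the Cloud SQL instance, which is why the other comments treat option B as querying rather than replication.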
chelbsik
1 year, 10 months ago
Selected Answer: D
I'll go for D
upvoted 2 times
pk349
1 year, 10 months ago
D: Use Datastream to capture changes, and use Dataflow to write those changes to BigQuery. Dataflow is a fully managed streaming analytics service that minimizes latency, processing time, and cost through autoscaling and batch processing. Dataflow is a managed service for executing a wide variety of data processing patterns. The documentation on this site shows you how to deploy your batch and streaming data processing pipelines using Dataflow, including directions for using service features.
upvoted 3 times
sp57
1 year, 10 months ago
https://cloud.google.com/datastream-for-bigquery
upvoted 1 times
sp57
1 year, 10 months ago
The linked article confirms that Datastream + Dataflow is a "thing". It provides additional customization compared with Datastream alone.
upvoted 1 times