An organization wants to build a data pipeline to transform its data so it can be reconciled in a data warehouse. The solution must be scalable and require little or no management. Which Google product or service should the organization choose?
Correct answer: D. Dataflow
Dataflow provides unified stream and batch data processing at scale. Use it to create data pipelines that read from one or more sources, transform the data, and write the data to a destination.
Typical use cases for Dataflow:
- Data movement: Ingesting data or replicating data across subsystems.
- ETL (extract-transform-load) workflows that ingest data into a data warehouse such as BigQuery.
- Powering BI dashboards.
- Applying ML in real time to streaming data.
- Processing sensor data or log data at scale.
Dataflow uses the same programming model for both batch and stream analytics. You can ingest, process, and analyze fluctuating volumes of real-time data.
https://cloud.google.com/dataflow/docs/overview
Incorrect options:
A. Cloud Bigtable: Cloud Bigtable is a NoSQL database service designed for storing large amounts of data with low-latency access. While it is useful for certain types of data storage and analysis, it is not specifically designed for building and managing data transformation pipelines.
B. Cloud Storage: Cloud Storage is a scalable object storage service, ideal for storing large volumes of unstructured data. However, it does not offer direct functionality for transforming data or building data pipelines; additional tools or services like Dataflow would be required to transform data stored in Cloud Storage.
C. Pub/Sub: Pub/Sub is a messaging service used for building event-driven systems and real-time messaging. While it is useful for ingesting data into a pipeline, it does not provide the data transformation or reconciliation capabilities needed for this use case. It is often used in combination with other services like Dataflow.
D. Dataflow: Google Cloud Dataflow is a fully managed service designed to simplify processing large amounts of data in both batch and stream modes. Dataflow is especially suited for data integration and ETL (extract, transform, load) tasks, making it ideal for preparing data for a data warehouse. It is built on Apache Beam, which provides a unified programming model for defining data processing pipelines. Dataflow scales by adjusting resource allocation dynamically based on the workload, and because it is a managed service, the overhead of managing server infrastructure is minimal.
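To make the ETL idea concrete, here is a minimal sketch in plain Python of the extract-and-transform logic such a pipeline would apply. The field names (`user_id`, `amount_cents`) are hypothetical; in a real Dataflow job these steps would be expressed as Apache Beam transforms (e.g. `beam.Map`) and the results written to a BigQuery table.

```python
# Sketch of the ETL logic a Dataflow (Apache Beam) pipeline would run.
# Field names are made up for illustration; a real pipeline would
# express these steps as Beam transforms and write to BigQuery.

def extract(line):
    """Parse one raw CSV record into a dict."""
    user_id, amount = line.split(",")
    return {"user_id": user_id.strip(), "amount_cents": int(amount)}

def transform(record):
    """Convert cents to dollars, the shape the warehouse table expects."""
    return {"user_id": record["user_id"],
            "amount_usd": record["amount_cents"] / 100}

def run_pipeline(raw_lines):
    """Extract then transform each record; Dataflow would distribute
    and parallelize these steps across workers automatically."""
    return [transform(extract(line)) for line in raw_lines]

rows = run_pipeline(["alice, 1250", "bob, 300"])
# rows == [{"user_id": "alice", "amount_usd": 12.5},
#          {"user_id": "bob", "amount_usd": 3.0}]
```

The point of the sketch is the shape of the workload: per-record, embarrassingly parallel transforms with no servers for you to manage, which is exactly what Dataflow's managed, autoscaling runtime handles.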