In the medallion architecture, Bronze tables are the first stage in the data pipeline and directly represent raw data ingested into the system. The raw data is stored in its original form but typically has a schema applied so that it can be queried as a table in a storage layer such as Delta Lake.
Why E is correct:
Bronze tables contain the raw data as-is but with a defined schema to enable easier downstream processing and integration.
This schema provides structure to the otherwise unstructured or semi-structured raw data.
E. Bronze tables contain raw data with a schema applied.
In a typical data processing pipeline following a "Bronze-Silver-Gold" data lakehouse architecture, Bronze tables are the initial stage where raw data is ingested and transformed into a structured format with a schema applied. The schema provides structure and meaning to the raw data, making it more usable and accessible for downstream processing.
Therefore, Bronze tables contain the raw data but in a structured and schema-enforced format, which makes them distinct from the unprocessed, unstructured raw data files.
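To make "raw data with a schema applied" concrete, here is a minimal sketch in plain Python. It is not how Databricks implements Bronze tables (there you would normally use Spark and Delta Lake); the sample records, the column names, and the `apply_schema` helper are all made up for illustration.

```python
import json

# Hypothetical raw input: newline-delimited JSON as it might arrive from a source system.
RAW_LINES = [
    '{"order_id": "1001", "amount": "19.99", "country": "DE"}',
    '{"order_id": "1002", "amount": "5.00", "country": "FR"}',
]

# Hypothetical Bronze schema: column name -> Python type used to cast each value.
BRONZE_SCHEMA = {"order_id": int, "amount": float, "country": str}


def apply_schema(raw_line, schema):
    """Parse one raw record and cast each field to its declared type.

    No rows are filtered and nothing is cleaned or aggregated;
    the record only gains column names and types.
    """
    record = json.loads(raw_line)
    return {column: cast(record[column]) for column, cast in schema.items()}


# Every raw line becomes exactly one typed row.
bronze_rows = [apply_schema(line, BRONZE_SCHEMA) for line in RAW_LINES]
```

The row count is unchanged and no values are dropped or combined, which is the point of option E: same data, now with structure.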
Ans: E
The Bronze layer is where we land all the data from external source systems. The table structures in this layer correspond to the source system table structures "as-is," along with any additional metadata columns that capture the load date/time, process ID, etc. The focus in this layer is quick Change Data Capture and the ability to provide an historical archive of source (cold storage), data lineage, auditability, reprocessing if needed without rereading the data from the source system.
https://www.databricks.com/glossary/medallion-architecture#:~:text=Bronze%20layer%20%28raw%20data%29
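The metadata columns mentioned in that quote can be sketched like this. Again this is plain Python for illustration only; the column names `_load_ts` and `_process_id` and the `add_load_metadata` helper are assumptions, not names taken from the Databricks docs.

```python
import uuid
from datetime import datetime, timezone


def add_load_metadata(rows, process_id, loaded_at):
    """Return copies of the source rows with load metadata columns appended.

    The source columns are kept "as-is"; only the two hypothetical
    metadata columns are added.
    """
    return [
        {**row, "_load_ts": loaded_at.isoformat(), "_process_id": process_id}
        for row in rows
    ]


# Hypothetical rows as they come from the source system.
source_rows = [
    {"customer_id": 7, "name": "Ada"},
    {"customer_id": 8, "name": "Lin"},
]

landed_rows = add_load_metadata(
    source_rows,
    process_id=str(uuid.uuid4()),
    loaded_at=datetime.now(timezone.utc),
)
```

Because each row carries its load time and process ID, a load can be traced or replayed from the Bronze layer without going back to the source system.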
E
Bronze tables are the foundation of the medallion architecture on Delta Lake. They are created from raw data files and contain a schema that describes the data. This makes it easy to query and analyze the data in Bronze tables.
Raw data files, on the other hand, do not have a schema applied. This means that it can be difficult to query and analyze the data in raw data files.
Option A: Bronze tables typically contain more data than raw data files, because they include the schema.
Option B: There is no indication that Bronze tables contain more truthful data than raw data.
Option C: Aggregation is not what separates Bronze tables from raw data. Aggregates are typically built in the Gold layer, while Bronze keeps the data unaggregated.
Option D: Bronze tables typically contain a more refined view of data than raw data, because they include the schema.