Exam Certified Data Engineer Associate topic 1 question 58 discussion

Actual exam question from Databricks's Certified Data Engineer Associate
Question #: 58
Topic #: 1

Which of the following describes a benefit of creating an external table from Parquet rather than CSV when using a CREATE TABLE AS SELECT statement?

  • A. Parquet files can be partitioned
  • B. CREATE TABLE AS SELECT statements cannot be used on files
  • C. Parquet files have a well-defined schema
  • D. Parquet files have the ability to be optimized
  • E. Parquet files will become Delta tables
Suggested Answer: C

Comments

1a44567
4 months, 2 weeks ago
Vote for D. Parquet is a columnar storage file format that supports efficient data compression and encoding schemes, enabling optimization and faster query performance compared to CSV files. The format supports efficient reading and writing of large datasets, making it a preferred choice for big data applications.
upvoted 1 times
...
MDWPartners
6 months ago
Selected Answer: C
The key words are "CREATE TABLE AS SELECT".
upvoted 1 times
...
benni_ale
7 months ago
Selected Answer: C
C is correct
upvoted 1 times
...
UGOTCOOKIES
10 months ago
Selected Answer: C
CREATE TABLE AS SELECT adopts the schema details from the source. Parquet files have a defined schema.
upvoted 2 times
...
bartfto
10 months, 3 weeks ago
Selected Answer: C
C. Parquet has a well-defined schema, unlike CSV.
upvoted 1 times
...
Garyn
11 months ago
Selected Answer: C
C. Parquet files have a well-defined schema. Explanation: Parquet files inherently store metadata about the schema within the files themselves, allowing for a well-defined schema. This schema information includes data types, column names, and other structural information. When creating an external table from Parquet, this schema is retained, providing a structured and well-defined format for the data. It ensures consistency and enables more efficient processing, query optimization, and compatibility across various systems or tools that work with the Parquet format. This structured schema within Parquet files offers advantages in terms of data integrity, ease of data processing, and compatibility, making it a beneficial choice over CSV, which lacks inherent schema information and might need additional handling or inference of schema during data ingestion.
upvoted 1 times
...
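Garyn's point that CSV "lacks inherent schema information" can be seen with a small standard-library sketch. It is only an illustration in plain Python, not Databricks or Spark code, and the sample rows and column names are hypothetical. Typed values are written to CSV text and read back: the header preserves the column names, but every value returns as a string.

```python
import csv
import io

# Hypothetical sample data; the column names are illustrative only.
rows = [
    {"id": 1, "price": 9.99, "in_stock": True},
    {"id": 2, "price": 24.5, "in_stock": False},
]

# Write the typed rows out as CSV text.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "price", "in_stock"])
writer.writeheader()
writer.writerows(rows)

# Read the same text back: the header gives names, but no types survive.
buffer.seek(0)
restored = list(csv.DictReader(buffer))

for original, loaded in zip(rows, restored):
    for column in original:
        print(column, type(original[column]).__name__, "->", type(loaded[column]).__name__)
```

A format that stores the schema alongside the data, as the comment describes for Parquet, avoids this loss because the reader does not have to guess the types.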
AndreFR
11 months, 1 week ago
Selected Answer: B
The key phrase here is CREATE TABLE AS SELECT.
Not A: partitioning is not relevant in a CREATE TABLE AS SELECT statement because the data will be written to a Delta table.
Not C: a Parquet schema is not always well defined; a folder can contain Parquet files with different schemas.
Not D: Parquet files are already optimized, and that is not relevant in a CREATE TABLE AS SELECT statement because the data will be written to a Delta table.
Not E: both CSV and Parquet will become Delta tables in a CREATE TABLE AS SELECT statement.
B: correct answer by elimination.
upvoted 1 times
...
nedlo
11 months, 2 weeks ago
Selected Answer: D
I disagree; I think it's D. A schema can be inferred from CSV as well, but CSV cannot provide the same optimizations as Parquet.
upvoted 1 times
...
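On the claim that a schema can be inferred from CSV as well: that is true, but an inferred schema is a guess that depends on the rows inspected. The sketch below is a deliberately naive inference routine written for illustration (it is not how Spark or Databricks infers schemas, and `infer_type`, `infer_schema`, and the sample file are hypothetical). The same file yields a different type for one column depending on how many rows are sampled.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest of int, float, str that fits every value."""
    for candidate in (int, float):
        try:
            for value in values:
                candidate(value)
            return candidate.__name__
        except ValueError:
            continue
    return "str"


def infer_schema(csv_text, sample_rows=None):
    """Guess a column -> type mapping from the first `sample_rows` rows."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    sample = rows if sample_rows is None else rows[:sample_rows]
    return {name: infer_type([row[name] for row in sample]) for name in rows[0]}


# Hypothetical file: zip_code looks numeric until the third row.
csv_text = "id,zip_code\n1,94105\n2,10001\n3,SW1A 1AA\n"

print(infer_schema(csv_text, sample_rows=2))  # guess from a partial sample
print(infer_schema(csv_text))                 # guess from every row
```

This is the sense in which other commenters call a schema stored with the data "well-defined": nothing has to be guessed from the values.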
FastEddie
1 year ago
Selected Answer: C
CTAS statements automatically infer schema information from query results and do not support manual schema declaration. This means that CTAS statements are useful for external data ingestion from sources with a well-defined schema, such as Parquet files and tables. CTAS statements also do not support specifying additional file options.
upvoted 4 times
...
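The behavior FastEddie describes, where a CTAS statement declares no columns and takes its schema from the query result, can be tried locally. The sketch below uses Python's built-in sqlite3 module purely as a stand-in: SQLite is not Databricks, its type handling differs, and the table and column names are hypothetical. It only shows the general shape of CREATE TABLE AS SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source table with an explicitly declared schema.
conn.execute("CREATE TABLE sales_src (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales_src VALUES (?, ?, ?)",
    [(1, "north", 120.0), (2, "south", 80.0)],
)

# CTAS: no column list or types are declared; they come from the query result.
conn.execute(
    "CREATE TABLE sales_by_region AS "
    "SELECT region, SUM(amount) AS total FROM sales_src GROUP BY region"
)

# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [row[1] for row in conn.execute("PRAGMA table_info(sales_by_region)")]
totals = dict(conn.execute("SELECT region, total FROM sales_by_region"))

print(columns)
print(totals)
```

Because the new table's columns are whatever the SELECT produces, the quality of the result depends on how well the source describes its own data.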
kishore1980
1 year ago
Selected Answer: C
C is the correct option
upvoted 2 times
...
anandpsg101
1 year, 1 month ago
Selected Answer: C
C is correct
upvoted 2 times
...
meow_akk
1 year, 1 month ago
Ans: C. https://www.databricks.com/glossary/what-is-parquet#:~:text=Columnar%20storage%20like%20Apache%20Parquet,compared%20to%20row%2Doriented%20databases. Columnar storage like Apache Parquet is designed to bring efficiency compared to row-based files like CSV. When querying columnar storage, you can skip over the non-relevant data very quickly. As a result, aggregation queries are less time-consuming compared to row-oriented databases.
upvoted 4 times
...
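The quoted idea that columnar storage lets a query skip non-relevant data can be modeled in a few lines. This is a toy in-memory model, not the Parquet file format, and the dataset, `sum_amount_row_store`, and `sum_amount_column_store` are hypothetical names. It counts how many stored values each layout has to read in order to sum a single column, treating the row layout as a sequential scan the way a text file such as CSV is read.

```python
# Hypothetical three-column dataset; names are illustrative only.
records = [
    ("north", 120, "2024-01-01"),
    ("south", 80, "2024-01-01"),
    ("north", 50, "2024-01-02"),
]

# Row-oriented layout: values of one record sit together, like lines of a CSV.
row_store = [value for record in records for value in record]

# Column-oriented layout: values of one column sit together.
column_store = {
    "region": [r[0] for r in records],
    "amount": [r[1] for r in records],
    "day": [r[2] for r in records],
}


def sum_amount_row_store(store, width=3, amount_index=1):
    """Scan every stored value, keeping only those in the amount position."""
    total, values_read = 0, 0
    for position, value in enumerate(store):
        values_read += 1
        if position % width == amount_index:
            total += value
    return total, values_read


def sum_amount_column_store(store):
    """Read only the amount column; the other columns are never touched."""
    amounts = store["amount"]
    return sum(amounts), len(amounts)


print(sum_amount_row_store(row_store))
print(sum_amount_column_store(column_store))
```

Both layouts give the same total, but the column layout reads one third of the values here, and the gap grows with the number of columns the query does not need.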
kbaba101
1 year, 1 month ago
C. CTAS suits sources with a well-defined schema, such as Parquet files and tables, and does not support specifying additional file options, such as the delimiter you would need for CSV.
upvoted 4 times
...
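Why the missing delimiter option matters for CSV can be shown with Python's csv module as a stand-in reader (this is not Databricks code, and the pipe-delimited sample is hypothetical). Without the right delimiter, each line is parsed as one field; with it, the intended columns appear.

```python
import csv
import io

# Hypothetical pipe-delimited file; the contents are illustrative only.
raw = "id|name|city\n1|Ada|London\n2|Linus|Helsinki\n"

# Default dialect assumes a comma, so each line comes back as a single field.
default_rows = list(csv.reader(io.StringIO(raw)))

# Supplying the delimiter recovers the three intended columns.
pipe_rows = list(csv.reader(io.StringIO(raw), delimiter="|"))

print(default_rows[0])
print(pipe_rows[0])
```

A statement that cannot pass such options has to rely on the reader's defaults, which is harmless for a self-describing format and risky for a delimited text file.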