
Exam DP-600 topic 1 question 41 discussion

Actual exam question from Microsoft's DP-600
Question #: 41
Topic #: 1

HOTSPOT -
You have a Fabric tenant that contains a lakehouse.
You are using a Fabric notebook to save a large DataFrame by using the following code:
df.write.partitionBy("year", "month", "day").mode("overwrite").parquet("Files/SalesOrder")
For each of the following statements, select Yes if the statement is true. Otherwise, select No.
- The results will form a hierarchy of folders for each partition key.
- The resulting file partitions can be read in parallel across multiple nodes.
- The resulting file partitions will use file compression.
NOTE: Each correct selection is worth one point.


Comments

Momoanwar
Highly Voted 1 year, 2 months ago
I think Yes, Yes, Yes. Parquet = compression.
upvoted 37 times
Blue_MSBI
1 year, 2 months ago
I think the same
upvoted 7 times
Training_ND
8 months, 1 week ago
Yup, agreed
upvoted 1 times
vimalan
4 months ago
Agreed. Yes, the resulting file partitions will use file compression. When you save a DataFrame in Parquet format using the df.write.partitionBy(...).mode("overwrite").parquet(...) method, the Parquet files are automatically compressed by default.
upvoted 2 times
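A quick way to check this claim in a Fabric (or any Spark) notebook is to read the session-level Parquet codec setting. This is a minimal sketch, assuming the spark session and df from the question and default runtime settings:

# Sketch: inspect the Parquet codec the session will apply on write.
# On a default Spark/Fabric runtime this typically returns "snappy".
print(spark.conf.get("spark.sql.parquet.compression.codec"))

# The write from the question then compresses every part file with that codec.
df.write.partitionBy("year", "month", "day").mode("overwrite").parquet("Files/SalesOrder")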
goldy29
2 months, 3 weeks ago
The .parquet() format supports compression, but compression is not automatically applied unless specified (e.g., df.write.option("compression", "snappy")). By default, Parquet often uses Snappy compression, but this depends on the implementation.
upvoted 2 times
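If you prefer to be explicit rather than rely on the session default, here is a minimal sketch of the per-write option goldy29 mentions (the codec names are the standard values accepted by the Spark Parquet writer):

# Sketch: set the codec explicitly for this write, overriding the session default.
(df.write
   .partitionBy("year", "month", "day")
   .option("compression", "snappy")   # or "gzip", "zstd", "none", ...
   .mode("overwrite")
   .parquet("Files/SalesOrder"))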
estrelle2008
Highly Voted 1 year, 2 months ago
I think so too: YYY, per the code snippet on Microsoft Learn.
upvoted 8 times
estrelle2008
1 year, 2 months ago
Additionally: Parquet files are compressed by default, and you don't need to take any additional action to enable compression. When writing Parquet files, you can specify the desired compression codec (if needed) to further optimize storage and performance.
upvoted 5 times
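One way to confirm which codec was actually applied is to look at the part-file names after the write, since Spark embeds the codec in them. A minimal sketch, assuming the write from the question has already run:

# Sketch: list a few of the physical files behind the dataset and check their suffix.
written = spark.read.parquet("Files/SalesOrder")
for path in written.inputFiles()[:5]:
    print(path)   # file names typically end in ".snappy.parquet" under the default codec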
Rakesh16
Most Recent 5 months, 2 weeks ago
Yes, Yes, Yes.
upvoted 1 times
Mitchell12345
7 months, 3 weeks ago
Technically the results will not form a hierarchy of folders for EACH partition key, right? Because for the day partition key, it is the files themselves that are created. If it were phrased generically, without the "each" part, it would have been better. Did anyone get this on the exam?
upvoted 1 times
AdventureChick
4 months, 2 weeks ago
Yes, there is a level for each "entity" in the partition key. Even if you only had 1 file in a day, that file would be found under a "day" folder.
upvoted 1 times
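To make that folder layout concrete: each partition key becomes one directory level, and the data files sit under the deepest (day) level. A sketch of what to expect, with illustrative paths, plus a read that benefits from partition pruning:

# Illustrative layout produced by the write in the question:
#   Files/SalesOrder/year=2024/month=7/day=15/part-00000-....snappy.parquet
#   Files/SalesOrder/year=2024/month=7/day=16/part-00000-....snappy.parquet
# Reading the root folder restores year/month/day as columns, and filtering on
# them lets Spark skip (prune) whole folders.
sales = spark.read.parquet("Files/SalesOrder")
one_day = sales.filter("year = 2024 AND month = 7 AND day = 15")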
dev2dev
10 months, 4 weeks ago
Yes, Yes, Yes. Compression is an optional parameter that uses 'snappy' compression by default. Unless we specify 'none', compression happens.
upvoted 3 times
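For completeness, a sketch of the opt-out dev2dev describes; "none" is the codec value the Spark Parquet writer accepts, and the output path here is hypothetical:

# Sketch: write the same partitioned layout with compression turned off.
(df.write
   .partitionBy("year", "month", "day")
   .option("compression", "none")
   .mode("overwrite")
   .parquet("Files/SalesOrder_uncompressed"))   # hypothetical path for the uncompressed copy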
282b85d
11 months ago
Y-Y-N.
The results will form a hierarchy of folders for each partition key: Yes. When using partitionBy in Spark, the data is organized into a hierarchical directory structure based on the specified partition keys, so you will see directories like year=YYYY/month=MM/day=DD within the specified output path.
The resulting file partitions can be read in parallel across multiple nodes: Yes. Parquet files are designed for efficient querying and support parallel processing; Spark can read these partitions in parallel, enabling distributed query execution across multiple nodes.
The resulting file partitions will use file compression: No. While the Parquet format supports compression, it is not enabled by default in the code snippet provided. Compression needs to be explicitly specified if required; for example, you could use .option("compression", "snappy") to enable Snappy compression.
upvoted 3 times
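On the parallel-read point, a minimal sketch that shows the dataset comes back as many splits that Spark can schedule across executors (the exact count depends on data volume and cluster size):

# Sketch: a partitioned Parquet dataset is read as multiple splits that Spark
# distributes across executor nodes.
sales = spark.read.parquet("Files/SalesOrder")
print(sales.rdd.getNumPartitions())   # greater than 1 for a large dataset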
vernillen
10 months, 3 weeks ago
Snappy is the default compression type of Parquet, though. So it will create chunks of your file by default and compress them. If you want to have it all in one file, that's when you have to override your default compression. So I disagree with the 'No'.
upvoted 6 times
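The number of files and the compression codec are separate knobs, which may be where the disagreement comes from. A hedged sketch: coalesce controls how many part files land in each partition folder, while the codec (still Snappy by default) controls compression:

# Sketch: produce a single part file per partition folder; the files are still
# Snappy-compressed unless the codec option says otherwise.
(df.coalesce(1)
   .write
   .partitionBy("year", "month", "day")
   .mode("overwrite")
   .parquet("Files/SalesOrder"))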
stilferx
11 months, 3 weeks ago
IMHO, fully agree with colleagues below - Y -> Y -> Y
upvoted 2 times
Nefirs
1 year ago
I think YYY as well.
upvoted 2 times
XiltroX
1 year, 2 months ago
I think it should be NYY, as there is no mention in the code of forming a hierarchy. Please correct me if I'm wrong.
upvoted 1 times
4fd861f
1 year, 1 month ago
partitionBy will create it
upvoted 4 times
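A tiny self-contained check of that claim, using toy data and a hypothetical output folder Files/PartitionDemo:

# Sketch: write a small DataFrame with partitionBy and confirm the folder hierarchy.
demo = spark.createDataFrame(
    [(2024, 7, 15, 100.0), (2024, 7, 16, 250.0)],
    ["year", "month", "day", "amount"],
)
demo.write.partitionBy("year", "month", "day").mode("overwrite").parquet("Files/PartitionDemo")
# The printed paths show year=.../month=.../day=... directory levels.
for path in spark.read.parquet("Files/PartitionDemo").inputFiles():
    print(path)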