Exam Certified Associate Developer for Apache Spark All Questions

View all questions & answers for the Certified Associate Developer for Apache Spark exam

Exam Certified Associate Developer for Apache Spark topic 1 question 59 discussion

Actual exam question from Databricks's Certified Associate Developer for Apache Spark

Question #: 59
Topic #: 1


The code block shown below contains an error. The code block is intended to read a parquet file at the file path filePath into a DataFrame. Identify the error.
Code block:
spark.read.load(filePath, source = "parquet")

A. There is no source parameter to the load() operation – the schema parameter should be used instead.
B. There is no load() operation – it should be parquet() instead.
C. The spark.read operation should be followed by parentheses to return a DataFrameReader object.
D. The filePath argument to the load() operation should be quoted.
E. There is no source parameter to the load() operation – it can be removed.


Suggested Answer: E 🗳️
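
The PySpark documentation linked in the comments below gives the signature as load(path=None, format=None, schema=None, **options). The sketch below does not call Spark at all: load_stub is a hypothetical stand-in that copies only that signature, and the path is a made-up placeholder. Under those assumptions it shows how Python binds the keyword arguments: source is not a named parameter, so it is collected into **options while format stays unset, which is consistent with answer E.

```python
# Hypothetical stand-in that mirrors only the documented signature of
# pyspark.sql.DataFrameReader.load; it does not read any data.
def load_stub(path=None, format=None, schema=None, **options):
    """Return the arguments as Python bound them, for inspection."""
    return {"path": path, "format": format, "schema": schema, "options": options}


# The call from the question: `source` is not a named parameter,
# so it ends up in the catch-all **options dictionary.
as_written = load_stub("data/example.parquet", source="parquet")

# The variant suggested in the comments: `format` is a named parameter.
with_format = load_stub("data/example.parquet", format="parquet")

print(as_written)
print(with_format)
```

This only illustrates argument binding; what a real DataFrameReader does with an extra option named source is not shown here.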

by 4be8126 at May 3, 2023, 12:05 p.m.

Comments


juliom6

4 months, 3 weeks ago

Selected Answer: E

E is correct. The "format" parameter should be used instead of "source"; it defaults to "parquet".
See https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrameReader.load.html
From the documentation: "format: str, optional. Optional string for format of the data source. Default to 'parquet'."

upvoted 1 times


newusername

5 months ago

Selected Answer: E

I would go for E

upvoted 1 times


Singh_Sumit

6 months, 1 week ago

spark.read.load(PARQUET_PATH, format='parquet')
load() is valid if it is given the format parameter.

upvoted 1 times


Ram459

7 months, 3 weeks ago

Selected Answer: E

The intention is to read a parquet file at the file path filePath into a DataFrame.

upvoted 2 times


cookiemonster42

8 months, 1 week ago

The parameters for the load() function are: path, format, schema, **options
A. Overall it makes sense, but do we really need to use schema?
B. There is a load operation, so that's FALSE
C. read is used without parentheses, FALSE
D. It should indeed, but there's no source parameter, FALSE
E. That's true, but we need to put quotes around the filePath, so it's FALSE
That makes it A, but the question is really strange and not clear.

upvoted 2 times

cookiemonster42

8 months, 1 week ago

UPD: parquet already has the schema in it, so the schema parameter is not needed. In that case, I don't know what the answer is.

upvoted 2 times


Larrave

9 months, 2 weeks ago

Selected Answer: E

The answer should be E: remove source, since the default format is 'parquet' anyway. However, it is not ideal to use load(); the dedicated method for the format is preferable.
https://spark.apache.org/docs/3.1.1/api/python/reference/api/pyspark.sql.DataFrameReader.load.html?highlight=dataframereader%20load#pyspark.sql.DataFrameReader.load

upvoted 3 times


ZSun

10 months ago

1. pyspark.sql.SparkSession.read returns a DataFrameReader: https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.SparkSession.read.html#pyspark.sql.SparkSession.read
2. We check this DataFrameReader; it contains both "load" and "parquet" methods.
2.1. For load, the signature is load(path, format, schema): https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrameReader.load.html#pyspark.sql.DataFrameReader.load
Therefore, the answer is A or E. Typically parquet contains schema information.
I do not like this question, because when reading a parquet file you would directly use spark.read.parquet().

upvoted 2 times


4be8126

11 months, 1 week ago

Selected Answer: B

The correct code block to read a parquet file would be spark.read.parquet(filePath).

upvoted 4 times

tmz1

2 months, 1 week ago

The statement "there is no load() operation" in answer B is clearly wrong, as this operation exists in PySpark. E is the correct answer: instead of the source parameter, you should use format to achieve the same result.

upvoted 1 times
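
The comments above mention three ways to express the read: load() with the default source, load() with format given explicitly, and the dedicated parquet() method. A minimal sketch that puts them side by side follows. The function name read_parquet_variants is hypothetical, and it assumes the caller already has a SparkSession (spark) and a path to parquet data (file_path); it is not taken from the exam material.

```python
def read_parquet_variants(spark, file_path):
    """Build the three reads discussed in the thread.

    `spark` is assumed to be an existing SparkSession and `file_path`
    a path to parquet data; both are supplied by the caller.
    """
    # spark.read is a property (no parentheses) that returns a DataFrameReader.
    via_default = spark.read.load(file_path)  # relies on the default data source
    via_format = spark.read.load(file_path, format="parquet")  # format named explicitly
    via_parquet = spark.read.parquet(file_path)  # dedicated parquet reader method
    return via_default, via_format, via_parquet
```

The first variant depends on the default data source being parquet, as the documentation quoted above states, so the second and third are more explicit about intent.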
