Exam Certified Data Engineer Professional topic 1 question 52 discussion

Actual exam question from Databricks's Certified Data Engineer Professional

Question #: 52
Topic #: 1

Review the following error traceback:

Which statement describes the error being raised?

A. The code executed was PySpark but was executed in a Scala notebook.
B. There is no column in the table named heartrateheartrateheartrate.
C. There is a type error because a column object cannot be multiplied.
D. There is a type error because a DataFrame object cannot be multiplied.
E. There is a syntax error because the heartrate column is not correctly identified as a column.

Suggested Answer: B

by CertPeople at Sept. 12, 2023, 8:47 a.m.

Comments

CertPeople

Highly Voted 1 year, 4 months ago

Selected Answer: B

It's B, there is no column with that name

upvoted 8 times

...

rok21

Highly Voted 1 year, 1 month ago

Selected Answer: E

E is correct

upvoted 5 times

...

KadELbied

Most Recent 2 months, 1 week ago

Selected Answer: B

Surely B

upvoted 1 times

...

Stalker200

B is correct. I created a simple DataFrame and ran display(df.select(3*"heartrate")), which raised this error:

AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name `heartrateheartrateheartrate` cannot be resolved. Did you mean one of the following? [`heartrate`, `device_id`, `time`, `mrn`].;
'Project ['heartrateheartrateheartrate]
+- LogicalRDD [device_id#2L, heartrate#3L, mrn#4L, time#5], false

upvoted 1 times

...

examtopicsms99

3 months, 1 week ago

Selected Answer: C

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

# Initialize Spark session
spark = SparkSession.builder.appName("ErrorReproduction").getOrCreate()

# Create a sample DataFrame with similar structure
data = [
    (1, 72, 12345, "2023-01-01 10:00:00"),
    (2, 68, 67890, "2023-01-01 10:01:00"),
    (3, 75, 54321, "2023-01-01 10:02:00"),
]
columns = ["device_id", "heartrate", "mrn", "time"]
df = spark.createDataFrame(data, columns)

# This will produce the AnalysisException.
# The error occurs because you're trying to multiply a string "heartrate" by 3 literally
# instead of referencing the column and multiplying its values.
display(df.select(3*"heartrate"))
# display(df.select(3*col("heartrate")))

upvoted 1 times

...

guillesd

11 months, 2 weeks ago

Selected Answer: B

It's B. Regarding E, a syntax error would mean that the statement itself is malformed and cannot be parsed. That is not the case here: the statement is valid, but the column it names does not exist.

upvoted 2 times

...
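
The point about E can be checked without Spark. The following is a minimal sketch, assuming the failing statement is the `df.select(3*"heartrate")` quoted elsewhere in this thread; `parses_cleanly` is a hypothetical helper name. It only parses the text and never runs it, so it shows that Python accepts the statement as valid syntax and that the failure must come later, when Spark tries to resolve the column name.

```python
import ast

# The statement from the thread, as source text. It is only parsed, never executed,
# so no Spark session or DataFrame is needed.
source = 'df.select(3 * "heartrate")'


def parses_cleanly(code: str) -> bool:
    """Return True if Python accepts the text as syntactically valid."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False


print(parses_cleanly(source))                       # True: valid syntax
print(parses_cleanly('df.select(3 * "heartrate"'))  # False: missing closing parenthesis
```

A real syntax error, such as the unclosed parenthesis in the second call, is rejected before any code runs. The statement in the question gets past that stage.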

Jay_98_11

1 year ago

Selected Answer: B

https://sparkbyexamples.com/spark/spark-cannot-resolve-given-input-columns/

upvoted 1 times

...

Gulenur_GS

1 year, 1 month ago

The answer is E, because df.select(3*df['heartrate']).show() returns the result without error.

upvoted 2 times

chokthewa

12 months ago

3*"heartrate" repeats the string "heartrate" three times; it is not the value of the heartrate column multiplied by 3.

upvoted 1 times

...
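
The string repetition described in this reply is plain Python behavior and can be seen without Spark. A minimal sketch (the variable names are only for illustration):

```python
# Plain Python, no Spark needed: multiplying an int by a str repeats the string.
column_name = "heartrate"
repeated = 3 * column_name

print(repeated)                 # heartrateheartrateheartrate
print(repeated == column_name)  # False: this is a different name
```

Because the multiplication happens before `select` is called, Spark only ever sees the name `heartrateheartrateheartrate`, which matches the name quoted in option B.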

Gulenur

1 year, 1 month ago

The answer is E. df.select(3*df['heartrate']) returns the correct result without error.

upvoted 2 times

...
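
One way to see why `3*df['heartrate']` behaves differently from `3*"heartrate"` is Python's operator dispatch. The sketch below is an illustration only: `ColumnSketch` is a hypothetical stand-in written for this example and is not the PySpark `Column` class. It assumes only that a column object defines reflected multiplication, which lets `3 * column` build an expression instead of repeating text.

```python
class ColumnSketch:
    """Hypothetical stand-in for a column expression; NOT the PySpark Column class."""

    def __init__(self, expr: str):
        self.expr = expr

    def __rmul__(self, other):
        # Python calls this for `other * self` when `other` (here an int)
        # does not know how to multiply by a ColumnSketch.
        return ColumnSketch(f"({other} * {self.expr})")


as_string = 3 * "heartrate"                # str operand: string repetition
as_column = 3 * ColumnSketch("heartrate")  # column-like operand: builds an expression

print(as_string)       # heartrateheartrateheartrate
print(as_column.expr)  # (3 * heartrate)
```

The same `3 * x` spelling gives two different results depending on the type of `x`, which is why passing a column object works while passing the bare string produces an unresolvable column name.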

npc0001

1 year, 2 months ago

Selected Answer: B

Answer B

upvoted 2 times

...

Dileepvikram

1 year, 2 months ago

Answer is B

upvoted 2 times

...

sturcu

1 year, 3 months ago

Selected Answer: B

No such column found

upvoted 2 times

...
