Exam Certified Machine Learning Associate topic 1 question 2 discussion

Actual exam question from Databricks's Certified Machine Learning Associate

Question #: 2
Topic #: 1


A data scientist has a Spark DataFrame spark_df. They want to create a new Spark DataFrame that contains only the rows from spark_df where the value in column price is greater than 0.
Which of the following code blocks will accomplish this task?

A. spark_df[spark_df["price"] > 0]
B. spark_df.filter(col("price") > 0)
C. SELECT * FROM spark_df WHERE price > 0
D. spark_df.loc[spark_df["price"] > 0,:]
E. spark_df.loc[:,spark_df["price"] > 0]


Suggested Answer: B

by [deleted] at May 31, 2024, 8:24 p.m.

Comments


Deuterium44

5 months, 2 weeks ago

Selected Answer: B

B. The given answer is correct.

upvoted 1 time


Shubhamdh1

7 months, 3 weeks ago

Selected Answer: B

spark_df.filter(col("price") > 0) is the correct answer.

upvoted 3 times


Spark_Knight

10 months, 1 week ago

B is correct.

upvoted 3 times


[Removed]

10 months, 3 weeks ago

Selected Answer: A

Both A and B are valid ways to filter a Spark DataFrame. You could argue that A is slightly "more" correct, since option B requires importing col from pyspark.sql.functions.

upvoted 2 times

