Which of the following code blocks returns a DataFrame containing only the rows from DataFrame storesDF where the value in column sqft is less than or equal to 25,000 AND the value in column customerSatisfaction is greater than or equal to 30?
A.
storesDF.filter(col("sqft") <= 25000 and col("customerSatisfaction") >= 30)
B.
storesDF.filter(col("sqft") <= 25000 or col("customerSatisfaction") >= 30)
C.
storesDF.filter(sqft) <= 25000 and customerSatisfaction >= 30)
D.
storesDF.filter(col("sqft") <= 25000 & col("customerSatisfaction") >= 30)
E.
storesDF.filter(sqft <= 25000) & customerSatisfaction >= 30)
In PySpark, all of these are wrong as written: each condition inside filter() should be wrapped in its own parentheses. It should be D, written as: storesDF.filter((col("sqft") <= 25000) & (col("customerSatisfaction") >= 30))
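For anyone who wants to verify this locally, here is a minimal sketch. The SparkSession setup and the sample rows are my own assumptions; only the column names come from the question:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.master("local[*]").getOrCreate()

# Hypothetical sample data; only the column names are taken from the question.
storesDF = spark.createDataFrame(
    [(20000, 45), (30000, 50), (24000, 10)],
    ["sqft", "customerSatisfaction"],
)

# Each condition needs its own parentheses because & binds more tightly
# than <= and >= in Python.
storesDF.filter((col("sqft") <= 25000) & (col("customerSatisfaction") >= 30)).show()
# Only the (20000, 45) row passes both conditions.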
It's D:
https://sparkbyexamples.com/spark/spark-and-or-not-operators/
PySpark logical operations use the bitwise operators (see the sketch after this list):
& for and
| for or
~ for not
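A quick illustrative sketch of how those three operators look in a filter, reusing the storesDF from the question (the variable names are just for illustration):

from pyspark.sql.functions import col

both    = storesDF.filter((col("sqft") <= 25000) & (col("customerSatisfaction") >= 30))  # AND
either  = storesDF.filter((col("sqft") <= 25000) | (col("customerSatisfaction") >= 30))  # OR
negated = storesDF.filter(~(col("sqft") <= 25000))                                       # NOT, i.e. sqft > 25000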
No, in PySpark you do not use and, you use &.
D is correct