Which of the following code blocks returns a DataFrame containing only the rows from DataFrame storesDF where the value in column sqft is less than or equal to 25,000 AND the value in column customerSatisfaction is greater than or equal to 30?
A.
storesDF.filter(col("sqft") <= 25000 and col("customerSatisfaction") >= 30)
B.
storesDF.filter(col("sqft") <= 25000 or col("customerSatisfaction") >= 30)
C.
storesDF.filter(sqft) <= 25000 and customerSatisfaction >= 30)
D.
storesDF.filter(col("sqft") <= 25000 & col("customerSatisfaction") >= 30)
E.
storesDF.filter(sqft <= 25000) & customerSatisfaction >= 30)
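In PySpark, all of the options are wrong as written, because each condition inside filter must be wrapped in parentheses: & binds more tightly than <= and >=, so without parentheses the expression is not parsed as two separate comparisons. D is the closest and should read: storesDF.filter((col("sqft") <= 25000) & (col("customerSatisfaction") >= 30))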
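To see why the parentheses matter, here is a small sketch that uses only Python's standard ast module (no Spark needed) to show how the two spellings are parsed. The helper name top_level_shape and the bare names sqft and satisfaction are made up for illustration; they stand in for the col(...) calls.

```python
import ast


def top_level_shape(expr: str) -> str:
    """Describe the outermost node Python builds when parsing `expr`."""
    node = ast.parse(expr, mode="eval").body
    if isinstance(node, ast.Compare):
        ops = ", ".join(type(op).__name__ for op in node.ops)
        return f"Compare({ops})"
    if isinstance(node, ast.BinOp):
        return f"BinOp({type(node.op).__name__})"
    return type(node).__name__


# Without parentheses, & is applied first (25000 & satisfaction), and the
# rest becomes one chained comparison: sqft <= (...) >= 30.
unparenthesized = top_level_shape("sqft <= 25000 & satisfaction >= 30")

# With parentheses, the outermost operation is the & of two comparisons.
parenthesized = top_level_shape("(sqft <= 25000) & (satisfaction >= 30)")

print(unparenthesized)  # Compare(LtE, GtE)
print(parenthesized)    # BinOp(BitAnd)
```

The unparenthesized form is a chained comparison around 25000 & satisfaction, which is not the intended filter, so option D as printed should not be expected to work.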
It's D:
https://sparkbyexamples.com/spark/spark-and-or-not-operators/
PySpark logical operations on columns use the bitwise operators:
& for and
| for or
~ for not
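A rough sketch of why the keywords and/or/not cannot be used here: Python lets a class overload &, | and ~, but the keywords always call bool() on their operands. FakeColumn below is a hypothetical stand-in written for this example, not a PySpark class; it only records the expression as a string. As far as I know, a real PySpark Column similarly raises an error when it is truth-tested.

```python
class FakeColumn:
    """Hypothetical stand-in for a column expression; it only records text."""

    def __init__(self, expr: str):
        self.expr = expr

    def __le__(self, other):
        return FakeColumn(f"({self.expr} <= {other})")

    def __ge__(self, other):
        return FakeColumn(f"({self.expr} >= {other})")

    def __and__(self, other):  # invoked by &
        return FakeColumn(f"({self.expr} AND {other.expr})")

    def __or__(self, other):  # invoked by |
        return FakeColumn(f"({self.expr} OR {other.expr})")

    def __invert__(self):  # invoked by ~
        return FakeColumn(f"(NOT {self.expr})")

    def __bool__(self):
        # `and`, `or` and `not` truth-test their operands, so they end up here.
        raise ValueError("cannot convert column into bool; use &, | or ~")


sqft = FakeColumn("sqft")
satisfaction = FakeColumn("customerSatisfaction")

both = (sqft <= 25000) & (satisfaction >= 30)
either = (sqft <= 25000) | (satisfaction >= 30)
negated = ~(sqft <= 25000)

try:
    (sqft <= 25000) and (satisfaction >= 30)
    keyword_and_failed = False
except ValueError:
    keyword_and_failed = True

print(both.expr)           # ((sqft <= 25000) AND (customerSatisfaction >= 30))
print(keyword_and_failed)  # True
```

This is the same reason options A and B fail: the and/or keywords never get a chance to build a combined column expression.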