Certified Associate Developer for Apache Spark: Topic 1, Question 29 discussion

The code block shown contains an error. The code block is intended to return a new DataFrame where column sqft from DataFrame storesDF has had its missing values replaced with the value 30,000. Identify the error.
A sample of DataFrame storesDF is displayed below:

Code block:
storesDF.na.fill(30000, col("sqft"))

  • A. The argument to the subset parameter of fill() should be a string column name or a list of string column names rather than a Column object.
  • B. The na.fill() operation does not work and should be replaced by the dropna() operation.
  • C. The argument to the subset parameter of fill() should be the numerical position of the column rather than a Column object.
  • D. The na.fill() operation does not work and should be replaced by the nafill() operation.
  • E. The na.fill() operation does not work and should be replaced by the fillna() operation.
Suggested Answer: A
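For reference, the fix is to pass the subset as a string column name (or a list of names) rather than a Column object. A minimal sketch of the corrected call, assuming sample values like those used in the discussion below (the question's original sample of storesDF is not reproduced here):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("FillSqft").getOrCreate()

# Assumed sample data; the question's sample of storesDF is not shown above.
storesDF = spark.createDataFrame(
    [(0, 43161), (1, 51200), (2, None)],
    ["storeID", "sqft"],
)

# Incorrect: storesDF.na.fill(30000, col("sqft")) fails, because the subset
# parameter does not accept a Column object.
# Correct: pass the column name as a string or a list of strings.
storesDF.na.fill(30000, ["sqft"]).show()
```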

Comments

ZSun
Highly Voted 1 year, 5 months ago
The correct answer is A. Even in the most recent version, Spark 3.4, na.fill() is still functional; it is an alias of fillna(). Mr. 4be8126, you really do just say whatever comes to mind.
upvoted 6 times
jds0
Most Recent 4 months ago
Selected Answer: A
The most correct answer seems to be A. Code below, run with Spark 3.5.1:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col
from pyspark.errors import PySparkTypeError

spark = SparkSession.builder.appName("MyApp").getOrCreate()

data = [
    (0, 43161),
    (1, 51200),
    (2, None),
    (3, 78367),
    (4, None),
]
storesDF = spark.createDataFrame(data, ["storeID", "sqft"])
# storesDF.show()

# Passing a Column object to the subset parameter raises PySparkTypeError.
try:
    storesDF.na.fill(30000, col("sqft"))
except PySparkTypeError as e:
    print(e)

# A string column name or a list of names works, via na.fill() or fillna().
storesDF.na.fill(30000, "sqft").show()
storesDF.na.fill(30000, ["sqft"]).show()
storesDF.fillna(30000, ["sqft"]).show()
storesDF.fillna(30000, "sqft").show()
```
upvoted 2 times
azure_bimonster
9 months, 3 weeks ago
Selected Answer: A
We don't need any replacement here; A is correct. In PySpark, both fillna() and fill() are used to replace missing or null values in a DataFrame. Functionally they perform the same, so one can choose either based on preference. They are used mainly for handling missing data in PySpark; a quick equivalence check is sketched after this comment.
upvoted 1 times
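A minimal sketch of that equivalence, assuming a toy DataFrame shaped like the one in the question (names and values are illustrative):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("AliasCheck").getOrCreate()

# Toy data (assumed): one store with a missing sqft value.
storesDF = spark.createDataFrame([(0, 43161), (1, None)], ["storeID", "sqft"])

# DataFrame.fillna() and DataFrameNaFunctions.fill() produce identical results.
a = storesDF.fillna(30000, subset=["sqft"])
b = storesDF.na.fill(30000, subset=["sqft"])
assert a.collect() == b.collect()
```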
4be8126
1 year, 7 months ago
The correct answer is either A or E, depending on the version of Spark being used. In Spark 2.x, the correct method to replace missing values is na.fill(). Option A is correct in Spark 2.x, as it correctly specifies the column to apply the fill operation to using a Column object. However, in Spark 3.x, the method has been renamed to fillna(). Therefore, in Spark 3.x, the correct answer is E, as it uses the correct method name. Both A and E accomplish the same task of replacing missing values in the sqft column with 30,000, so either answer can be considered correct depending on the version of Spark being used.
upvoted 2 times
peekaboo15
1 year, 7 months ago
Selected Answer: A
The answer should be A. See this link for reference: https://sparkbyexamples.com/pyspark/pyspark-fillna-fill-replace-null-values/
upvoted 2 times
TC007
1 year, 7 months ago
Selected Answer: E
The error in the code block is that the method na.fill() should be replaced by fillna() to fill the missing values in the column "sqft" with the value 30,000.
upvoted 1 times
Community vote distribution: A (35%), C (25%), B (20%), Other