Which of the following code blocks returns a DataFrame where column managerName from DataFrame storesDF is split at the space character into column managerFirstName and column managerLastName?
A sample of DataFrame storesDF is displayed below:
A.
(storesDF.withColumn("managerFirstName", split(col("managerName"), " ")[0]) .withColumn("managerLastName", split(col("managerName"), " ")[1]))
B.
(storesDF.withColumn("managerFirstName", col("managerName").split(" ")[1]) .withColumn("managerLastName", col("managerName").split(" ")[2]))
C.
(storesDF.withColumn("managerFirstName", split(col("managerName"), " ")[1]) .withColumn("managerLastName", split(col("managerName"), " ")[2]))
D.
(storesDF.withColumn("managerFirstName", col("managerName").split(" ")[0]) .withColumn("managerLastName", col("managerName").split(" ")[1]))
E.
(storesDF.withColumn("managerFirstName", split("managerName"), " ")[0]) .withColumn("managerLastName", split("managerName"), " ")[1]))
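The correct answer is A. split is a function in pyspark.sql.functions that takes the column as its first argument (Column has no split method, which rules out B and D), and the array it returns is indexed from 0, not 1, which rules out C. E has misplaced parentheses. You can verify this yourself with the following code: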
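from pyspark.sql.functions import col, split

# Assumes an active SparkSession named `spark` (e.g. in a notebook or the pyspark shell).
df = spark.createDataFrame([('John Doe',)], ['Person'])
df2 = (df
    .withColumn('first_name', split(col('Person'), ' ')[0])   # index 0 -> first token
    .withColumn('last_name', split(col('Person'), ' ')[1]))   # index 1 -> second token
df2.show()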