The operation that creates a DataFrame containing a subset of the columns of storesDF, specified by name, is storesDF.select().
The select() operation lets you specify the columns you want to keep in the resulting DataFrame by passing the column names as arguments. For example, to create a new DataFrame that contains only the columns store_id and store_name from storesDF, you can use the following code:
newDF = storesDF.select("store_id", "store_name")
E. storesDF.drop() is also correct. It is simply the opposite of select(): you name the columns to remove instead of the ones to keep. If you need to keep a large number of columns and drop only a few, then drop() is easier than select().
The select() operation on a Spark DataFrame lets you specify the columns you want to include in the resulting DataFrame. You pass the column names as arguments to select() to create a new DataFrame with only the specified columns.