The code block shown below contains an error. The code block is intended to create a single-column DataFrame from the Scala List years, which is made up of integers. Identify the error.
Code block:
spark.createDataset(years)
A.
The years list should be wrapped in another list like List(years) to make clear that it is a column rather than a row.
B.
The data type is not specified – the second argument to createDataset should be IntegerType.
C.
There is no operation createDataset – the createDataFrame operation should be used instead.
D.
The result of the above is a Dataset rather than a DataFrame – the toDF operation must be called at the end.
E.
The column name must be specified as the second argument to createDataset.
D is the correct answer because the code creates a Dataset, not a DataFrame, and needs .toDF() to convert it.
B is incorrect because the type is inferred automatically by Spark and you do not need to specify IntegerType explicitly.
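For anyone who wants to check this locally, here is a minimal sketch of the behaviour described above. It assumes a local SparkSession; the object name YearsToDataFrame, the helper yearsToDF, and the sample values are made up for illustration and are not part of the question.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

object YearsToDataFrame {
  // Hypothetical helper: turns a List[Int] into a single-column DataFrame.
  def yearsToDF(spark: SparkSession, years: List[Int]): DataFrame = {
    import spark.implicits._ // brings the implicit Encoder[Int] into scope

    // createDataset returns a typed Dataset[Int]; no IntegerType argument is needed.
    val ds: Dataset[Int] = spark.createDataset(years)

    // toDF converts the typed Dataset into a DataFrame (an alias for Dataset[Row]).
    ds.toDF()
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; the question assumes `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("years-sketch")
      .getOrCreate()

    val years = List(2019, 2020, 2021) // assumed sample data
    val df = yearsToDF(spark, years)
    df.printSchema() // single integer column with Spark's default name for primitives
    spark.stop()
  }
}
```

With no name passed to toDF, the column gets Spark's default name for a Dataset of primitives, which is value.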
It should be D.
In Scala, createDataset returns a Dataset, so toDF has to be called on the result to get a DataFrame.
Doc: https://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/Dataset.html
From the official Databricks practice tests (where the answer is A):
Question 44
Which of the following code blocks creates a single-column DataFrame from Scala List years which is made up of integers?
A. spark.createDataset(years).toDF
B. spark.createDataFrame(years, IntegerType)
C. spark.createDataset(years)
D. spark.DataFrame(years, IntegerType)
E. spark.createDataFrame(years)
C. There is no operation createDataset – the createDataFrame operation should be used instead.
The correct method to create a DataFrame in Spark using Scala is createDataFrame, not createDataset. The correct syntax would be:
val df = spark.createDataFrame(years.map(Tuple1.apply)).toDF("columnName")
This assumes that years is a List of integers, and the resulting DataFrame will have a single column named "columnName".
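As a rough comparison, the sketch below puts the tuple-wrapping approach next to createDataset followed by toDF. It assumes a local SparkSession; the object name SingleColumnSketch, both helper names, and the column name "year" are hypothetical. Note that createDataFrame on a Seq expects elements that are Products (tuples or case classes), which is why the integers are wrapped in Tuple1 first.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SingleColumnSketch {
  // Tuple-wrapping route: each Int becomes a Tuple1[Int], which is a Product.
  def viaCreateDataFrame(spark: SparkSession, years: List[Int], name: String): DataFrame =
    spark.createDataFrame(years.map(Tuple1(_))).toDF(name)

  // Dataset route: build a Dataset[Int], then convert and name the column.
  def viaCreateDataset(spark: SparkSession, years: List[Int], name: String): DataFrame = {
    import spark.implicits._ // implicit Encoder[Int]
    spark.createDataset(years).toDF(name)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("single-column-sketch")
      .getOrCreate()

    val years = List(2019, 2020, 2021) // assumed sample data
    viaCreateDataFrame(spark, years, "year").printSchema()
    viaCreateDataset(spark, years, "year").printSchema()
    spark.stop()
  }
}
```

Both helpers return a DataFrame with one integer column carrying the requested name.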