Exam Certified Associate Developer for Apache Spark topic 1 question 43 discussion

The code block shown below contains an error. The code block is intended to use SQL to return a new DataFrame containing column storeId and column managerName from a table created from DataFrame storesDF. Identify the error.
Code block:
storesDF.createOrReplaceTempView("stores")
storesDF.sql("SELECT storeId, managerName FROM stores")

  • A. The createOrReplaceTempView() operation does not make a DataFrame accessible via SQL.
  • B. The sql() operation should be accessed via the spark variable rather than DataFrame storesDF.
  • C. There is no sql() operation in DataFrame storesDF. The operation query() should be used instead.
  • D. This cannot be accomplished using SQL – the DataFrame API should be used instead.
  • E. The createOrReplaceTempView() operation should be accessed via the spark variable rather than DataFrame storesDF.
Suggested Answer: B

Comments

jds0
2 months, 3 weeks ago
Selected Answer: B
B is correct. `storesDF` has no attribute or method `sql`. Test code below:
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("MyApp").getOrCreate()

data = [
    (0, 3, "John"),
    (1, 1, "Jane"),
    (2, 2, "Jack"),
]
storesDF = spark.createDataFrame(data, ["storeID", "customerSatisfaction", "managerName"])
storesDF.createOrReplaceTempView("stores")

try:
    storesDF.sql("SELECT storeId, managerName FROM stores")
except AttributeError as e:
    print(e)
finally:
    spark.sql("SELECT storeId, managerName FROM stores").show()
upvoted 1 times
juliom6
11 months, 2 weeks ago
Selected Answer: B
B is correct:
storesDF = spark.createDataFrame([('1', 'juan'), ('2', 'perez')], ['storeId', 'managerName'])
storesDF.createOrReplaceTempView("stores")
spark.sql("SELECT storeId, managerName FROM stores").show()
upvoted 2 times
4be8126
1 year, 5 months ago
Selected Answer: B
Option B is correct because sql() is not a method of a DataFrame object; it is a method of the SparkSession object spark. The correct way to execute a SQL statement with Spark SQL is therefore to call sql() on the SparkSession:
spark.sql("SELECT storeId, managerName FROM stores")
In the code block provided in the question, sql() is called on a DataFrame object, which raises an AttributeError instead of running the query. Option B correctly identifies this error.
upvoted 2 times