Exam DP-203 topic 1 question 47 discussion

Actual exam question from Microsoft's DP-203
Question #: 47
Topic #: 1

DRAG DROP -
You use PySpark in Azure Databricks to parse the following JSON input.

You need to output the data in the following tabular format.

How should you complete the PySpark code? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:

Suggested Answer:
Box 1: select -

Box 2: explode -

Box 3: alias -
pyspark.sql.Column.alias returns this column aliased with a new name or names (in the case of expressions that return more than one column, such as explode).
Reference:
https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.sql.Column.alias.html
https://docs.microsoft.com/en-us/azure/databricks/sql/language-manual/functions/explode
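
To see what the select / explode / alias chain computes, here is a rough plain-Python model of it. This is only a sketch: it does not use PySpark, the helper `explode_rows` is hypothetical, and the sample JSON (field names `persons`, `name`, `age`, `dogs`, and all values) is an assumption, since the original input is not reproduced in this thread.

```python
import json

# Assumed sample input; the question's actual JSON is not shown in the thread.
SOURCE_JSON = """
{
  "persons": [
    {"name": "Keith", "age": 30, "dogs": ["Fido", "Fluffy"]},
    {"name": "Donna", "age": 46, "dogs": ["Spot"]}
  ]
}
"""


def explode_rows(rows, array_key, alias):
    """Hypothetical helper modelling explode(array_key).alias(alias).

    Emits one output row per array element and copies the other fields.
    A row whose array is empty yields no output rows.
    """
    out = []
    for row in rows:
        rest = {k: v for k, v in row.items() if k != array_key}
        for element in row[array_key]:
            out.append({**rest, alias: element})
    return out


document = json.loads(SOURCE_JSON)

# Step 1: one row per person, each held in a column named "persons".
persons = explode_rows([document], "persons", "persons")

# Step 2: select name (as "owner") and age, then emit one row per dog.
selected = [
    {
        "owner": p["persons"]["name"],
        "age": p["persons"]["age"],
        "dogs": p["persons"]["dogs"],
    }
    for p in persons
]
flat = explode_rows(selected, "dogs", "dog")

for row in flat:
    print(row)
```

Each person with several dogs ends up on several rows, with the owner and age repeated. The `alias` argument in the model is what names the resulting column.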

Comments

kb8bo
Highly Voted 1 year, 10 months ago
The final line with the blank looks incorrect... surely it should be: explode("persons.dogs").alias("dog")) (Assuming this, the answer is correct, otherwise I don't think it makes any sense).
upvoted 24 times
...
urassi
Highly Voted 1 year, 1 month ago
ah "persons".alias("persons") what a fun and useful and nice alias
upvoted 18 times
hypersam
3 months, 2 weeks ago
Actually, if you don't use alias after explode("persons"), the column name would be "col", so it's mandatory here.
upvoted 2 times
...
...
kkk5566
Most Recent 7 months, 2 weeks ago
syntax is correct
upvoted 2 times
...
esaade
1 year, 1 month ago
from pyspark.sql.functions import col, explode

dbutils.fs.put("/tmp/source.json", source_json, True)
source_df = spark.read.option("multiline", "true").json("/tmp/source.json")
persons = source_df.select(explode("persons").alias("persons"))
persons_dogs = persons.select(
    col("persons.name").alias("owner"),
    col("persons.age").alias("age"),
    explode(col("persons.dog")).alias("dog_name"))
persons_dogs.display()
upvoted 12 times
...
Deeksha1234
1 year, 8 months ago
Correct
upvoted 4 times
...
Dicer
1 year, 9 months ago
Correct, but the last .alias("dog") is quite unnecessary because the column name is already 'dog'. I guess that is a safety measure.
upvoted 5 times
Anton2020
1 year, 1 month ago
The column name in the json is dogs, not dog
upvoted 3 times
...
...
galacaw
1 year, 11 months ago
Correct
upvoted 4 times
...