Exam DP-200 topic 2 question 15 discussion

Actual exam question from Microsoft's DP-200
Question #: 15
Topic #: 2

Each day, a company plans to store hundreds of files in Azure Blob Storage and Azure Data Lake Storage. The company uses the Parquet format.
You must develop a pipeline that meets the following requirements:
✑ Process data every six hours
✑ Offer interactive data analysis capabilities
✑ Offer the ability to process data using solid-state drive (SSD) caching
✑ Use Directed Acyclic Graph (DAG) processing mechanisms
✑ Provide support for REST API calls to monitor processes
✑ Provide native support for Python
✑ Integrate with Microsoft Power BI
You need to select the appropriate data technology to implement the pipeline.
Which data technology should you implement?

  • A. Azure SQL Data Warehouse
  • B. HDInsight Apache Storm cluster
  • C. Azure Stream Analytics
  • D. HDInsight Apache Hadoop cluster using MapReduce
  • E. HDInsight Spark cluster
Suggested Answer: B
Storm runs topologies instead of the Apache Hadoop MapReduce jobs that you might be familiar with. Storm topologies are composed of multiple components that are arranged in a directed acyclic graph (DAG). Data flows between the components in the graph. Each component consumes one or more data streams, and can optionally emit one or more streams.
Python can be used to develop Storm components.
References:
https://docs.microsoft.com/en-us/azure/hdinsight/storm/apache-storm-overview
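To make the DAG idea in the explanation above more concrete, here is a minimal sketch in plain Python. It is not Storm's or Spark's API: it only uses the standard library's `graphlib` module (Python 3.9+), and the component names (`reader`, `parser`, `counter`, `alerter`, `writer`) are hypothetical, chosen for illustration.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical components. Each key maps to the set of components
# whose output it consumes (its upstream dependencies).
topology = {
    "reader": set(),
    "parser": {"reader"},
    "counter": {"parser"},
    "alerter": {"parser"},
    "writer": {"counter", "alerter"},
}


def execution_order(graph):
    """Return one valid order in which the components can run."""
    return list(TopologicalSorter(graph).static_order())


def is_dag(graph):
    """Return True if the graph has no cycles, i.e. it is a DAG."""
    try:
        execution_order(graph)
        return True
    except CycleError:
        return False


order = execution_order(topology)
print(order)
print(is_dag({"a": {"b"}, "b": {"a"}}))  # a two-node cycle is not a DAG
```

Because the graph is acyclic, every component can be scheduled after everything it depends on. Both Storm topologies and Spark jobs rely on this property, so the DAG requirement alone does not separate options B and E.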

Comments

c1265
Highly Voted 5 years, 1 month ago
This really looks like Spark: https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-overview
upvoted 30 times
...
wyxh
Highly Voted 5 years ago
Spark clusters in HDInsight provide connectors for BI tools such as Power BI for data analytics: https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-overview. Storm processes streams of data in real time, whereas the question states that the data must be processed every six hours.
upvoted 21 times
...
GData23
Most Recent 3 years, 11 months ago
hadoop is dead. who cares
upvoted 4 times
...
davita8
3 years, 12 months ago
The answer is "E".
upvoted 1 times
...
rjile
4 years, 1 month ago
The answer is "E". Spark clusters in HDInsight provide connectors for BI tools such as Power BI for data analytics.
upvoted 3 times
...
vidray
4 years, 2 months ago
https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-overview The answer is HDInsight Spark cluster.
upvoted 2 times
...
Shiva2
4 years, 3 months ago
I think it must be Spark too. I don't think there is a direct connection from Power BI to Storm. You would need to process the data in Storm, pass the processed data to SQL Server, and then visualize it. SQL Server is not mentioned here, so the answer must be E, Spark. https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-use-bi-tools
upvoted 2 times
...
Prada
4 years, 3 months ago
It must be E, HDInsight Spark cluster, because of the requirement to "offer the ability to process data using solid-state drive (SSD) caching": https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-improve-performance-iocache
upvoted 1 times
...
syu31svc
4 years, 5 months ago
I would say E, as per https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-overview. Read the parts on caching on SSDs, "The SparkContext connects to the Spark master and is responsible for converting an application to a directed graph (DAG) of individual tasks", REST APIs, and integration with BI tools. Python is mentioned as well.
upvoted 4 times
dumpsm42
4 years, 4 months ago
Hi to all, it's E, mostly because of Power BI. The question says exactly "integrate with Power BI", and the Microsoft documentation for HDInsight Spark clusters says: "...Integration with BI Tools: Spark clusters in HDInsight provide connectors for BI tools such as Power BI for data analytics....". So for me it's E. Regards
upvoted 1 times
...
...
Sagja
4 years, 6 months ago
The answer is correct. Check this out: https://docs.microsoft.com/en-us/azure/hdinsight/storm/apache-storm-overview. Python can also be used to develop Storm components. "Create solutions in multiple languages: You can write Storm components in the language of your choice, such as Java, C#, and Python."
upvoted 1 times
...
M0e
4 years, 6 months ago
Storm does not provide Python (https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/stream-processing) and is not used for 6hr batch processing (https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing). Spark supports all of the options. Hence, E is the correct answer.
upvoted 2 times
...
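The elimination argument made in this thread can be written down as a small set comparison. This is only a sketch of the commenters' reasoning, not an authoritative capability matrix: the requirement labels are hypothetical identifiers, and the support sets record what the comments here claim, not independently verified facts. Storm's Python support is disputed in the thread, so it is left out of Storm's set.

```python
# Requirements from the question, as hypothetical short labels.
REQUIREMENTS = {
    "batch_every_six_hours",
    "interactive_analysis",
    "ssd_caching",
    "dag_processing",
    "rest_monitoring",
    "native_python",
    "power_bi_integration",
}

# What commenters in this thread claim each option supports.
# Assumption: Storm's Python support is disputed, so it is omitted here.
CLAIMED_SUPPORT = {
    "HDInsight Spark cluster": set(REQUIREMENTS),
    "HDInsight Apache Storm cluster": {"dag_processing"},
}


def missing(technology):
    """Requirements the thread does not credit this technology with."""
    return sorted(REQUIREMENTS - CLAIMED_SUPPORT[technology])


def candidates():
    """Technologies that cover every requirement, per the thread's claims."""
    return [tech for tech in CLAIMED_SUPPORT if not missing(tech)]


print(candidates())
print(missing("HDInsight Apache Storm cluster"))
```

Even if Python were added to Storm's set, the six-hour batch schedule and the Power BI integration would still be missing under these claims, which is why most commenters settle on E.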
EYIT
4 years, 7 months ago
E. HDInsight Spark cluster https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/stream-processing
upvoted 2 times
...
Arsa
4 years, 8 months ago
It should be Spark because of DAG processing and Power BI integration.
upvoted 3 times
...
rmk4ever
4 years, 9 months ago
The answer is HDInsight Spark cluster: caching on SSDs; integration with BI tools; the SparkContext is responsible for converting an application to a directed graph (DAG); and a REST API-based Spark job server to remotely submit and monitor jobs. Full reference: https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-overview
upvoted 7 times
...
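As a rough illustration of the "REST API to monitor jobs" point, the sketch below summarizes a list-batches response in the shape used by Apache Livy, the REST job server that fronts Spark on HDInsight. Treat the details as assumptions: the URL pattern, the `sessions` and `state` field names, and the sample payload are written by hand for illustration and were not captured from a real cluster. No network call is made.

```python
import json
from collections import Counter


def livy_batches_url(cluster_name):
    """Build the assumed Livy batches endpoint for an HDInsight cluster."""
    return f"https://{cluster_name}.azurehdinsight.net/livy/batches"


def summarize_batches(payload):
    """Count batch jobs by state in a Livy-style list-batches response."""
    sessions = payload.get("sessions", [])
    return Counter(s.get("state", "unknown") for s in sessions)


# Hand-written sample payload (hypothetical, not real cluster output).
sample = json.loads("""
{
  "from": 0,
  "total": 4,
  "sessions": [
    {"id": 0, "state": "success"},
    {"id": 1, "state": "running"},
    {"id": 2, "state": "dead"},
    {"id": 3, "state": "success"}
  ]
}
""")

summary = summarize_batches(sample)
print(livy_batches_url("mycluster"))  # "mycluster" is a placeholder name
print(dict(summary))
```

In practice, a scheduler could poll this endpoint with the cluster login credentials after each six-hourly run and alert on any batch whose state indicates failure.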
azrnovice
4 years, 10 months ago
The question says native Python support is needed. Storm doesn't support Python. Check out this comparison chart: https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/stream-processing
upvoted 12 times
...
Luke97
4 years, 11 months ago
I think HDInsight Apache Spark should be the correct answer: 1. it offers interactive data analysis; 2. it offers caching; 3. it offers DirectQuery (live connection) to Power BI. Spark also uses a DAG (https://docs.microsoft.com/en-au/azure/hdinsight/spark/apache-spark-streaming-exactly-once).
upvoted 8 times
...