Exam DP-203 topic 2 question 2 discussion

Actual exam question from Microsoft's DP-203
Question #: 2
Topic #: 2

A company has a real-time data analysis solution that is hosted on Microsoft Azure. The solution uses Azure Event Hub to ingest data and an Azure Stream
Analytics cloud job to analyze the data. The cloud job is configured to use 120 Streaming Units (SU).
You need to optimize performance for the Azure Stream Analytics job.
Which two actions should you perform? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.

  • A. Implement event ordering.
  • B. Implement Azure Stream Analytics user-defined functions (UDF).
  • C. Implement query parallelization by partitioning the data output.
  • D. Scale the SU count for the job up.
  • E. Scale the SU count for the job down.
  • F. Implement query parallelization by partitioning the data input.
Suggested Answer: CF

Comments

manquak
Highly Voted 3 years, 7 months ago
Partition input and output. REF: https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-parallelization
upvoted 70 times
kolakone
3 years, 7 months ago
Agree. Partitioning input and output with the same number of partitions gives the best performance.
upvoted 14 times
...
...
Lio95
Highly Voted 3 years, 7 months ago
No event consumer was mentioned. Therefore, partitioning output is not relevant. Answer is correct
upvoted 15 times
Boompiee
2 years, 11 months ago
The stream analytics job is the consumer.
upvoted 2 times
...
nicolas1999
3 years, 5 months ago
Stream analytics ALWAYS has at least one output. There is no need to mention that. So correct answer is input and output
upvoted 4 times
...
...
sakis213
Most Recent 3 months ago
Selected Answer: CD
Partitioning the output data can improve write performance or manage how results are distributed, but it doesn’t directly impact the performance of data ingestion or processing within the job itself.
upvoted 1 times
...
GiuseppeTanda
3 months, 3 weeks ago
Selected Answer: CF
As suggested in the following link: https://techcommunity.microsoft.com/blog/analyticsonazure/optimize-your-stream-analytics-job%E2%80%99s-performance-using-job-diagram-simulator/3652303 : "One way to optimize a Stream Analytics job’s performance is to leverage parallelism in query" and "For a job to be parallel, you need to align partition keys between all inputs, query steps, and outputs".
upvoted 2 times
...
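The alignment rule quoted above (same partition key across inputs, query steps, and outputs, plus the matching partition counts mentioned earlier in the thread) can be sketched as a small check. This is only an illustration: `JobTopology` and `is_fully_parallel` are hypothetical names made up for this sketch, not part of any Azure SDK, and the check ignores output types that cannot be partitioned at all.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class JobTopology:
    """Hypothetical description of a streaming job (not an Azure SDK type)."""
    input_partition_key: Optional[str]
    input_partitions: int
    step_partition_keys: List[Optional[str]]  # one entry per query step
    output_partition_key: Optional[str]
    output_partitions: int


def is_fully_parallel(job: JobTopology) -> bool:
    """True when input, every query step, and output share one partition key
    and the input and output partition counts are equal."""
    key = job.input_partition_key
    if key is None:
        return False  # an unpartitioned input cannot be processed per partition
    keys_aligned = (
        all(step_key == key for step_key in job.step_partition_keys)
        and job.output_partition_key == key
    )
    counts_aligned = job.input_partitions == job.output_partitions
    return keys_aligned and counts_aligned
```

Under these assumptions, a job with 16 input partitions and 8 output partitions fails the check even when every key matches, which is the situation options C and F together are meant to avoid.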
de_examtopics
4 months, 4 weeks ago
Selected Answer: CF
C. Implement query parallelization by partitioning the data output. By partitioning the data output, you can enable query parallelization. This allows the queries to be distributed across multiple partitions, which can significantly enhance performance. F. Implement query parallelization by partitioning the data input.
upvoted 1 times
...
Danweo
9 months, 2 weeks ago
Selected Answer: DF
Partitioning on both input and output can help, but we don't know if the output is a service that doesn't support partitioning like Power BI. Scaling up will always assign more resources at least.
upvoted 1 times
...
e56bb91
9 months, 3 weeks ago
Selected Answer: CF
According to ChatGPT 4o: C. Implement query parallelization by partitioning the data output: by partitioning the data output, you can ensure that the processing load is distributed evenly across multiple nodes, which can significantly improve performance by reducing bottlenecks in data writing. F. Implement query parallelization by partitioning the data input: partitioning the data input allows the Stream Analytics job to process different partitions in parallel, leading to better utilization of the available streaming units and improved throughput.
upvoted 2 times
...
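The idea that partitioned input lets each partition be processed independently can be shown with a toy Python sketch. It does not use Stream Analytics or Event Hubs; the event shape `(partition_id, value)`, the per-partition sum, and all function names are assumptions chosen for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Event = Tuple[int, float]  # (partition_id, value): a made-up event shape


def group_by_partition(events: List[Event]) -> Dict[int, List[float]]:
    """Split the stream into per-partition lists, as a partitioned input would."""
    grouped: Dict[int, List[float]] = defaultdict(list)
    for partition_id, value in events:
        grouped[partition_id].append(value)
    return dict(grouped)


def process_partition(values: List[float]) -> float:
    """Stand-in for a query step; here it just sums the values."""
    return sum(values)


def process_in_parallel(events: List[Event]) -> Dict[int, float]:
    """Run the step once per partition, with one worker per partition."""
    grouped = group_by_partition(events)
    with ThreadPoolExecutor(max_workers=max(1, len(grouped))) as pool:
        futures = {
            pid: pool.submit(process_partition, values)
            for pid, values in grouped.items()
        }
        return {pid: future.result() for pid, future in futures.items()}
```

Because no worker needs data from another partition, adding partitions adds independent units of work. If the results then had to be funneled into a single unpartitioned writer, that writer would become the bottleneck, which is the argument for partitioning the output as well.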
Dusica
12 months ago
C and F: the same number of partitions on input and output gives embarrassingly parallel processing.
upvoted 1 times
...
Dusica
1 year ago
It says optimize performance; it does not say that performance is bad, so adding SUs may be an unnecessary cost increase. Parallelization with an embarrassingly parallel job is correct.
upvoted 1 times
...
Bhargava12
1 year ago
Answer is D & F
upvoted 1 times
...
Elanche
1 year, 1 month ago
D. Scale the SU count for the job up: Increasing the number of Streaming Units (SUs) can improve the performance of the Stream Analytics job by providing more processing power to handle the incoming data stream. C. Implement query parallelization by partitioning the data output: Partitioning the data output can help distribute the processing load across multiple partitions, allowing for parallel execution of queries and enhancing performance.
upvoted 1 times
...
Alongi
1 year, 2 months ago
Selected Answer: CD
C and D
upvoted 1 times
...
prshntdxt7
1 year, 3 months ago
Selected Answer: CD
C. Implement query parallelization by partitioning the data output: "Partitioning lets you divide data into subsets based on a partition key. If your input (for example Event Hubs) is partitioned by a key, it's highly recommended to specify this partition key when adding input to your Stream Analytics job. Scaling a Stream Analytics job takes advantage of partitions in the input and output. A Stream Analytics job can consume and write different partitions in parallel, which increases throughput." D. Scale the SU count for the job up: "The total number of streaming units that can be used by a Stream Analytics job depends on the number of steps in the query defined for the job and the number of partitions for each step... All non-partitioned steps together can scale up to one streaming unit (SU V2s) for a Stream Analytics job. In addition, you can add 1 SU V2 for each partition in a partitioned step."
upvoted 1 times
...
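The SU V2 rule quoted in the comment above can be turned into a short calculation. This is a simplified reading of that quoted passage only: the function name `max_su_v2` and the convention of using `0` for a non-partitioned step are assumptions of this sketch, and it says nothing about the 120 SUs in the question.

```python
from typing import List


def max_su_v2(step_partitions: List[int]) -> int:
    """Upper bound on SU V2s for a job, following the quoted rule.

    step_partitions has one entry per query step: the partition count
    for a partitioned step, or 0 for a non-partitioned step.
    """
    # Each partition of a partitioned step can use 1 SU V2.
    partitioned_total = sum(count for count in step_partitions if count > 0)
    # All non-partitioned steps together share at most 1 SU V2.
    non_partitioned_total = 1 if any(count == 0 for count in step_partitions) else 0
    return partitioned_total + non_partitioned_total
```

For example, `max_su_v2([3, 0])` gives 4: three for the partitioned step plus one shared by the non-partitioned step. The calculation shows why scaling up alone has a ceiling: without partitioned steps the bound stays at 1 regardless of how many SUs are requested.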
sdg2844
1 year, 3 months ago
Selected Answer: CF
As there is no indication of any query parallelization currently, we have to choose to parallelize for both input and output as the first/correct answers.
upvoted 2 times
...
Khadija10
1 year, 3 months ago
Selected Answer: CF
Partitioning lets you divide data into subsets based on a partition key. If your input (for example Event Hubs) is partitioned by a key, it's highly recommended to specify this partition key when adding input to your Stream Analytics job. Scaling a Stream Analytics job takes advantage of partitions in the input and output. A Stream Analytics job can consume and write different partitions in parallel, which increases throughput. Ref: https://learn.microsoft.com/en-us/azure/stream-analytics/stream-analytics-parallelization
upvoted 2 times
...
jongert
1 year, 4 months ago
Selected Answer: CF
An embarrassingly parallel job allows the highest degree of parallelization. Looking at how the maximum number of streaming units is calculated, it would not be useful to scale them up if you keep a bottleneck at the output. I'm unsure what a good reference value would be for the number of SUs, but 120 does not seem very low to me.
upvoted 2 times
...
d046bc0
1 year, 4 months ago
Selected Answer: CF
Scale the SU count for the job up (ChatGPT): This will not necessarily improve the performance of your job, unless your query is CPU-bound or memory-bound. Scaling up the SU count will increase the amount of resources available for your job, but it will also increase the cost. You should first try to optimize your query by using parallelization and repartitioning techniques, and then scale up the SU count only if needed.
upvoted 1 times
...