Exam DP-203 topic 2 question 2 discussion

Actual exam question from Microsoft's DP-203

Question #: 2
Topic #: 2

A company has a real-time data analysis solution that is hosted on Microsoft Azure. The solution uses Azure Event Hub to ingest data and an Azure Stream Analytics cloud job to analyze the data. The cloud job is configured to use 120 Streaming Units (SU).
You need to optimize performance for the Azure Stream Analytics job.
Which two actions should you perform? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.

A. Implement event ordering.
B. Implement Azure Stream Analytics user-defined functions (UDF).
C. Implement query parallelization by partitioning the data output.
D. Scale the SU count for the job up.
E. Scale the SU count for the job down.
F. Implement query parallelization by partitioning the data input.

Suggested Answer: CF

by [deleted] at Aug. 27, 2021, 11:10 a.m.

Comments

manquak

Highly Voted 3 years, 7 months ago

Partition input and output. REF: https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-parallelization

upvoted 70 times

kolakone

3 years, 7 months ago

Agree. And partitioning input and output with the same number of partitions gives the best performance optimization.

upvoted 14 times

...

Lio95

Highly Voted 3 years, 7 months ago

No event consumer was mentioned. Therefore, partitioning output is not relevant. Answer is correct

upvoted 15 times

Boompiee

2 years, 11 months ago

The stream analytics job is the consumer.

upvoted 2 times

...

nicolas1999

3 years, 5 months ago

Stream analytics ALWAYS has at least one output. There is no need to mention that. So correct answer is input and output

upvoted 4 times

...

sakis213

Most Recent 3 months ago

Selected Answer: CD

Partitioning the output data can improve write performance or manage how results are distributed, but it doesn’t directly impact the performance of data ingestion or processing within the job itself.

upvoted 1 times

...

GiuseppeTanda

3 months, 3 weeks ago

Selected Answer: CF

As suggested in the following link: https://techcommunity.microsoft.com/blog/analyticsonazure/optimize-your-stream-analytics-job%E2%80%99s-performance-using-job-diagram-simulator/3652303 "One way to optimize a Stream Analytics job’s performance is to leverage parallelism in query" and "For a job to be parallel, you need to align partition keys between all inputs, query steps, and outputs"

upvoted 2 times

...
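
To illustrate the alignment condition quoted above (same partition key across inputs, query steps, and outputs, with matching input and output partition counts), here is a minimal Python sketch. The `JobLayout` type and `is_fully_parallel` function are hypothetical names made up for this example; they are not part of any Azure SDK, and the check is a simplification of the rules in the linked documentation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class JobLayout:
    """Hypothetical summary of a Stream Analytics job (not an Azure SDK type)."""
    input_partition_key: Optional[str]        # None if the input is not partitioned
    input_partitions: int
    step_partition_keys: List[Optional[str]]  # one entry per query step; None = not partitioned
    output_partition_key: Optional[str]       # None if the output is not partitioned
    output_partitions: int


def is_fully_parallel(job: JobLayout) -> bool:
    """Return True when input, every query step, and output share one partition key
    and the input and output partition counts match."""
    key = job.input_partition_key
    if key is None:
        return False
    if any(step_key != key for step_key in job.step_partition_keys):
        return False
    if job.output_partition_key != key:
        return False
    return job.input_partitions == job.output_partitions


# Assumed example values: 8 partitions on both sides, one partitioned query step.
aligned = JobLayout("PartitionId", 8, ["PartitionId"], "PartitionId", 8)
output_not_partitioned = JobLayout("PartitionId", 8, ["PartitionId"], None, 1)
print(is_fully_parallel(aligned))
print(is_fully_parallel(output_not_partitioned))
```

Under these assumptions, only the layout that partitions both the input (option F) and the output (option C) passes the check.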

de_examtopics

4 months, 4 weeks ago

Selected Answer: CF

C. Implement query parallelization by partitioning the data output. By partitioning the data output, you can enable query parallelization. This allows the queries to be distributed across multiple partitions, which can significantly enhance performance.

F. Implement query parallelization by partitioning the data input.

upvoted 1 times

...

Danweo

9 months, 2 weeks ago

Selected Answer: DF

Partitioning on both input and output can help, but we don't know if the output is a service that doesn't support partitioning like Power BI. Scaling up will always assign more resources at least.

upvoted 1 times

...

e56bb91

9 months, 3 weeks ago

Selected Answer: CF

ChatGPT 4o:

C. Implement query parallelization by partitioning the data output. Output partitioning: By partitioning the data output, you can ensure that the processing load is distributed evenly across multiple nodes, which can significantly improve performance by reducing bottlenecks in data writing.

F. Implement query parallelization by partitioning the data input. Input partitioning: Partitioning the data input allows the Stream Analytics job to process different partitions in parallel, leading to better utilization of the available streaming units and improved throughput.

upvoted 2 times

...

Dusica

12 months ago

C and F > same partitions > embarrassingly parallel processing

upvoted 1 times

...

Dusica

1 year ago

It says optimize performance; it does not say that performance is bad, so adding SUs may be an unnecessary cost increase. Parallelization with an embarrassingly parallel job is correct.

upvoted 1 times

...

Bhargava12

1 year ago

Answer is D & F

upvoted 1 times

...

Elanche

1 year, 1 month ago

D. Scale the SU count for the job up: Increasing the number of Streaming Units (SUs) can improve the performance of the Stream Analytics job by providing more processing power to handle the incoming data stream.

C. Implement query parallelization by partitioning the data output: Partitioning the data output can help distribute the processing load across multiple partitions, allowing for parallel execution of queries and enhancing performance.

upvoted 1 times

...

Alongi

1 year, 2 months ago

Selected Answer: CD

C and D

upvoted 1 times

...

prshntdxt7

1 year, 3 months ago

Selected Answer: CD

C. Implement query parallelization by partitioning the data output: "Partitioning lets you divide data into subsets based on a partition key. If your input (for example Event Hubs) is partitioned by a key, it's highly recommended to specify this partition key when adding input to your Stream Analytics job. Scaling a Stream Analytics job takes advantage of partitions in the input and output. A Stream Analytics job can consume and write different partitions in parallel, which increases throughput."

D. Scale the SU count for the job up: "The total number of streaming units that can be used by a Stream Analytics job depends on the number of steps in the query defined for the job and the number of partitions for each step... All non-partitioned steps together can scale up to one streaming unit (SU V2s) for a Stream Analytics job. In addition, you can add 1 SU V2 for each partition in a partitioned step."

upvoted 1 times

...
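
The streaming-unit rule quoted in the comment above can be turned into a small calculation. The following Python sketch assumes only that quoted rule (one SU V2 shared by all non-partitioned steps, plus one SU V2 per partition in each partitioned step); the function name `max_su_v2` is hypothetical and is not an Azure API.

```python
from typing import Iterable, Optional


def max_su_v2(step_partitions: Iterable[Optional[int]]) -> int:
    """Upper bound on SU V2s implied by the quoted rule.

    Each element is the partition count of one query step,
    or None when that step is not partitioned.
    """
    steps = list(step_partitions)
    # One SU V2 for each partition in every partitioned step.
    partitioned = sum(p for p in steps if p is not None)
    # All non-partitioned steps together share at most one SU V2.
    shared = 1 if any(p is None for p in steps) else 0
    return partitioned + shared


print(max_su_v2([None]))        # one non-partitioned step -> 1
print(max_su_v2([16]))          # one step partitioned by 16 -> 16
print(max_su_v2([None, None]))  # two non-partitioned steps -> 1
print(max_su_v2([3, None]))     # one step partitioned by 3, one not -> 4
```

Read this way, a job without partitioned steps cannot make use of additional streaming units, which is the argument several commenters give for choosing partitioning (C and F) before scaling up (D).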

sdg2844

1 year, 3 months ago

Selected Answer: CF

As there is no indication of any query parallelization currently, we have to choose to parallelize for both input and output as the first/correct answers.

upvoted 2 times

...

Khadija10

1 year, 3 months ago

Selected Answer: CF

Partitioning lets you divide data into subsets based on a partition key. If your input (for example Event Hubs) is partitioned by a key, it's highly recommended to specify this partition key when adding input to your Stream Analytics job. Scaling a Stream Analytics job takes advantage of partitions in the input and output. A Stream Analytics job can consume and write different partitions in parallel, which increases throughput. Ref: https://learn.microsoft.com/en-us/azure/stream-analytics/stream-analytics-parallelization

upvoted 2 times

...

jongert

1 year, 4 months ago

Selected Answer: CF

An embarrassingly parallel job allows the highest degree of parallelization. Looking at how the max number of stream units is calculated, it would not be useful to scale them up if you keep a bottleneck at the output. Unsure what a good reference value would be for the number of SUs, but 120 does not seem very low to me.

upvoted 2 times

...

d046bc0

1 year, 4 months ago

Selected Answer: CF

Scale the SU count for the job up - (ChatGPT) This will not necessarily improve the performance of your job, unless your query is CPU-bound or memory-bound. Scaling up the SU count will increase the amount of resources available for your job, but it will also increase the cost. You should first try to optimize your query by using parallelization and repartitioning techniques, and then scale up the SU count only if needed.

upvoted 1 times

...
