Exam DP-203 All Questions

View all questions & answers for the DP-203 exam

Exam DP-203 topic 2 question 90 discussion

Actual exam question from Microsoft's DP-203

Question #: 90
Topic #: 2

You have an Azure subscription that contains an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 receives new data once every 24 hours.
You have the following function.

You have the following query.

The query is executed once every 15 minutes and the @parameter value is set to the current date.
You need to minimize the time it takes for the query to return results.
Which two actions should you perform? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.

A. Create an index on the avg_f column.
B. Convert the avg_c column into a calculated column.
C. Create an index on the sensorid column.
D. Enable result set caching.
E. Change the table distribution to replicate.

Show Suggested Answer

Suggested Answer: BD 🗳️

by anks84 at Sept. 7, 2022, 1:21 a.m.

Comments

Submit Cancel

esaade

Highly Voted 2 years, 4 months ago

Selected Answer: BD

B. Convert the avg_c column into a calculated column. D. Enable result set caching. Explanation: A calculated column is a column that uses an expression to calculate its value based on other columns in the same table. In this case, the udfFtoC function can be used to calculate the avg_c value based on the avg_temperature column, eliminating the need to call the UDF in the SELECT statement. Enabling result set caching can improve query performance by caching the result set of the query, so subsequent queries that use the same parameters can be retrieved from the cache instead of executing the query again. Creating an index on the avg_f column or the sensorid column is not useful because there are no join or filter conditions on these columns in the WHERE clause. Changing the table distribution to replicate is also not necessary because it does not affect the query performance in this scenario

upvoted 15 times

...

f2a9aa5

Most Recent 11 months, 3 weeks ago

No, Azure Synapse Analytics dedicated SQL pool does not support calculated (or computed) columns1. This means you cannot define columns in your table that automatically compute their values based on other columns in the same table. If you need to include calculated values, you can handle the calculations in your ETL (Extract, Transform, Load) process before loading the data into the dedicated SQL pool. Alternatively, you can perform the calculations in your queries when retrieving data from the table. https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-overview#unsupported-table-features

upvoted 2 times

...

iceberge

11 months, 4 weeks ago

Copilot: (DC) Converting the avg_c column into a calculated column (Option B) might not be as effective for improving query performance in this scenario. Calculated columns can be useful for simplifying queries and ensuring consistent calculations, but they don’t necessarily improve performance, especially if the calculation is complex or if the column is frequently queried. In contrast, enabling result set caching (Option D) and creating an index on the sensorid column (Option C) directly target performance improvements by reducing query execution time and speeding up data retrieval

upvoted 1 times

...

AccountHatz

1 year, 4 months ago

Selected Answer: BE

I don't think D is a solution , "What's not cached .... Queries using user defined functions" from https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/performance-tuning-result-set-caching

upvoted 2 times

SajadAhm

11 months, 1 week ago

notice that the sink table in dedicated SQL pool gets updated every 24 hours, but this query(due to some reasons) runs every 15 minutes; as a result, for 24 * 4 - 1 times, there is not new result to show. so, if the data is already in cache, db engine can access them faster.

upvoted 1 times

...

dakku987

1 year, 6 months ago

Selected Answer: AC

chatgpt To minimize the time it takes for the query to return results, you should consider the following actions: A. Create an index on the avg_f column: Creating an index on the avg_f column can improve the query performance, especially if there are frequent searches or filtering based on this column. C. Create an index on the sensorid column: If the sensorid column is frequently used in filtering or joins, creating an index on this column can improve the query performance.

upvoted 2 times

...

OldSchool

1 year, 9 months ago

Selected Answer: DE

First the wording of the question is ridiculous. "Query is executed once every 15 minutes". So what is it, "Once" or "every 15 minutes"? Either way, they are asking what to do to speed up the query. D Setting result caching E Replicated distribution

upvoted 1 times

...

kkk5566

1 year, 10 months ago

Selected Answer: DE

correct

upvoted 1 times

...

kkk5566

1 year, 10 months ago

Selected Answer: BD

correct

upvoted 1 times

kkk5566

1 year, 10 months ago

D & E should be correct

upvoted 1 times

...

Matt2000

1 year, 11 months ago

Calculated columns exist in Power BI, not dedicated SQL pools. Computed columns are not supported in dedicated SQL pools. Ref: https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-overview

upvoted 4 times

...

dumbled

2 years, 2 months ago

Selected Answer: BD

correct

upvoted 2 times

...

Lestrang

2 years, 6 months ago

Selected Answer: AB

With that point by erhard being made (caching does work with queries using UDF), the most commonly voted D is wrong, so B and what now? Replicated cannot be right because it received date everyday and has aggregations so not a dim table and we have no clue about its size. by elimination that leaves us A and C Indexing is less useful with no joins but it does improve some performance being on where clause target. so I'd go with A and B.

upvoted 1 times

Lestrang

2 years, 6 months ago

Creating an index on the avg_f column will improve the performance of the query, as it will allow the query to find the relevant data more quickly. Converting the avg_c column into a calculated column will allow the query to return the temperature in Celsius without the need to perform the calculation at runtime, which will also improve the performance of the query.

upvoted 1 times

Lestrang

2 years, 5 months ago

After re-considering, I am unsure whether the indexing would help. That would only leave Replication as the viable option even though it is not viable design but the request is to minimize query time and that is what it will do, so I guess final answer is BE

upvoted 1 times

...

AccountHatz

1 year, 4 months ago

"2 GB is not a hard limit. If the data is static and does not change, you can replicate larger tables." https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/design-guidance-for-replicated-tables

upvoted 1 times

...

Karforcerts

2 years, 7 months ago

Selected Answer: BD

need to first chage UDF to a calculated column and then enable result set caching. agreed with the answer

upvoted 4 times

...

erhard

2 years, 8 months ago

Queries using user defined functions are not cached. https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/performance-tuning-result-set-caching

upvoted 4 times

...

kl8585

2 years, 8 months ago

Selected Answer: DE

A,C not right since index don't help if join are not involved. D for sure help query performance. I don't get why B: "A computed column is a virtual column whose value is calculated from other values in the table. By default, the expression’s outputted value is not physically stored. Instead, SQL Server runs the expression when the column is queried and returns the value as part of the result set ... In many cases, non-persistent computed columns put too much burden on the processor, resulting in SLOWER QUERIES and unresponsive applications" Since the only requirements is faster execution times for queries, i don't think calculated columns will improve performance. Si second option for me would be D (replicate). Although it will cause more effort writing, because updates should be written to every partition, optimized writes aren't a requirement in the question.

upvoted 3 times

...

rzeng

2 years, 8 months ago

pool ingest data once per 24 hrs, while query happens every 15mins, caching result can definitely avoid the some duplicate calculation, I'll go with BD.

upvoted 1 times

...

Xinyuehong

2 years, 9 months ago

Selected Answer: DE

I think should be DE. since "the query is executed once every 15 minutes and the @parameter value is set to the current date", and the it receives new data once every 24 hours, it means the query result isn't change in one day even you run it every 15 mins. The data is static within a day. Replication could help the performance.

upvoted 2 times

...

anks84

2 years, 10 months ago

Selected Answer: BD

Answer is Correct !

upvoted 4 times

...