exam questions

Exam DP-203 All Questions

View all questions & answers for the DP-203 exam

Exam DP-203 topic 2 question 90 discussion

Actual exam question from Microsoft's DP-203
Question #: 90
Topic #: 2
[All DP-203 Questions]

You have an Azure subscription that contains an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 receives new data once every 24 hours.
You have the following function.

You have the following query.

The query is executed once every 15 minutes and the @parameter value is set to the current date.
You need to minimize the time it takes for the query to return results.
Which two actions should you perform? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.

  • A. Create an index on the avg_f column.
  • B. Convert the avg_c column into a calculated column.
  • C. Create an index on the sensorid column.
  • D. Enable result set caching.
  • E. Change the table distribution to replicate.
Show Suggested Answer Hide Answer
Suggested Answer: BD 🗳️

Comments

Chosen Answer:
This is a voting comment (?). It is better to Upvote an existing comment if you don't have anything to add.
Switch to a voting comment New
esaade
Highly Voted 2 years, 1 month ago
Selected Answer: BD
B. Convert the avg_c column into a calculated column. D. Enable result set caching. Explanation: A calculated column is a column that uses an expression to calculate its value based on other columns in the same table. In this case, the udfFtoC function can be used to calculate the avg_c value based on the avg_temperature column, eliminating the need to call the UDF in the SELECT statement. Enabling result set caching can improve query performance by caching the result set of the query, so subsequent queries that use the same parameters can be retrieved from the cache instead of executing the query again. Creating an index on the avg_f column or the sensorid column is not useful because there are no join or filter conditions on these columns in the WHERE clause. Changing the table distribution to replicate is also not necessary because it does not affect the query performance in this scenario
upvoted 15 times
...
f2a9aa5
Most Recent 8 months, 3 weeks ago
No, Azure Synapse Analytics dedicated SQL pool does not support calculated (or computed) columns1. This means you cannot define columns in your table that automatically compute their values based on other columns in the same table. If you need to include calculated values, you can handle the calculations in your ETL (Extract, Transform, Load) process before loading the data into the dedicated SQL pool. Alternatively, you can perform the calculations in your queries when retrieving data from the table. https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-overview#unsupported-table-features
upvoted 2 times
...
iceberge
8 months, 4 weeks ago
Copilot: (DC) Converting the avg_c column into a calculated column (Option B) might not be as effective for improving query performance in this scenario. Calculated columns can be useful for simplifying queries and ensuring consistent calculations, but they don’t necessarily improve performance, especially if the calculation is complex or if the column is frequently queried. In contrast, enabling result set caching (Option D) and creating an index on the sensorid column (Option C) directly target performance improvements by reducing query execution time and speeding up data retrieval
upvoted 1 times
...
AccountHatz
1 year, 1 month ago
Selected Answer: BE
I don't think D is a solution , "What's not cached .... Queries using user defined functions" from https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/performance-tuning-result-set-caching
upvoted 2 times
SajadAhm
8 months, 1 week ago
notice that the sink table in dedicated SQL pool gets updated every 24 hours, but this query(due to some reasons) runs every 15 minutes; as a result, for 24 * 4 - 1 times, there is not new result to show. so, if the data is already in cache, db engine can access them faster.
upvoted 1 times
...
...
dakku987
1 year, 3 months ago
Selected Answer: AC
chatgpt To minimize the time it takes for the query to return results, you should consider the following actions: A. Create an index on the avg_f column: Creating an index on the avg_f column can improve the query performance, especially if there are frequent searches or filtering based on this column. C. Create an index on the sensorid column: If the sensorid column is frequently used in filtering or joins, creating an index on this column can improve the query performance.
upvoted 2 times
...
OldSchool
1 year, 6 months ago
Selected Answer: DE
First the wording of the question is ridiculous. "Query is executed once every 15 minutes". So what is it, "Once" or "every 15 minutes"? Either way, they are asking what to do to speed up the query. D Setting result caching E Replicated distribution
upvoted 1 times
...
kkk5566
1 year, 7 months ago
Selected Answer: DE
correct
upvoted 1 times
...
kkk5566
1 year, 7 months ago
Selected Answer: BD
correct
upvoted 1 times
kkk5566
1 year, 7 months ago
D & E should be correct
upvoted 1 times
...
...
Matt2000
1 year, 8 months ago
Calculated columns exist in Power BI, not dedicated SQL pools. Computed columns are not supported in dedicated SQL pools. Ref: https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-overview
upvoted 4 times
...
dumbled
1 year, 11 months ago
Selected Answer: BD
correct
upvoted 2 times
...
Lestrang
2 years, 3 months ago
Selected Answer: AB
With that point by erhard being made (caching does work with queries using UDF), the most commonly voted D is wrong, so B and what now? Replicated cannot be right because it received date everyday and has aggregations so not a dim table and we have no clue about its size. by elimination that leaves us A and C Indexing is less useful with no joins but it does improve some performance being on where clause target. so I'd go with A and B.
upvoted 1 times
Lestrang
2 years, 3 months ago
Creating an index on the avg_f column will improve the performance of the query, as it will allow the query to find the relevant data more quickly. Converting the avg_c column into a calculated column will allow the query to return the temperature in Celsius without the need to perform the calculation at runtime, which will also improve the performance of the query.
upvoted 1 times
Lestrang
2 years, 2 months ago
After re-considering, I am unsure whether the indexing would help. That would only leave Replication as the viable option even though it is not viable design but the request is to minimize query time and that is what it will do, so I guess final answer is BE
upvoted 1 times
...
...
AccountHatz
1 year, 1 month ago
"2 GB is not a hard limit. If the data is static and does not change, you can replicate larger tables." https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/design-guidance-for-replicated-tables
upvoted 1 times
...
...
Karforcerts
2 years, 4 months ago
Selected Answer: BD
need to first chage UDF to a calculated column and then enable result set caching. agreed with the answer
upvoted 4 times
...
erhard
2 years, 5 months ago
Queries using user defined functions are not cached. https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/performance-tuning-result-set-caching
upvoted 4 times
...
kl8585
2 years, 5 months ago
Selected Answer: DE
A,C not right since index don't help if join are not involved. D for sure help query performance. I don't get why B: "A computed column is a virtual column whose value is calculated from other values in the table. By default, the expression’s outputted value is not physically stored. Instead, SQL Server runs the expression when the column is queried and returns the value as part of the result set ... In many cases, non-persistent computed columns put too much burden on the processor, resulting in SLOWER QUERIES and unresponsive applications" Since the only requirements is faster execution times for queries, i don't think calculated columns will improve performance. Si second option for me would be D (replicate). Although it will cause more effort writing, because updates should be written to every partition, optimized writes aren't a requirement in the question.
upvoted 3 times
...
rzeng
2 years, 5 months ago
pool ingest data once per 24 hrs, while query happens every 15mins, caching result can definitely avoid the some duplicate calculation, I'll go with BD.
upvoted 1 times
...
Xinyuehong
2 years, 6 months ago
Selected Answer: DE
I think should be DE. since "the query is executed once every 15 minutes and the @parameter value is set to the current date", and the it receives new data once every 24 hours, it means the query result isn't change in one day even you run it every 15 mins. The data is static within a day. Replication could help the performance.
upvoted 2 times
...
anks84
2 years, 7 months ago
Selected Answer: BD
Answer is Correct !
upvoted 4 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted
A voting comment increases the vote count for the chosen answer by one.

Upvoting a comment with a selected answer will also increase the vote count towards that answer by one. So if you see a comment that you already agree with, you can upvote it instead of posting a new comment.

SaveCancel
Loading ...
exam
Someone Bought Contributor Access for:
SY0-701
London, 1 minute ago