Exam Certified Data Engineer Associate topic 1 question 39 discussion

Actual exam question from Databricks's Certified Data Engineer Associate

Question #: 39
Topic #: 1

A data analysis team has noticed that their Databricks SQL queries are running too slowly when connected to their always-on SQL endpoint. They claim that this issue is present when many members of the team are running small queries simultaneously. They ask the data engineering team for help. The data engineering team notices that each of the team’s queries uses the same SQL endpoint.
Which of the following approaches can the data engineering team use to improve the latency of the team’s queries?

A. They can increase the cluster size of the SQL endpoint.
B. They can increase the maximum bound of the SQL endpoint’s scaling range.
C. They can turn on the Auto Stop feature for the SQL endpoint.
D. They can turn on the Serverless feature for the SQL endpoint.
E. They can turn on the Serverless feature for the SQL endpoint and change the Spot Instance Policy to “Reliability Optimized.”

Suggested Answer: B 🗳️

by XiltroX at April 2, 2023, 7:24 p.m.

Comments

damaldon

Highly Voted 1 year, 10 months ago

Answer is B. According to the Databricks documentation: sequential queries -> increase the cluster size; concurrent queries -> scale out the cluster.

upvoted 31 times

...

mokrani

Highly Voted 1 year, 8 months ago

Answer B is correct. For those who selected the same answer as for question 40 in the Databricks exam training, be careful, because the two questions are quite different:
- Here the question is about queries run simultaneously -> scale out (add more clusters).
- In the Databricks exam training, the question is about "sequentially run queries" -> scale up (increase the size of the nodes).
Please refer to this accepted answer: https://community.databricks.com/t5/data-engineering/sequential-vs-concurrency-optimization-questions-from-query/td-p/36696
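The rule of thumb in this comment can be written down as a tiny lookup. This is only a sketch of the thread's reasoning; the `Workload` enum and `recommend_scaling` function are hypothetical names introduced here, not part of any Databricks API.

```python
from enum import Enum


class Workload(Enum):
    SEQUENTIAL = "sequential"  # one user, queries run one after another
    CONCURRENT = "concurrent"  # many users, queries run at the same time


def recommend_scaling(workload: Workload) -> str:
    """Return the scaling direction the thread associates with a workload."""
    if workload is Workload.SEQUENTIAL:
        return "scale up: increase the cluster size"
    if workload is Workload.CONCURRENT:
        return "scale out: raise the maximum number of clusters"
    raise ValueError(f"unknown workload: {workload!r}")


if __name__ == "__main__":
    print(recommend_scaling(Workload.CONCURRENT))
```

For this question (many small queries at once) the helper picks the scale-out branch, which corresponds to option B.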

upvoted 17 times

...

andie123

Most Recent 6 months ago

Selected Answer: A

When many users are running small queries simultaneously on a SQL warehouse (previously called a SQL endpoint), the database can become overloaded, causing slow query execution times. By increasing the cluster size of the SQL warehouse, the database can handle more simultaneous queries, resulting in faster query execution times. -> A

upvoted 1 times

...

806e7d2

8 months ago

Selected Answer: B

The issue described is related to query latency when multiple users are running queries simultaneously, all using the same SQL endpoint. This often leads to contention for resources, causing delays in query processing. To address this, the maximum scaling range of the SQL endpoint can be increased, which allows the endpoint to dynamically scale and handle more concurrent queries by adding more resources (e.g., additional nodes) as needed. In Databricks SQL, SQL endpoints can be scaled horizontally (adding more nodes) to better handle concurrency. By increasing the maximum scaling range, the endpoint will be able to scale more aggressively during periods of high load, improving query performance for concurrent users.
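A toy capacity model may make the effect of the scaling range more concrete. It assumes, purely for illustration, that each cluster can serve a fixed number of queries at once (`slots_per_cluster`, defaulting to 10 here as an assumed value, not a guarantee) and that the endpoint adds clusters up to its maximum bound. Real autoscaling behavior is more involved; the function names are hypothetical.

```python
import math


def clusters_running(concurrent_queries: int, min_clusters: int,
                     max_clusters: int, slots_per_cluster: int = 10) -> int:
    """Clusters the toy autoscaler would run, clamped to the scaling range."""
    if slots_per_cluster <= 0:
        raise ValueError("slots_per_cluster must be positive")
    if not 1 <= min_clusters <= max_clusters:
        raise ValueError("need 1 <= min_clusters <= max_clusters")
    wanted = math.ceil(concurrent_queries / slots_per_cluster)
    return max(min_clusters, min(wanted, max_clusters))


def queued_queries(concurrent_queries: int, min_clusters: int,
                   max_clusters: int, slots_per_cluster: int = 10) -> int:
    """Queries left waiting because every assumed slot is already taken."""
    running = clusters_running(concurrent_queries, min_clusters,
                               max_clusters, slots_per_cluster)
    return max(0, concurrent_queries - running * slots_per_cluster)


if __name__ == "__main__":
    # 25 simultaneous small queries, scaling range 1..1 versus 1..3.
    print(queued_queries(25, 1, 1))  # 15 queries wait in the queue
    print(queued_queries(25, 1, 3))  # 0 queries wait
```

Under these assumptions, a larger cluster size does not add slots, while a higher maximum bound does, which is why queueing disappears in the second call.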

upvoted 2 times

...

MohdAltaf19

9 months, 4 weeks ago

Correct answer: B
Throughput > Sequential > Scale Up
Performance > Concurrent > Scale Out

upvoted 2 times

...

7a22144

11 months ago

B is correct because increasing the maximum bound of the SQL endpoint’s scaling range allows the endpoint to handle a larger number of queries by automatically scaling out (adding more clusters). This addresses slow queries caused by high concurrent usage, because more resources become available to handle the load from simultaneous queries.

upvoted 2 times

...

benni_ale

1 year, 2 months ago

Selected Answer: B

"Simultaneously" probably means concurrently, so scaling out the cluster is better.

upvoted 1 times

...

sakis213

1 year, 3 months ago

Selected Answer: B

B is correct

upvoted 1 times

...

niharam2021

1 year, 5 months ago

upvoted 2 times

...

agAshish

1 year, 5 months ago

Answer is A , Q40 -- https://files.training.databricks.com/assessments/practice-exams/PracticeExam-DataEngineerAssociate.pdf

upvoted 3 times

avidlearner

5 months ago

In that question they mention that the endpoint is not being used by any other user, so it is a scale-up problem: queries from a single user are slow, which means the endpoint needs more processing power (vertical scaling). The approach there is to increase the cluster size. In the question above, the problem is many small queries run by multiple users at once, a concurrency problem that requires scaling out. In the SQL endpoint configuration, the Scaling option lets you set the minimum and maximum number of clusters, which is how you scale out by adding more units.
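As a rough sketch of what changing the Scaling option amounts to, the snippet below edits a settings dictionary so that only the maximum cluster count changes. The field names (`cluster_size`, `min_num_clusters`, `max_num_clusters`) follow my recollection of the Databricks SQL Warehouses REST API and should be verified against the current reference; the warehouse name and the `raise_max_clusters` helper are hypothetical, and no API call is made.

```python
import copy


def raise_max_clusters(settings: dict, new_max: int) -> dict:
    """Return a copy of the settings with a higher scale-out ceiling."""
    min_clusters = settings.get("min_num_clusters", 1)
    if new_max < min_clusters:
        raise ValueError("new_max must be >= min_num_clusters")
    updated = copy.deepcopy(settings)
    updated["max_num_clusters"] = new_max  # scale out: more clusters allowed
    return updated


# Hypothetical starting configuration: a single cluster, no room to scale out.
current = {
    "name": "analysts-warehouse",  # made-up name for illustration
    "cluster_size": "Small",       # scale-up knob (option A), left unchanged
    "min_num_clusters": 1,
    "max_num_clusters": 1,
}

updated = raise_max_clusters(current, 4)

if __name__ == "__main__":
    print(updated)
```

Keeping `cluster_size` untouched mirrors the distinction drawn in the comment: the cluster size is the vertical knob, the min/max cluster range is the horizontal one.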

upvoted 1 times

...

6aa83ae

10 months, 1 week ago

Different question.

upvoted 1 times

...

K_yamini

1 year, 5 months ago

The question in the practice set is slightly different if you look closely. In the first scenario, the data analyst notes slow query performance for sequentially run queries on a SQL endpoint that isn't shared with other users. This suggests that the problem may be related to the configuration or performance of the SQL endpoint itself rather than contention with other users. In the second scenario, the data analysis team experiences slow query performance when multiple team members are running queries simultaneously on the same SQL endpoint. This indicates potential resource contention or limitations on the SQL endpoint when handling concurrent queries from multiple users. Given these differences, the approaches to address the issues may also differ:

upvoted 2 times

...

Nika12

1 year, 5 months ago

Selected Answer: B

Just got 100% on the exam. B was correct. Also, here is a link to a good explanation: https://docs.databricks.com/en/compute/cluster-config-best-practices.html

upvoted 6 times

...

Ody__

1 year, 6 months ago

Selected Answer: A

A is correct

upvoted 1 times

...

Ody__

1 year, 6 months ago

Selected Answer: A

The correct answer is A. See question 40: https://files.training.databricks.com/assessments/practice-exams/PracticeExam-DataEngineerAssociate.pdf

upvoted 2 times

AdamNowak

6 months, 3 weeks ago

This question is about concurrent small queries; the one in the PDF is about sequential queries.

upvoted 1 times

...

CommanderBigMac

10 months, 1 week ago

Completely different question

upvoted 1 times

...

SerGrey

1 year, 6 months ago

Selected Answer: B

B is correct

upvoted 2 times

...

nedlo

1 year, 7 months ago

Selected Answer: B

It's B because it's "simultaneously by many users", so you have to scale horizontally by increasing the number of nodes: https://community.databricks.com/t5/data-engineering/sequential-vs-concurrency-optimization-questions-from-query/td-p/36696

upvoted 3 times

...

pc1337xd

1 year, 8 months ago

Selected Answer: B

Issues occur when too many users are running queries at the same time -> Increase scaling so more clusters handle the queries

upvoted 5 times

...

god_father

1 year, 8 months ago

Selected Answer: B

Increasing cluster size is for vertical scalability of query execution, while scaling out cluster is for horizontal scalability of query execution

upvoted 2 times

...
