Exam Certified Data Engineer Associate All Questions

View all questions & answers for the Certified Data Engineer Associate exam

Exam Certified Data Engineer Associate topic 1 question 82 discussion

Actual exam question from Databricks's Certified Data Engineer Associate

Question #: 82
Topic #: 1


A data engineering team has noticed that their Databricks SQL queries are running too slowly when they are submitted to a non-running SQL endpoint. The data engineering team wants this issue to be resolved.

Which of the following approaches can the team use to reduce the time it takes to return results in this scenario?

A. They can turn on the Serverless feature for the SQL endpoint and change the Spot Instance Policy to "Reliability Optimized."
B. They can turn on the Auto Stop feature for the SQL endpoint.
C. They can increase the cluster size of the SQL endpoint.
D. They can turn on the Serverless feature for the SQL endpoint.
E. They can increase the maximum bound of the SQL endpoint's scaling range.


Suggested Answer: D 🗳️

by meow_akk at Oct. 22, 2023, 5:52 a.m.

Comments


carpa_jo

Highly Voted 10 months, 3 weeks ago

Selected Answer: D

The important point of this scenario is "when they are submitted to a non-running SQL endpoint". So it's not about increasing the instance size or the number of instances to improve query performance; it's about reducing the start-up time.

A: Not possible. Serverless can't be combined with spot instance policies, see https://docs.databricks.com/en/compute/sql-warehouse/serverless.html#limitations
B: Auto Stop is about terminating a SQL warehouse after x minutes of being idle.
C: Increasing the cluster size provides more capacity for running queries, but doesn't reduce start-up time.
D: Serverless reduces start-up time from minutes to seconds. Jackpot!
E: Increasing the max bound of the SQL endpoint's scaling range will help with lots of concurrent queries, which is not the case here.

upvoted 18 times
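To make the option-by-option reasoning above concrete, here is a minimal sketch that maps each answer option to the warehouse setting it would change. The field names (`enable_serverless_compute`, `spot_instance_policy`, `auto_stop_mins`, `cluster_size`, `max_num_clusters`) are taken from the Databricks SQL Warehouses REST API as I understand it, so verify them against the current API reference. The concrete values and the helper `helps_cold_start` are hypothetical, for illustration only. Nothing here calls Databricks.

```python
# Sketch only: maps each answer option to the warehouse settings it would change.
# Field names are assumed to match the Databricks SQL Warehouses REST API;
# the values (10, "Large", 4) are illustrative, not recommendations.
OPTION_SETTINGS = {
    "A": {"enable_serverless_compute": True,
          "spot_instance_policy": "RELIABILITY_OPTIMIZED"},
    "B": {"auto_stop_mins": 10},
    "C": {"cluster_size": "Large"},
    "D": {"enable_serverless_compute": True},
    "E": {"max_num_clusters": 4},
}


def helps_cold_start(option: str) -> bool:
    """Return True if the option shortens start-up of a stopped warehouse.

    Hypothetical helper encoding the thread's reasoning: only serverless
    changes start-up time, and (per the comment above) serverless cannot be
    combined with a spot instance policy.
    """
    settings = OPTION_SETTINGS[option]
    serverless = settings.get("enable_serverless_compute", False)
    conflicts = serverless and "spot_instance_policy" in settings
    return serverless and not conflicts


if __name__ == "__main__":
    for option in "ABCDE":
        print(option, helps_cold_start(option))
```

Under these assumptions only option D passes the check, which matches the suggested answer.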

...

azure_bimonster

Most Recent 10 months, 1 week ago

Selected Answer: D

D is correct. The key phrase is "submitted to a non-running SQL endpoint". Increasing the cluster size is not going to help if the endpoint is not running.

upvoted 1 times

...

bartfto

10 months, 3 weeks ago

Selected Answer: D

"when they are submitted to a non-running SQL endpoint" ANSWER D

upvoted 1 times

...

Garyn

11 months ago

Selected Answer: C

C. They can increase the cluster size of the SQL endpoint. Explanation: Increasing the cluster size of the SQL endpoint can enhance query performance by providing more computational resources to execute queries. This can potentially speed up query processing by allowing more parallelism, handling larger workloads, and reducing the time taken for query execution.

upvoted 1 times

...

AndreFR

11 months, 1 week ago

Key phrase: “non-running SQL endpoint”, which implies that the query is slow because the cluster needs time to get started. I suggest answer D because:

A: Serverless and spot instances cannot be mixed?
B: Auto Stop means that jobs are submitted to non-running SQL endpoints.
C: Increasing the cluster size can compensate for slow startup time.
D: Serverless starts and scales faster when the SQL endpoint is not running (seconds instead of minutes).
E: Increasing the maximum bound will help only if there are simultaneous queries.

https://docs.gcp.databricks.com/en/lakehouse-architecture/cost-optimization/best-practices.html#use-serverless-for-your-workloads

upvoted 4 times
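The start-up argument in this thread can be illustrated with a toy model in which time to result is start-up time (if the endpoint is stopped) plus execution time. This is only a sketch: the `Warehouse` class and every number below are hypothetical, chosen to reflect the "minutes versus seconds" claim made in the comments, not measured Databricks figures.

```python
from dataclasses import dataclass


@dataclass
class Warehouse:
    """Toy model of a SQL endpoint; all values are illustrative assumptions."""
    startup_seconds: float    # time to go from stopped to running
    execution_seconds: float  # time to run the query once running

    def time_to_result(self, running: bool) -> float:
        # A stopped endpoint pays the start-up cost before executing the query.
        startup = 0.0 if running else self.startup_seconds
        return startup + self.execution_seconds


# Hypothetical numbers: a classic endpoint that takes minutes to start,
# a larger classic endpoint that executes faster but starts just as slowly,
# and a serverless endpoint that starts in seconds.
classic = Warehouse(startup_seconds=240.0, execution_seconds=20.0)
larger = Warehouse(startup_seconds=240.0, execution_seconds=10.0)
serverless = Warehouse(startup_seconds=8.0, execution_seconds=20.0)

if __name__ == "__main__":
    for name, wh in [("classic", classic), ("larger", larger), ("serverless", serverless)]:
        print(name, wh.time_to_result(running=False))
```

With these made-up numbers, the larger endpoint saves only the execution-time difference on a cold start, while the serverless endpoint removes most of the wait, which is the point the D voters are making.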

...

olaru

11 months, 2 weeks ago

Selected Answer: E

maximum bound of the SQL endpoint's scaling range

upvoted 2 times

...

nedlo

11 months, 3 weeks ago

Selected Answer: C

D is wrong - it's already Serverless (non-running SQL endpoint), so how would turning Serverless ON help? They also say C here: https://community.databricks.com/t5/data-engineering/when-to-increase-maximum-bound-vs-when-to-increase-cluster-size/td-p/27880 . E is only true for autoscaling clusters.

upvoted 2 times

...

msengupta

12 months ago

Selected Answer: C

https://community.databricks.com/t5/data-engineering/sql-query-takes-too-long-to-run/td-p/21884

upvoted 2 times

...

Syd

1 year ago

Answer E: https://www.databricks.com/blog/2022/03/10/top-5-databricks-performance-tips.html

upvoted 2 times

Syd

1 year ago

I mean answer C

upvoted 1 times

...

meow_akk

1 year, 1 month ago

Ans E: you're welcome :) https://community.databricks.com/t5/data-engineering/when-to-increase-maximum-bound-vs-when-to-increase-cluster-size/td-p/27880

upvoted 1 times

mike_stewart

1 year, 1 month ago

I don't agree. Your answer is only valid when 'sequential' is mentioned, which is not the case here.

upvoted 1 times

...
