Exam DP-203 All Questions

View all questions & answers for the DP-203 exam

Exam DP-203 topic 2 question 44 discussion

Actual exam question from Microsoft's DP-203

Question #: 44
Topic #: 2

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You plan to create an Azure Databricks workspace that has a tiered structure. The workspace will contain the following three workloads:
✑ A workload for data engineers who will use Python and SQL.
✑ A workload for jobs that will run notebooks that use Python, Scala, and SQL.
✑ A workload that data scientists will use to perform ad hoc analysis in Scala and R.
The enterprise architecture team at your company identifies the following standards for Databricks environments:
✑ The data engineers must share a cluster.
✑ The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster.
✑ All the data scientists must be assigned their own cluster that terminates automatically after 120 minutes of inactivity. Currently, there are three data scientists.
You need to create the Databricks clusters for the workloads.
Solution: You create a Standard cluster for each data scientist, a Standard cluster for the data engineers, and a High Concurrency cluster for the jobs.
Does this meet the goal?

A. Yes
B. No

Suggested Answer: B

by alexleonvalencia at Dec. 10, 2021, 2:01 a.m.

Comments

lukeonline

Highly Voted 2 years, 9 months ago

Selected Answer: B

B is correct, but the explanation is wrong.
✑ A workload for data engineers who will use Python and SQL. --> High Concurrency
✑ A workload for jobs that will run notebooks that use Python, Scala, and SQL. --> Standard
✑ A workload that data scientists will use to perform ad hoc analysis in Scala and R. --> Standard, because High Concurrency does not support Scala
https://stackoverflow.com/questions/65869399/high-concurrency-clusters-in-databricks
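The mapping in this comment can be sketched as cluster specifications in the shape accepted by the Databricks Clusters API. This is only an illustration under assumptions: the runtime version, node type, worker count, and cluster names are hypothetical placeholders that the question does not give, and the `spark_conf` keys and `ResourceClass` tag are the ones the legacy High Concurrency mode is commonly configured with, so check them against the current documentation before relying on them.

```python
import json

# Hypothetical placeholders: the question does not specify a runtime or VM size.
SPARK_VERSION = "<runtime-version>"
NODE_TYPE_ID = "<node-type-id>"


def standard_cluster(name, autotermination_minutes=None):
    """Build a Standard-mode cluster spec (num_workers is an assumed value)."""
    spec = {
        "cluster_name": name,
        "spark_version": SPARK_VERSION,
        "node_type_id": NODE_TYPE_ID,
        "num_workers": 2,
    }
    if autotermination_minutes is not None:
        spec["autotermination_minutes"] = autotermination_minutes
    return spec


def high_concurrency_cluster(name):
    """Build a legacy High Concurrency spec (settings assumed from legacy docs)."""
    spec = standard_cluster(name)
    spec["spark_conf"] = {
        "spark.databricks.cluster.profile": "serverless",
        "spark.databricks.repl.allowedLanguages": "sql,python,r",
    }
    spec["custom_tags"] = {"ResourceClass": "Serverless"}
    return spec


def build_cluster_specs(data_scientists):
    """One Standard cluster per data scientist (120-minute auto-termination),
    one shared High Concurrency cluster for engineers, one Standard job cluster."""
    specs = [
        standard_cluster(f"ds-{person}", autotermination_minutes=120)
        for person in data_scientists
    ]
    specs.append(high_concurrency_cluster("data-engineers-shared"))
    specs.append(standard_cluster("jobs"))
    return specs


if __name__ == "__main__":
    # The three names below are made up for the example.
    print(json.dumps(build_cluster_specs(["a", "b", "c"]), indent=2))
```

Each dictionary would be sent as the body of a cluster-create request; the sketch stops short of making any API call.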

upvoted 40 times

mav2000

8 months, 1 week ago

B is correct, but there's an update: High Concurrency now supports Scala https://learn.microsoft.com/en-us/azure/databricks/compute/configure#--high-concurrency-clusters
- **Data scientists**: Each data scientist will have their own cluster, so a **Standard** cluster with auto-termination configured is enough.
- **Data engineers**: The data engineers should use a **High Concurrency** cluster, because there are many users.
- **Jobs**: It is fine for the jobs to use **High Concurrency**, because many jobs could be running on the cluster and High Concurrency now supports Scala.

upvoted 2 times

...

Gikan

9 months ago

You do not need to use High concurrency, because "Scala code will be executed inside the Spark JVM (per machine) that is shared between all users": https://learn.microsoft.com/en-us/answers/questions/924587/azure-databricks-scala-on-high-concurrency-cluster

upvoted 1 times

...

kamil_k

2 years, 7 months ago

or rather Scala does not support concurrent instances (but yes, it implies HC cluster will not support Scala)

upvoted 2 times

...

d046bc0

Most Recent 10 months, 2 weeks ago

Standard is enough for all workloads. Because of Scala, High Concurrency is possible only for the data engineers.

upvoted 2 times

...

Momoanwar

10 months, 3 weeks ago

Correct. ChatGPT: For the given scenario, where data engineers must share a cluster, data scientists need their own clusters with auto-termination, and a managed job cluster is required for running notebooks, the solution provided may not fully meet the goal. Here's why:
- Data engineers should share a cluster, so creating a single Standard cluster for all data engineers would meet this requirement.
- For data scientists, the solution suggests a Standard cluster for each, but it should specify that these clusters have auto-termination settings configured to minimize costs.
- The High Concurrency cluster is suitable for running jobs because it allows multiple users to share the cluster and run jobs concurrently. However, it should be managed as per the enterprise team's standards.
The provided solution does not fully adhere to these standards, especially regarding the auto-termination requirement for data scientists' clusters. Thus, the answer would be: B. No, the solution does not meet the goal.

upvoted 1 times

...

kkk5566

1 year, 1 month ago

Selected Answer: B

high concurrency does not support Scala

upvoted 2 times

...

akhil5432

1 year, 2 months ago

Selected Answer: B

The correct option is B (No).

upvoted 1 times

...

Rossana

1 year, 6 months ago

A) Yes. The use of a shared Standard cluster for data engineers, a High Concurrency cluster for jobs, and individual Standard clusters for each data scientist that auto-terminate after 120 minutes of inactivity aligns with the specified standards and is a valid approach for creating a tiered Databricks workspace.

upvoted 1 times

...

kckalahasthi

1 year, 10 months ago

https://docs.databricks.com/clusters/configure.html

upvoted 1 times

...

Igor85

1 year, 11 months ago

High Concurrency is already a legacy cluster mode. The question is not relevant anymore.

upvoted 2 times

...

greenlever

2 years ago

Selected Answer: A

Standard mode clusters can be shared by multiple users and terminate automatically. High Concurrency clusters, on the other hand, do not terminate automatically and do not support Scala workloads.

upvoted 2 times

...

Babu99

2 years, 1 month ago

No is the correct answer.

upvoted 1 times

...

Deeksha1234

2 years, 2 months ago

Correct, the answer is B. I agree with lukeonline.

upvoted 1 times

...

mkthoma3

2 years, 4 months ago

https://docs.microsoft.com/en-us/azure/databricks/clusters/configure

upvoted 1 times

...

Hanse

2 years, 7 months ago

As per link: https://docs.azuredatabricks.net/clusters/configure.html
Standard and Single Node clusters terminate automatically after 120 minutes by default. --> Data Scientists
High Concurrency clusters do not terminate automatically by default.
A Standard cluster is recommended for a single user. --> Standard for Data Scientists & High Concurrency for Data Engineers
Standard clusters can run workloads developed in any language: Python, SQL, R, and Scala.
High Concurrency clusters can run workloads developed in SQL, Python, and R. The performance and security of High Concurrency clusters is provided by running user code in separate processes, which is not possible in Scala. --> Jobs need Standard
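The language rules quoted above can be turned into a small check of the proposed solution. This is a minimal sketch, assuming the language support stated in the documentation quoted in this comment (other comments in this thread note that High Concurrency has since gained Scala support); the dictionary and function names are made up for illustration.

```python
# Language support per cluster mode, as stated in the documentation quoted above.
SUPPORTED_LANGUAGES = {
    "standard": {"python", "sql", "r", "scala"},
    "high_concurrency": {"python", "sql", "r"},
}


def unsupported_languages(cluster_mode, workload_languages):
    """Return the workload languages that the given cluster mode cannot run."""
    return set(workload_languages) - SUPPORTED_LANGUAGES[cluster_mode]


# The solution proposed in the question: (cluster mode, languages the workload uses).
proposed_solution = {
    "data scientists": ("standard", {"scala", "r"}),
    "data engineers": ("standard", {"python", "sql"}),
    "jobs": ("high_concurrency", {"python", "scala", "sql"}),
}

for workload, (mode, languages) in proposed_solution.items():
    missing = unsupported_languages(mode, languages)
    status = "ok" if not missing else f"unsupported: {sorted(missing)}"
    print(f"{workload}: {mode} -> {status}")
```

Under these rules only the jobs cluster is flagged, because its notebooks use Scala on a High Concurrency cluster.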

upvoted 3 times

...

bad_atitude

2 years, 10 months ago

B is correct

upvoted 2 times

...

alexleonvalencia

2 years, 10 months ago

Selected Answer: B

Correct answer: Standard for the data scientists and jobs; High Concurrency for the data engineers.

upvoted 3 times

Sanand

2 years, 10 months ago

Agree! - Correct answer; Standard for Scientists and jobs. High concurrency for data engineers.

upvoted 3 times

...
