Exam DP-203 topic 2 question 44 discussion

Actual exam question from Microsoft's DP-203
Question #: 44
Topic #: 2

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You plan to create an Azure Databricks workspace that has a tiered structure. The workspace will contain the following three workloads:
✑ A workload for data engineers who will use Python and SQL.
✑ A workload for jobs that will run notebooks that use Python, Scala, and SQL.
✑ A workload that data scientists will use to perform ad hoc analysis in Scala and R.
The enterprise architecture team at your company identifies the following standards for Databricks environments:
✑ The data engineers must share a cluster.
✑ The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster.
✑ All the data scientists must be assigned their own cluster that terminates automatically after 120 minutes of inactivity. Currently, there are three data scientists.
You need to create the Databricks clusters for the workloads.
Solution: You create a Standard cluster for each data scientist, a Standard cluster for the data engineers, and a High Concurrency cluster for the jobs.
Does this meet the goal?

  • A. Yes
  • B. No
Suggested Answer: B

Comments

lukeonline
Highly Voted 2 years, 9 months ago
Selected Answer: B
B is correct, but the explanation is wrong.
✑ A workload for data engineers who will use Python and SQL. --> High Concurrency
✑ A workload for jobs that will run notebooks that use Python, Scala, and SQL. --> Standard
✑ A workload that data scientists will use to perform ad hoc analysis in Scala and R. --> Standard, because High Concurrency does not support Scala
https://stackoverflow.com/questions/65869399/high-concurrency-clusters-in-databricks
upvoted 40 times
mav2000
8 months, 1 week ago
B is correct, but there's an update: High Concurrency now supports Scala https://learn.microsoft.com/en-us/azure/databricks/compute/configure#--high-concurrency-clusters
- **Data scientists**: Each data scientist will have their own cluster, so a **Standard** cluster with auto-termination configured is sufficient.
- **Data engineers**: The data engineers should use a **High Concurrency** cluster, because there are many users.
- **Jobs**: It is fine for the jobs to use **High Concurrency**, because many jobs could be running on the cluster and High Concurrency now supports Scala.
upvoted 2 times
...
Gikan
9 months ago
You do not need to use High concurrency, because "Scala code will be executed inside the Spark JVM (per machine) that is shared between all users": https://learn.microsoft.com/en-us/answers/questions/924587/azure-databricks-scala-on-high-concurrency-cluster
upvoted 1 times
...
kamil_k
2 years, 7 months ago
or rather Scala does not support concurrent instances (but yes, it implies HC cluster will not support Scala)
upvoted 2 times
...
...
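The rule argued in this thread can be written down as a small check. This is only a sketch of the legacy reasoning (High Concurrency ran SQL, Python, and R, but not Scala), not current Databricks behaviour. The function name `pick_cluster_mode` and the `shared` flag are hypothetical, and treating the job cluster as shared is an assumption that does not change the result.

```python
# Legacy rule quoted in this thread: High Concurrency ran SQL, Python, and R, but not Scala.
HIGH_CONCURRENCY_LANGUAGES = {"sql", "python", "r"}


def pick_cluster_mode(languages, shared):
    """Return "High Concurrency" only for a shared workload whose languages all fit."""
    langs = {lang.lower() for lang in languages}
    if shared and langs <= HIGH_CONCURRENCY_LANGUAGES:
        return "High Concurrency"
    return "Standard"


# (languages, shared) per workload, taken from the question text.
workloads = {
    "data engineers": (["Python", "SQL"], True),
    "jobs": (["Python", "Scala", "SQL"], True),  # assumption: treated as shared
    "data scientists": (["Scala", "R"], False),
}

recommended = {
    name: pick_cluster_mode(langs, shared)
    for name, (langs, shared) in workloads.items()
}

# The solution proposed in the question.
proposed = {
    "data engineers": "Standard",
    "jobs": "High Concurrency",
    "data scientists": "Standard",
}

meets_goal = proposed == recommended
print(recommended)
print("Meets the goal:", meets_goal)
```

Under that legacy rule the proposed solution has the data engineers and the jobs swapped, which is why the thread settles on B.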
d046bc0
Most Recent 10 months, 2 weeks ago
Standard is enough for all workloads. Because of Scala, High Concurrency is possible only for the data engineers.
upvoted 2 times
...
Momoanwar
10 months, 3 weeks ago
Correct. ChatGPT: For the given scenario, where data engineers must share a cluster, data scientists need their own clusters with auto-termination, and a managed job cluster is required for running notebooks, the solution provided may not fully meet the goal. Here's why:
- Data engineers should share a cluster, so creating a single Standard cluster for all data engineers would meet this requirement.
- For data scientists, the solution suggests a Standard cluster for each, but it should specify that these clusters have auto-termination settings configured to minimize costs.
- The High Concurrency cluster is suitable for running jobs because it allows multiple users to share the cluster and run jobs concurrently. However, it should be managed as per the enterprise team's standards.
The provided solution does not fully adhere to these standards, especially regarding the auto-termination requirement for data scientists' clusters. Thus, the answer would be: B. No, the solution does not meet the goal.
upvoted 1 times
...
kkk5566
1 year, 1 month ago
Selected Answer: B
high concurrency does not support Scala
upvoted 2 times
...
akhil5432
1 year, 2 months ago
Selected Answer: B
Correct option is B-NO
upvoted 1 times
...
Rossana
1 year, 6 months ago
A) Yes. The use of a shared Standard cluster for data engineers, a High Concurrency cluster for jobs, and individual Standard clusters for each data scientist that auto-terminate after 120 minutes of inactivity aligns with the specified standards and is a valid approach for creating a tiered Databricks workspace.
upvoted 1 times
...
kckalahasthi
1 year, 10 months ago
https://docs.databricks.com/clusters/configure.html
upvoted 1 times
...
Igor85
1 year, 11 months ago
High Concurrency is already a legacy cluster mode, so the question is no longer relevant.
upvoted 2 times
...
greenlever
2 years ago
Selected Answer: A
Standard mode clusters can be shared by multiple users and terminate automatically. High Concurrency clusters, on the other hand, do not terminate automatically and do not support Scala workloads.
upvoted 2 times
...
Babu99
2 years, 1 month ago
NO IS CORRECT ANSWER
upvoted 1 times
...
Deeksha1234
2 years, 2 months ago
Correct, the answer is B. I agree with lukeonline.
upvoted 1 times
...
mkthoma3
2 years, 4 months ago
https://docs.microsoft.com/en-us/azure/databricks/clusters/configure
upvoted 1 times
...
Hanse
2 years, 7 months ago
As per Link: https://docs.azuredatabricks.net/clusters/configure.html
Standard and Single Node clusters terminate automatically after 120 minutes by default. --> Data Scientists
High Concurrency clusters do not terminate automatically by default.
A Standard cluster is recommended for a single user. --> Standard for Data Scientists & High Concurrency for Data Engineers
Standard clusters can run workloads developed in any language: Python, SQL, R, and Scala. High Concurrency clusters can run workloads developed in SQL, Python, and R. The performance and security of High Concurrency clusters is provided by running user code in separate processes, which is not possible in Scala. --> Jobs need Standard
upvoted 3 times
...
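For the auto-termination requirement, here is a rough sketch of what one cluster definition per data scientist might look like as a Clusters API style payload. The helper name, the cluster naming scheme, the user names, and the placeholder values for `spark_version` and `node_type_id` are all assumptions for illustration. Only the 120-minute idle timeout and the count of three data scientists come from the question.

```python
import json


def data_scientist_cluster_spec(user, idle_minutes=120):
    """Build a per-user cluster definition that stops after idle_minutes of inactivity."""
    return {
        "cluster_name": f"ds-{user}",            # hypothetical naming scheme
        "spark_version": "<runtime-version>",    # placeholder, not a real version string
        "node_type_id": "<node-type>",           # placeholder, not a real VM size
        "num_workers": 1,                        # assumption: small ad hoc cluster
        "autotermination_minutes": idle_minutes,
    }


# The question states there are currently three data scientists; names are made up.
scientists = ("scientist-1", "scientist-2", "scientist-3")
specs = [data_scientist_cluster_spec(user) for user in scientists]

print(json.dumps(specs[0], indent=2))
```

Setting the timeout explicitly, rather than relying on the Standard default mentioned above, keeps the requirement visible in the definition itself.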
bad_atitude
2 years, 10 months ago
B is correct
upvoted 2 times
...
alexleonvalencia
2 years, 10 months ago
Selected Answer: B
Correct answer: Standard for the scientists and jobs. High Concurrency for the data engineers.
upvoted 3 times
Sanand
2 years, 10 months ago
Agree! - Correct answer; Standard for Scientists and jobs. High concurrency for data engineers.
upvoted 3 times
...
...