Exam DP-600 topic 1 question 5 discussion

Actual exam question from Microsoft's DP-600

Question #: 5
Topic #: 1

HOTSPOT -

Case study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.

To start the case study -
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.

Overview -
Litware, Inc. is a manufacturing company that has offices throughout North America. The analytics team at Litware contains data engineers, analytics engineers, data analysts, and data scientists.

Existing Environment -

Fabric Environment -
Litware has been using a Microsoft Power BI tenant for three years. Litware has NOT enabled any Fabric capacities and features.

Available Data -
Litware has data that must be analyzed as shown in the following table.

The Product data contains a single table and the following columns.

The customer satisfaction data contains the following tables:

Survey -

Question -

Response -
For each survey submitted, the following occurs:
One row is added to the Survey table.
One row is added to the Response table for each question in the survey.
The Question table contains the text of each survey question. The third question in each survey response is an overall satisfaction score. Customers can submit a survey after each purchase.

User Problems -
The analytics team has large volumes of data, some of which is semi-structured. The team wants to use Fabric to create a new data store.
Product data is often classified into three pricing groups: high, medium, and low. This logic is implemented in several databases and semantic models, but the logic does NOT always match across implementations.

Requirements -

Planned Changes -
Litware plans to enable Fabric features in the existing tenant. The analytics team will create a new data store as a proof of concept (PoC). The remaining Litware users will only get access to the Fabric features once the PoC is complete. The PoC will be completed by using a Fabric trial capacity.
The following three workspaces will be created:
AnalyticsPOC: Will contain the data store, semantic models, reports, pipelines, dataflows, and notebooks used to populate the data store
DataEngPOC: Will contain all the pipelines, dataflows, and notebooks used to populate OneLake
DataSciPOC: Will contain all the notebooks and reports created by the data scientists
The following will be created in the AnalyticsPOC workspace:
A data store (type to be decided)
A custom semantic model
A default semantic model
Interactive reports
The data engineers will create data pipelines to load data to OneLake either hourly or daily depending on the data source. The analytics engineers will create processes to ingest, transform, and load the data to the data store in the AnalyticsPOC workspace daily. Whenever possible, the data engineers will use low-code tools for data ingestion. The choice of which data cleansing and transformation tools to use will be at the data engineers’ discretion.
All the semantic models and reports in the AnalyticsPOC workspace will use the data store as the sole data source.

Technical Requirements -
The data store must support the following:
Read access by using T-SQL or Python
Semi-structured and unstructured data
Row-level security (RLS) for users executing T-SQL queries
Files loaded by the data engineers to OneLake will be stored in the Parquet format and will meet Delta Lake specifications.
Data will be loaded without transformation in one area of the AnalyticsPOC data store. The data will then be cleansed, merged, and transformed into a dimensional model.
The data load process must ensure that the raw and cleansed data is updated completely before populating the dimensional model.
The dimensional model must contain a date dimension. There is no existing data source for the date dimension. The Litware fiscal year matches the calendar year. The date dimension must always contain dates from 2010 through the end of the current year.
The product pricing group logic must be maintained by the analytics engineers in a single location. The pricing group data must be made available in the data store for T-SQL queries and in the default semantic model. The following logic must be used:
List prices that are less than or equal to 50 are in the low pricing group.
List prices that are greater than 50 and less than or equal to 1,000 are in the medium pricing group.
List prices that are greater than 1,000 are in the high pricing group.
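The pricing group thresholds and the date dimension range above can be sketched in Python to make the boundary cases explicit. This is only an illustration under stated assumptions: the function names and the row fields (`date`, `year`, `month`, `day`) are hypothetical, and in the PoC the logic would more likely live in a single object inside the data store so that T-SQL queries and the default semantic model share it.

```python
from datetime import date, timedelta


def pricing_group(list_price: float) -> str:
    """Classify a list price with the thresholds stated in the requirements."""
    if list_price <= 50:
        return "low"        # less than or equal to 50
    if list_price <= 1000:
        return "medium"     # greater than 50 and up to 1,000 inclusive
    return "high"           # greater than 1,000


def date_dimension(today: date, start_year: int = 2010) -> list[dict]:
    """Build one row per day from 1 January of start_year
    through 31 December of the year that contains `today`."""
    current = date(start_year, 1, 1)
    end = date(today.year, 12, 31)
    rows = []
    while current <= end:
        rows.append({
            "date": current,
            "year": current.year,   # fiscal year matches the calendar year
            "month": current.month,
            "day": current.day,
        })
        current += timedelta(days=1)
    return rows
```

Passing the current date as a parameter, instead of hard-coding an end year, is what lets the dimension always extend to the end of the current year whenever the load runs.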

Security Requirements -
Only Fabric administrators and the analytics team must be able to see the Fabric items created as part of the PoC.
Litware identifies the following security requirements for the Fabric items in the AnalyticsPOC workspace:
Fabric administrators will be the workspace administrators.
The data engineers must be able to read from and write to the data store. No access must be granted to datasets or reports.
The analytics engineers must be able to read from, write to, and create schemas in the data store. They also must be able to create and share semantic models with the data analysts and view and modify all reports in the workspace.
The data scientists must be able to read from the data store, but not write to it. They will access the data by using a Spark notebook.
The data analysts must have read access to only the dimensional model objects in the data store. They also must have access to create Power BI reports by using the semantic models created by the analytics engineers.
The date dimension must be available to all users of the data store.
The principle of least privilege must be followed.
Both the default and custom semantic models must include only tables or views from the dimensional model in the data store.
Litware already has the following Microsoft Entra security groups:
FabricAdmins: Fabric administrators
AnalyticsTeam: All the members of the analytics team
DataAnalysts: The data analysts on the analytics team
DataScientists: The data scientists on the analytics team
DataEngineers: The data engineers on the analytics team
AnalyticsEngineers: The analytics engineers on the analytics team

Report Requirements -
The data analysts must create a customer satisfaction report that meets the following requirements:
Enables a user to select a product to filter customer survey responses to only those who have purchased that product.
Displays the average overall satisfaction score of all the surveys submitted during the last 12 months up to a selected date.
Shows data as soon as the data is updated in the data store.
Ensures that the report and the semantic model only contain data from the current and previous year.
Ensures that the report respects any table-level security specified in the source data store.
Minimizes the execution time of report queries.
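The "last 12 months up to a selected date" requirement can be illustrated with a small Python sketch. Everything about the data shape here is an assumption, because the table definitions are not shown in the case study: `surveys` is a hypothetical mapping of survey ID to submission date, `responses` is a hypothetical list of `(survey_id, question_number, score)` tuples, and the window is assumed to exclude the start date and include the selected date. In the actual report, this calculation would be a measure in the semantic model rather than Python.

```python
from datetime import date


def window_start(selected: date) -> date:
    """Return the same calendar day one year before the selected date."""
    try:
        return selected.replace(year=selected.year - 1)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; fall back to 28 February.
        return selected.replace(year=selected.year - 1, day=28)


def average_satisfaction(surveys, responses, selected):
    """Average the overall satisfaction score (the third question) over
    surveys submitted in the 12 months ending on the selected date."""
    start = window_start(selected)
    scores = [
        score
        for survey_id, question_number, score in responses
        if question_number == 3 and start < surveys[survey_id] <= selected
    ]
    # Return None when no survey falls inside the window.
    return sum(scores) / len(scores) if scores else None
```

Responses to other questions are ignored, so only one score per survey contributes to the average.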
You need to assign permissions for the data store in the AnalyticsPOC workspace. The solution must meet the security requirements.
Which additional permissions should you assign when you share the data store? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

Suggested Answer:

by XiltroX at Feb. 12, 2024, 4:19 p.m.

Comments

Bharat

Highly Voted 1 year, 2 months ago

Here is my take on it:
Data Engineers: Read all Apache Spark, because they need to be able to work with Spark for data curation.
Data Analysts: Build Reports on the default dataset, because they are report builders.
Data Scientists: Read SQL Endpoints, because they leverage the data curated by the engineers to do predictive analytics.
Let me know what you think.

upvoted 86 times

scorradi

3 months, 3 weeks ago

My guess, based on the question description:
Data Analysts --> Build Reports on the default dataset.
Data Scientists --> Spark, because they need to use notebooks, as the question says.
Data Engineers --> Spark, because they need to write data.

upvoted 2 times

...

Training_be2

4 months, 1 week ago

Data Analysts --> Build Reports on the default dataset
Data Scientists will access the data through a notebook --> Read all Apache Spark
Data Engineers will read and write to the data store (lakehouse) --> Read all Apache Spark
The SQL analytics endpoint is read-only. You can analyze Delta tables by using T-SQL (instead of Spark SQL), save functions, or create views. However, you cannot write data to a Delta table.

upvoted 3 times

...

7d97b62

10 months ago

I think Data Analysts Building Reports on the default dataset is correct

upvoted 1 times

...

janineh

10 months, 2 weeks ago

it says the data scientist need to use spark - so spark for them in my opinion

upvoted 5 times

...

...

vissu_settipally

Highly Voted 1 year, 2 months ago

Data Engineers = Read SQL endpoints
Data Analyst = Build Reports
Data Scientist = They prefer Spark, always

upvoted 38 times

semauni

5 months, 3 weeks ago

agreed for the analyst and scientist, but the engineers also need spark access since they will be making notebooks

upvoted 1 times

SVCDIA

2 months, 3 weeks ago

Engineers will be making notebooks in the DataEngPOC workspace, not in the AnalyticsPOC workspace, so they don't need to create any in that workspace.

upvoted 1 times

...

Rakesh16

Most Recent 5 months, 1 week ago

Data Engineers --> Read all Apache Spark
Data Analysts --> Build reports on the default dataset
Data Scientists --> Read all Apache Spark

upvoted 12 times

...

Richdata23

5 months, 2 weeks ago

DataEngineers: "Read All Apache Spark"
Explanation: To load and manage data in OneLake, data engineers need the ability to read and write data in Spark-compatible formats, which is ideal for handling semi-structured and unstructured data.
DataAnalysts: "Build Reports on the default dataset"
Explanation: Data analysts need to create Power BI reports using semantic models. This permission allows them to interact with the necessary datasets without granting access to perform data transformations.
DataScientists: "Read All Apache Spark"
Explanation: Data scientists will use Spark notebooks to access and analyze data. By granting "Read All Apache Spark," they can directly interact with the data in their Spark environment, which is optimized for their analysis needs.

upvoted 8 times

...

Naqib

5 months, 3 weeks ago

DataEngineers: ReadAll Apache Spark
DataAnalyst: Build Reports on the default dataset
DataScientist: ReadAll Apache Spark
The DS and DE need to have access to Apache Spark; otherwise, they won't be able to work on data transformation, curation, etc.

upvoted 3 times

...

Ahmadpbi

9 months, 1 week ago

According to the requirement ("The data engineers must be able to read from and write to the data store. No access must be granted to datasets or reports."), it would not be possible for them to build reports by using the default dataset.

upvoted 1 times

...

vish9

11 months, 1 week ago

Data Analysts: Build Reports
Data Scientists: Read All Apache Spark
The confusion is about the Data Engineers. They must be able to read from and write to the data store. The SQL endpoint is read-only. They should not have Build Reports. Hence, the remaining option is Read All Apache Spark, and in this question no one gets access to the SQL endpoint.

upvoted 10 times

...

2dc6125

11 months, 2 weeks ago

I think this is only part of the solution for the security requirements, and the data engineers cannot perform write actions with any of these permissions, so I think:
Data Engineers: Read SQL Endpoints
Data Analysts: Build Reports on the default dataset
Data Scientists: Read all Apache Spark

upvoted 7 times

...

stilferx

11 months, 3 weeks ago

IMHO: DEs - Spark, Analysts - Build Report, DS - Spark (as well). Why?
1. Here is the description of the options: https://blog.fabric.microsoft.com/en-us/blog/data-warehouse-sharing/
2. Here is the role description in the question itself:
The data engineers must be able to !!!read from and write!!! to the data store. No access must be granted to datasets or reports.
The data scientists must be able to read from the data store, but not write to it. They will access the data by using a !!! Spark notebook
The data analysts must have read access to only the dimensional model objects in the data store. They also must have access to create Power BI reports by !!! using the semantic models created by the analytics engineers.
The date dimension must be available to all users of the data store.

upvoted 14 times

...

rmeng

12 months ago

DataEngineers: ReadAll Apache Spark
DataAnalyst: Build Reports on the default dataset
DataScientist: ReadAll Apache Spark

upvoted 9 times

...

CertPeople

1 year, 1 month ago

"The data scientists must be able to read from the data store, but not write to it. They will access the data by using a Spark notebook" so for data scientists we have to give Read All Apache Spark

upvoted 5 times

...

a_51

1 year, 1 month ago

Feeding the question to ChatGPT, with the options labeled A, B, and C respectively, it says:
Data Engineers: Option B: Read All Apache Spark (allows reading data from Apache Spark notebooks); Option C: Read All SQL analytics endpoint data (allows reading data from the SQL analytics endpoint)
Data Scientists: Option B: Read All Apache Spark (allows reading data from Apache Spark notebooks)
Data Analysts: Option A: Build Reports on the default dataset (allows creating reports)

upvoted 2 times

dp600

1 year ago

stop using chatgpt as a source, it's not trustworthy

upvoted 11 times

...

thuss

1 year, 2 months ago

My take: the SQL Analytics Endpoint is read-only, so it's perfect for the DataScientist. DataEngineers need read and write, so Spark. The Analysts need the report capabilities obviously.

upvoted 3 times

momo1165

10 months, 1 week ago

SQL Endpoints do not support Spark Notebooks

upvoted 1 times

...

PazaBIandData

1 year ago

All of these are read-only. It confuses me

upvoted 4 times

...

David_Webb

1 year, 2 months ago

DataEngineers: ReadAll Apache Spark
DataAnalyst: Build Reports on the default dataset
DataScientist: ReadAll Apache Spark
Data engineers will use PySpark in notebooks to transform data from the Files folder in the lakehouse. Data analysts will build reports and dashboards from the prepared dataset. Data scientists will use MLlib in notebooks to build models. The "Read All SQL analytics endpoint data" permission should be for the AnalyticsEngineers. The analytics team has four types of members.

upvoted 10 times

metiii

1 year, 1 month ago

This is about access in AnalyticsPOC. Data scientists don't need to access Apache Spark in this workspace; they should only be able to read from the SQL endpoint. They will create Spark notebooks in their own workspace, and this question is not concerned with that workspace.

upvoted 1 times

c8f5bdf

11 months, 4 weeks ago

actually it says that data scientists will use notebooks so they need ReadAll Apache Spark

upvoted 1 times

...

wojciech_wie

1 year, 2 months ago

Data Engineers = ReadAll Apache Spark
Data Analyst = Build Reports
Data Scientist = ReadAll Apache Spark

upvoted 5 times

...

SamuComqi

1 year, 2 months ago

Data Engineers: Read all SQL Analytics Endpoint data (use SQL to explore and create/modify tables, views, stored procedures).
Data Analysts: Build reports on the default dataset (using Power BI).
Data Scientists: Read all Apache Spark (they will use notebooks to analyze data and apply ML models).

upvoted 4 times

...

Momoanwar

1 year, 2 months ago

Engineers = Spark
Analyst = Report
Scientist = Endpoint

upvoted 5 times

...
