Exam DP-600 topic 1 question 3 discussion

Actual exam question from Microsoft's DP-600
Question #: 3
Topic #: 1

Case study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.

To start the case study -
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.

Overview -
Contoso, Ltd. is a US-based health supplements company. Contoso has two divisions named Sales and Research. The Sales division contains two departments named Online Sales and Retail Sales. The Research division assigns internally developed product lines to individual teams of researchers and analysts.

Existing Environment -

Identity Environment -
Contoso has a Microsoft Entra tenant named contoso.com. The tenant contains two groups named ResearchReviewersGroup1 and ResearchReviewersGroup2.

Data Environment -
Contoso has the following data environment:
The Sales division uses a Microsoft Power BI Premium capacity.
The semantic model of the Online Sales department includes a fact table named Orders that uses Import mode. In the system of origin, the OrderID value represents the sequence in which orders are created.
The Research department uses an on-premises, third-party data warehousing product.
Fabric is enabled for contoso.com.
An Azure Data Lake Storage Gen2 storage account named storage1 contains Research division data for a product line named Productline1. The data is in the delta format.
A Data Lake Storage Gen2 storage account named storage2 contains Research division data for a product line named Productline2. The data is in the CSV format.

Requirements -

Planned Changes -
Contoso plans to make the following changes:
Enable support for Fabric in the Power BI Premium capacity used by the Sales division.
Make all the data for the Sales division and the Research division available in Fabric.
For the Research division, create two Fabric workspaces named Productline1ws and Productline2ws.
In Productline1ws, create a lakehouse named Lakehouse1.
In Lakehouse1, create a shortcut to storage1 named ResearchProduct.

Data Analytics Requirements -
Contoso identifies the following data analytics requirements:
All the workspaces for the Sales division and the Research division must support all Fabric experiences.
The Research division workspaces must use a dedicated, on-demand capacity that has per-minute billing.
The Research division workspaces must be grouped together logically to support OneLake data hub filtering based on the department name.
For the Research division workspaces, the members of ResearchReviewersGroup1 must be able to read lakehouse and warehouse data and shortcuts by using SQL endpoints.
For the Research division workspaces, the members of ResearchReviewersGroup2 must be able to read lakehouse data by using Lakehouse explorer.
All the semantic models and reports for the Research division must use version control that supports branching.

Data Preparation Requirements -
Contoso identifies the following data preparation requirements:
The Research division data for Productline1 must be retrieved from Lakehouse1 by using Fabric notebooks.
All the Research division data in the lakehouses must be presented as managed tables in Lakehouse explorer.

Semantic Model Requirements -
Contoso identifies the following requirements for implementing and managing semantic models:
The number of rows added to the Orders table during refreshes must be minimized.
The semantic models in the Research division workspaces must use Direct Lake mode.

General Requirements -
Contoso identifies the following high-level requirements that must be considered for all solutions:
Follow the principle of least privilege when applicable.
Minimize implementation and maintenance effort when possible.

You need to refresh the Orders table of the Online Sales department. The solution must meet the semantic model requirements.
What should you include in the solution?

  • A. an Azure Data Factory pipeline that executes a Stored procedure activity to retrieve the maximum value of the OrderID column in the destination lakehouse
  • B. an Azure Data Factory pipeline that executes a Stored procedure activity to retrieve the minimum value of the OrderID column in the destination lakehouse
  • C. an Azure Data Factory pipeline that executes a dataflow to retrieve the minimum value of the OrderID column in the destination lakehouse
  • D. an Azure Data Factory pipeline that executes a dataflow to retrieve the maximum value of the OrderID column in the destination lakehouse
Suggested Answer: D 🗳️
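
The incremental-load idea behind the suggested answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Fabric dataflow itself: the row shape (`{"OrderID": int}`) and the helper names `max_order_id` and `incremental_load` are hypothetical, and in-memory lists stand in for the source system and the destination lakehouse table.

```python
# Hypothetical sketch: lists stand in for the source system and the
# destination lakehouse table; rows are dicts with an "OrderID" key.

def max_order_id(destination_rows, default=0):
    """Return the highest OrderID already loaded (the watermark)."""
    return max((row["OrderID"] for row in destination_rows), default=default)

def incremental_load(source_rows, destination_rows):
    """Append only the source rows whose OrderID exceeds the watermark."""
    watermark = max_order_id(destination_rows)
    new_rows = [row for row in source_rows if row["OrderID"] > watermark]
    destination_rows.extend(new_rows)
    return new_rows

source = [{"OrderID": i} for i in range(1, 11)]      # orders 1..10 exist at the source
destination = [{"OrderID": i} for i in range(1, 8)]  # orders 1..7 were already loaded

added = incremental_load(source, destination)
print([row["OrderID"] for row in added])  # [8, 9, 10]
```

Because OrderID reflects the order in which rows are created, filtering on the destination's maximum value adds only the rows that are not yet loaded.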

Comments

theseon
Highly Voted 1 year, 2 months ago
Selected Answer: D
We need to retrieve the maximum OrderID in the destination table to minimize the number of rows added during refresh. This would be an incremental load, which can be done with dataflows.
upvoted 26 times
AsitTrivedi
1 year ago
https://learn.microsoft.com/en-au/fabric/data-factory/tutorial-setup-incremental-refresh-with-dataflows-gen2
upvoted 5 times
...
sraakesh95
1 year, 2 months ago
Totally agree on the max value to be retrieved on incremental load
upvoted 3 times
...
...
Jons123son
Highly Voted 11 months ago
Selected Answer: D
D - As other people pointed out, the exact same use case for retrieving the max OrderID is showcased in the documentation: https://learn.microsoft.com/en-us/fabric/data-factory/tutorial-setup-incremental-refresh-with-dataflows-gen2#add-a-query-to-the-dataflow-to-filter-the-data-based-on-the-data-destination I thought at first that A would be correct because stored procedures support least privilege and because real incremental refresh is not yet supported in Dataflow Gen2: https://ideas.fabric.microsoft.com/ideas/idea/?ideaid=4814b098-efff-ed11-a81c-6045bdb98602
upvoted 9 times
...
Egocentric
Most Recent 3 months, 2 weeks ago
Selected Answer: D
Key words: minimize maintenance effort. The answer is D.
upvoted 2 times
...
NRezgui
4 months ago
Selected Answer: D
an Azure Data Factory pipeline that executes a dataflow to retrieve the maximum value of the OrderID column in the destination lakehouse
upvoted 1 times
...
Rakesh16
5 months, 2 weeks ago
Selected Answer: D
an Azure Data Factory pipeline that executes a dataflow to retrieve the maximum value of the OrderID column in the destination lakehouse https://learn.microsoft.com/en-au/fabric/data-factory/tutorial-setup-incremental-refresh-with-dataflows-gen2
upvoted 1 times
...
Naqib
5 months, 3 weeks ago
Both a dataflow and a stored procedure should work, shouldn't they? This question is a bit confusing.
upvoted 1 times
...
semauni
5 months, 4 weeks ago
Selected Answer: D
I'm also choosing D alongside the other answers. My reasoning is: 1) The showcased example of doing incremental refresh by dataflows (see the link below), which is almost an answer in itself because it tells you how Microsoft views the solution to this issue. 2) Maximum ID instead of minimum: see the same link for the specific use. But even without this knowledge you can read in the case study that new (higher) numbers represent newer orders, so for an incremental refresh it makes way more sense to retrieve the ID of the *latest* order placed than the ID of the first. 3) dataflow instead of stored procedure: because of the link, but it also makes sense from the "minimize implementation and maintenance effort" requirement: writing an incremental refresh SP is very, very complicated. Link: https://learn.microsoft.com/en-au/fabric/data-factory/tutorial-setup-incremental-refresh-with-dataflows-gen2
upvoted 1 times
...
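The point about maximum versus minimum can be checked with a small Python comparison. This is a hypothetical sketch: the helper `rows_to_add` and the sample ID ranges are made up for illustration and do not come from the case study or the Microsoft tutorial.

```python
# Hypothetical comparison of the two watermark choices named in the answers.

def rows_to_add(source_ids, destination_ids, pick):
    """Return the source IDs above the watermark chosen by `pick` (max or min)."""
    watermark = pick(destination_ids)
    return [order_id for order_id in source_ids if order_id > watermark]

source_ids = list(range(1, 11))      # orders 1..10 exist at the source
destination_ids = list(range(1, 8))  # orders 1..7 were already loaded

with_max = rows_to_add(source_ids, destination_ids, max)
with_min = rows_to_add(source_ids, destination_ids, min)

print(len(with_max))  # 3 rows: only the new orders 8, 9, 10
print(len(with_min))  # 9 rows: orders 2..10, six of which are already loaded
```

With the minimum as the watermark, almost the whole table is reloaded, which conflicts with the requirement to minimize the number of rows added during refreshes.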
Egocentric
6 months, 1 week ago
D is the answer, although A could also be correct.
upvoted 1 times
...
AzurePart
7 months ago
D https://learn.microsoft.com/en-au/fabric/data-factory/tutorial-setup-incremental-refresh-with-dataflows-gen2 -> "You now have a query that returns the maximum OrderID in the lakehouse. This query is used to filter the data from the OData source. The next section adds a query to the dataflow to filter the data from the OData source based on the maximum OrderID in the lakehouse." Don't ask why. The question is about Fabric, so find the answer in the documentation. Is this the first time you've seen a test in your life?
upvoted 1 times
...
LasAnsias
7 months, 1 week ago
Selected Answer: A
Azure Data Factory "pipelines" are different from Azure Data Factory "Data Flows". All the options direct us to use Azure Data Factory "pipelines", so it should be using a stored procedure.
upvoted 2 times
semauni
5 months, 4 weeks ago
How does this piece of information impact your answer? A pipeline is just the trigger for an activity to happen, which can be either a Stored Procedure activity or a dataflow activity. What limitation do pipelines have for dataflows in this regard?
upvoted 1 times
...
...
sepiida
7 months, 1 week ago
Selected Answer: A
we need to retrieve the maximum OrderID in the destination table to minimize the number of rows added during refresh. This can be achieved with both the dataflow and a stored procedure. It mentions that "All the semantic models and reports for the Research division must use version control that supports branching." Dataflows are not supported in the git integration. Hence I choose A as the answer.
upvoted 1 times
Nefirs
1 year ago
only semantic model and reports must use version control. However, dataflows are not mentioned, therefore, irrelevant whether supported or not.
upvoted 2 times
...
...
nyoike
8 months, 3 weeks ago
Selected Answer: D
I was initially leaning toward A but got really confused when I read the choices again. Using Fabric Data Factory (which one would presume is what they mean in a Fabric exam), when you use a Stored Procedure activity, you only see warehouses and other SQL sources and not lakehouses. Using Azure Data Factory, one could add an Azure SQL DB linked service, connect to the SQL endpoint of a lakehouse, and execute a stored procedure associated with that SQL endpoint. Even for Fabric pipelines, one could use an Azure SQL Database connection (instead of Lakehouse), connect to the SQL endpoint of a lakehouse, and execute a stored procedure associated with that SQL endpoint. This, I believe, is the most efficient way to do it. The issue I have with D is that dataflows require significant resources to spin up and execute. The good thing about D is that it avoids the ambiguity mentioned above. It might not be the most efficient option, but without more detail in the choices, I painfully chose it.
upvoted 2 times
semauni
5 months, 4 weeks ago
And that's how you pass Microsoft exams, by (painfully) choosing the Microsoft way :) It's never 100% about what is 'the best' answer according to you or the community, it's about the answer that Microsoft will count as right. In all seriousness: the general requirements do state that "implementation and maintenance effort" should be minimized when possible. Writing a stored procedure to incrementally load a table gets very complex very fast.
upvoted 1 times
...
...
agente232
10 months, 1 week ago
I asked ChatGPT and got this: Based on the requirements for the semantic model of the Online Sales department, the best solution to refresh the Orders table would be to include an Azure Data Factory pipeline that executes a Stored procedure activity to retrieve the maximum value of the OrderID column in the destination lakehouse. This approach ensures that only new orders are processed, maintaining the sequence and integrity of the OrderID values as per the system of origin. Therefore, the correct answer is: A. an Azure Data Factory pipeline that executes a Stored procedure activity to retrieve the maximum value of the OrderID column in the destination lakehouse.
upvoted 1 times
...
ca63a55
10 months, 4 weeks ago
IMHO, it has to be done with a dataflow (D) because the semantic model uses Import mode, so I think it doesn't support a stored procedure (SQL).
upvoted 3 times
...
Fer079
11 months, 2 weeks ago
Selected Answer: A
Here the key words are "to retrieve", so if you run a pipeline to execute something to retrieve a value, then it should be a stored procedure using the Lookup activity. This is the most effective way to do it. The answers are not telling you the entire process to insert the new data (which could be done with a dataflow); rather, they are telling you what activity to use in the pipeline to retrieve the maximum value of the OrderID. At least this is what I understood.
upvoted 3 times
...