Exam DP-700 topic 1 question 37 discussion

Actual exam question from Microsoft's DP-700
Question #: 37
Topic #: 1

Case Study -

This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.

To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.

At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.


To start the case study -
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.


Overview -

Litware, Inc. is a publishing company that has an online bookstore and several retail bookstores worldwide. Litware also manages an online advertising business for the authors it represents.

Existing Environment. Fabric Environment

Litware has a Fabric workspace named Workspace1. High concurrency is enabled for Workspace1.

The company has a data engineering team that uses Python for data processing.

Existing Environment. Data Processing

The retail bookstores send sales data at the end of each business day, while the online bookstore constantly provides logs and sales data to a central enterprise resource planning (ERP) system.

Litware implements a medallion architecture by using the following three layers: bronze, silver, and gold. The sales data is ingested from the ERP system as Parquet files that land in the Files folder in a lakehouse. Notebooks are used to transform the files in a Delta table for the bronze and silver layers. The gold layer is in a warehouse that has V-Order disabled.

Litware has image files of book covers in Azure Blob Storage. The files are loaded into the Files folder.

Existing Environment. Sales Data

Month-end sales data is processed on the first calendar day of each month. Data that is older than one month never changes.

In the source system, the sales data refreshes every six hours starting at midnight each day.
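The refresh cadence above can be worked out as a short Python sketch. It assumes the six-hour interval divides the day evenly from midnight (00:00, 06:00, 12:00, 18:00); the function names `refresh_times` and `latest_refresh` are hypothetical and do not come from the case study.

```python
from datetime import datetime, timedelta

# Assumption: the source refreshes every six hours, starting at midnight.
REFRESH_INTERVAL = timedelta(hours=6)


def refresh_times(day: datetime) -> list[datetime]:
    """Return the four source refresh times for the calendar day of `day`."""
    midnight = day.replace(hour=0, minute=0, second=0, microsecond=0)
    return [midnight + i * REFRESH_INTERVAL for i in range(4)]


def latest_refresh(now: datetime) -> datetime:
    """Return the most recent source refresh at or before `now`."""
    return max(t for t in refresh_times(now) if t <= now)
```

For example, a sales user who arrives at 09:30 would expect to see data from the 06:00 refresh at the latest, which is one way to frame the "not up-to-date in the morning" complaint described later in the case study.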

The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new and historical data is captured. The dataflow captures the following fields of the source:

• Sales Date
• Author
• Price
• Units
• SKU

A table named AuthorSales stores the sales data that relates to each author. The table contains a column named AuthorEmail. Authors authenticate to a guest Fabric tenant by using their email address.

Existing Environment. Security Groups

Litware has the following security groups:

• Sales
• Fabric Admins
• Streaming Admins

Existing Environment. Performance Issues

Business users perform ad-hoc queries against the warehouse. The business users indicate that reports against the warehouse sometimes run for two hours and fail to load as expected. Upon further investigation, the data engineering team receives the following error message when the reports fail to load: “The SQL query failed while running.”

The data engineering team wants to debug the issue and find queries that cause more than one failure.
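As a rough illustration of that debugging goal, the sketch below counts failures per query over an in-memory query history and keeps only queries that failed more than once. The record layout (`query_hash`, `status`) and the `"Failed"` status value are assumptions made for the example, not a confirmed schema; in practice the history would come from whatever query log the warehouse exposes.

```python
from collections import Counter


def repeated_failures(history, min_failures=2):
    """Return {query_hash: failure_count} for queries that failed at least
    `min_failures` times. `history` is an iterable of dicts with the
    hypothetical keys 'query_hash' and 'status'."""
    failures = Counter(
        row["query_hash"] for row in history if row["status"] == "Failed"
    )
    return {q: n for q, n in failures.items() if n >= min_failures}


# Hypothetical sample data for illustration only.
sample_history = [
    {"query_hash": "q1", "status": "Failed"},
    {"query_hash": "q1", "status": "Failed"},
    {"query_hash": "q2", "status": "Failed"},
    {"query_hash": "q3", "status": "Succeeded"},
]
```

With the sample data, only `q1` is returned, because it is the only query with more than one failure.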

When the authors have new book releases, there is often an increase in sales activity. This increase slows the data ingestion process.

The company’s sales team reports that during the last month, the sales data has NOT been up-to-date when they arrive at work in the morning.


Requirements. Planned Changes -

Litware recently signed a contract to receive book reviews. The provider of the reviews exposes the data in Amazon Simple Storage Service (Amazon S3) buckets.

Litware plans to manage Search Engine Optimization (SEO) for the authors. The SEO data will be streamed from a REST API.


Requirements. Version Control -

Litware plans to implement a version control solution in Fabric that will use GitHub integration and follow the principle of least privilege.

Requirements. Governance Requirements

To control data platform costs, the data platform must use only Fabric services and items. Additional Azure resources must NOT be provisioned.


Requirements. Data Requirements -

Litware identifies the following data requirements:

• Process the SEO data in near-real-time (NRT).
• Make the book reviews available in the lakehouse without making a copy of the data.
• When a new book cover image arrives in the Files folder, process the image as soon as possible.


You need to ensure that processes for the bronze and silver layers run in isolation.

How should you configure the Apache Spark settings?

  • A. Disable high concurrency.
  • B. Create a custom pool.
  • C. Modify the number of executors.
  • D. Set the default environment.
Suggested Answer: B

Comments

12a2ecc
5 days, 20 hours ago
Selected Answer: B
In Microsoft Fabric, when you want to isolate different workloads, such as data processing jobs for different medallion layers (bronze, silver, gold), the best practice is to use custom Spark pools. This allows:
• Isolation of resources: ensuring the bronze and silver layers don’t interfere with each other in terms of compute or memory.
• Dedicated execution environments: each pool can be optimized/configured differently based on the workload characteristics.
• Better performance and reliability: avoiding contention in high-load scenarios like month-end processing or spikes during new book releases.
upvoted 1 times
...
Biju1
1 week, 2 days ago
Selected Answer: B
B is correct
upvoted 1 times
...
abdulbasit170
1 week, 2 days ago
Selected Answer: B
B is the correct answer as they want complete isolation of the job runs
upvoted 2 times
...
abdulbasit170
1 week, 2 days ago
Selected Answer: A
A: is the correct answer.
upvoted 1 times
...
5e89616
1 week, 5 days ago
Selected Answer: A
The requirement is to run notebook sessions for Bronze and Silver in isolation, so disable high concurrency. High concurrency mode allows users to share the same Spark sessions in Apache Spark for Fabric data engineering and data science workloads. An item like a notebook uses a Spark session for its execution, and when high concurrency is enabled, users can share a single Spark session across multiple notebooks. See https://learn.microsoft.com/en-us/fabric/data-engineering/workspace-admin-settings#high-concurrency
upvoted 2 times
12a2ecc
5 days, 20 hours ago
I think it is (B), custom pools. Why not the other options?
• A. Disable high concurrency: Not suitable here. High concurrency mode allows multiple lightweight queries to run simultaneously. Disabling it doesn't help with workload isolation and might reduce performance for concurrent jobs.
• C. Modify the number of executors: This affects how Spark parallelizes tasks within a single job but doesn’t isolate workloads or prevent resource contention between different jobs.
• D. Set the default environment: This option just sets where notebooks run by default. It doesn't provide isolation between different layers.
upvoted 1 times
...
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted