Exam DP-203 topic 2 question 20 discussion

Actual exam question from Microsoft's DP-203

Question #: 20
Topic #: 2

DRAG DROP -
You need to create an Azure Data Factory pipeline to process data for the following three departments at your company: Ecommerce, retail, and wholesale. The solution must ensure that data can also be processed for the entire company.
How should you complete the Data Factory data flow script? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:

Suggested Answer:

The conditional split transformation routes data rows to different streams based on matching conditions. The conditional split transformation is similar to a CASE decision structure in a programming language. The transformation evaluates expressions, and based on the results, directs the data row to the specified stream.
Box 1: dept=='ecommerce', dept=='retail', dept=='wholesale'
First, list the conditions. Their order must match the order of the stream labels defined in Box 3.
Syntax:
<incomingStream>
split(
<conditionalExpression1>
<conditionalExpression2>
...
disjoint: {true | false}
) ~> <splitTx>@(stream1, stream2, ..., <defaultStream>)

Box 2: disjoint: false
disjoint is false because each row goes to the first matching condition rather than to all matching conditions. Rows that match none of the three conditions go to the default output stream, all.
Box 3: ecommerce, retail, wholesale, all
Label the streams.
Reference:
https://docs.microsoft.com/en-us/azure/data-factory/data-flow-conditional-split
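
The first-match routing described in the suggested answer can be sketched in Python. This is only a simulation of the semantics, not Data Factory code: the function name `split_first_match`, the sample rows, and the "corporate" department are assumptions made up for illustration.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
Condition = Callable[[Row], bool]


def split_first_match(
    rows: List[Row],
    conditions: List[Condition],
    stream_names: List[str],
) -> Dict[str, List[Row]]:
    """Simulate a conditional split with disjoint: false (hypothetical helper).

    stream_names holds one name per condition plus a final default stream.
    """
    if len(stream_names) != len(conditions) + 1:
        raise ValueError("expected one stream per condition plus a default")
    streams: Dict[str, List[Row]] = {name: [] for name in stream_names}
    default_stream = stream_names[-1]
    for row in rows:
        for condition, name in zip(conditions, stream_names):
            if condition(row):
                streams[name].append(row)
                break  # first match wins; later conditions are skipped
        else:
            # No condition matched, so the row falls through to the default.
            streams[default_stream].append(row)
    return streams


# Invented sample data; "corporate" is not part of the exam question.
rows = [
    {"id": "1", "dept": "ecommerce"},
    {"id": "2", "dept": "retail"},
    {"id": "3", "dept": "wholesale"},
    {"id": "4", "dept": "corporate"},
]
conditions = [
    lambda r: r["dept"] == "ecommerce",
    lambda r: r["dept"] == "retail",
    lambda r: r["dept"] == "wholesale",
]
result = split_first_match(
    rows, conditions, ["ecommerce", "retail", "wholesale", "all"]
)
for name, matched in result.items():
    print(name, [r["id"] for r in matched])
```

In this sketch the last stream only receives rows that match none of the three department conditions, which is what the default stream in the syntax above is for.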

by Alekx42 at June 2, 2021, 4:36 p.m.

Comments

Alekx42

Highly Voted 4 years, 1 month ago

I think "disjoint" should be True, so that data can be sent to all matching conditions. In this way the "all" output can get the data from every department, which ensures that "data can also be processed by the entire company".

upvoted 85 times

gssd4scoder

3 years, 8 months ago

Agree with you, disjoint = true

upvoted 3 times

...

DataSaM

2 years ago

Disagree, all is like an else

upvoted 3 times

...

MrityunjayPrabhat

2 years, 10 months ago

All is not defined in split so it has to be false. Refer https://docs.microsoft.com/en-us/azure/data-factory/data-flow-conditional-split#:~:text=CleanData%0A%20%20%20%20split(%0A%20%20%20%20%20%20%20%20year%20%3C%201960%2C%0A%09%20%20%20%20year%20%3E%201980%2C%0A%09%20%20%20%20disjoint%3A%20false%0A%20%20%20%20)%20~%3E%20SplitByYear%40(moviesBefore1960%2C%20moviesAfter1980%2C%20AllOtherMovies)

upvoted 6 times

kkk5566

1 year, 10 months ago

disjoint is false because the data goes to the first matching condition rather than all matching conditions.

upvoted 2 times

...

jed_elhak

3 years, 9 months ago

Yes, it's true: disjoint=true

upvoted 4 times

...

...

mayank

Highly Voted 4 years, 1 month ago

As per the link provided in the explanation, disjoint: false looks correct. I believe you must go through the link https://docs.microsoft.com/en-us/azure/data-factory/data-flow-conditional-split and choose your answer for disjoint wisely. I will go with "False".

upvoted 47 times

dev2dev

3 years, 5 months ago

You also need to read the question to understand the requirement. I will choose disjoint: true.

upvoted 4 times

...

auwia

2 years ago

From the link you've posted: disjoint is false because the data goes to the first matching condition rather than all matching conditions. So the correct answer is True, considering we have to "duplicate" records for the ALL category.

upvoted 2 times

...

f2a9aa5

Most Recent 11 months, 2 weeks ago

Guys, Guys, Guys..... The clue is in the options given. If default: true was the right answer, then options A and F would be the same. Either of them could have fulfilled the criteria. If only one of them is right (and that is what we expect) then the order matters. And order only matters when default: false. This also means that 'all' is slightly misleading. Refer back to the question: it does not imply all of the data needs to be available, just that the 'entire company can process'. Which is still okay if the 'all' had everything but ecommerce, retail and wholesale. Final point: If default: true was the right answer, options B and C would be the same. Either of them could have worked. Conclusion: default: false.

upvoted 2 times

f2a9aa5

11 months, 2 weeks ago

typo: replace default by disjoint

upvoted 1 times

...

evangelist

11 months, 4 weeks ago

disjoint: true: If set to true, the row will be sent to all matching conditions. This means that a single row can appear in multiple output streams if it matches multiple conditions. disjoint: false: If set to false, the row will be sent to the first matching condition only. Once a row matches a condition, it will not be evaluated against subsequent conditions.
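
The two settings described above only differ when a row can match more than one condition. A minimal Python sketch of that difference follows, assuming rows that match nothing go to the default stream in both modes. The helper `conditional_split`, the order amounts, and the stream names are all hypothetical and only model the behavior.

```python
from typing import Callable, Dict, List


def conditional_split(
    values: List[int],
    conditions: List[Callable[[int], bool]],
    stream_names: List[str],
    disjoint: bool,
) -> Dict[str, List[int]]:
    """Model of the split (hypothetical helper, not a Data Factory API).

    disjoint=False: a value goes to the first matching stream only.
    disjoint=True: a value goes to every matching stream.
    Values matching no condition go to the last (default) stream.
    """
    streams: Dict[str, List[int]] = {name: [] for name in stream_names}
    default_stream = stream_names[-1]
    for value in values:
        matched = [
            name
            for cond, name in zip(conditions, stream_names)
            if cond(value)
        ]
        if not matched:
            targets = [default_stream]
        elif disjoint:
            targets = matched
        else:
            targets = matched[:1]
        for name in targets:
            streams[name].append(value)
    return streams


# Invented order amounts with deliberately overlapping conditions.
amounts = [1500, 250, 40]
conditions = [lambda a: a >= 1000, lambda a: a >= 100]
names = ["large", "medium_or_larger", "small"]

first = conditional_split(amounts, conditions, names, disjoint=False)
every = conditional_split(amounts, conditions, names, disjoint=True)
print("disjoint: false ->", first)
print("disjoint: true  ->", every)
```

Under this model, 1500 lands in both `large` and `medium_or_larger` only when disjoint is true. When the conditions cannot overlap, as with equality checks on a single column, both settings route every value the same way in this sketch.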

upvoted 1 times

...

alphilla

1 year, 6 months ago

Guys, disjoint is True 110% and I will tell you why. disjoint: false means that rows will be directed to the first branch whose condition is satisfied, and subsequent conditions are ignored. This might not fulfill the requirement because you want to process data for multiple departments, and with disjoint: false, a row would go to the first department branch it satisfies, ignoring the other departments. Disjoint TRUE is more appropriate because it fulfills the requirement of processing data for individual departments (Ecommerce, retail, and wholesale) while also handling data for the entire company. Because all rows will match two conditions: first, they will have one of the three departments; second, they will match the all condition. That's why it MUST BE TRUE.

upvoted 2 times

...

kkk5566

1 year, 10 months ago

False is correct

upvoted 2 times

...

orionduo

2 years ago

I think disjoint should be 'False'. By setting "disjoint true" for activities in a pipeline, you are essentially indicating that these activities are independent and can be executed concurrently. This can help improve the overall performance and efficiency of the pipeline by allowing for parallel execution of activities that do not have any interdependencies.

upvoted 1 times

...

bakamon

2 years, 1 month ago

CleanData split(dept=='ecommerce', dept=='retail', dept=='wholesale', disjoint: false) ~> SplitByDept@(ecommerce, retail, wholesale, all) This will split the data by department and allow for processing of data for the entire company as well as for individual departments.

upvoted 2 times

bakamon

2 years, 1 month ago

The disjoint option in a split transformation determines whether the output streams are mutually exclusive or not. If disjoint is set to true, then each row of data can only be sent to one output stream. If disjoint is set to false, then a single row of data can be sent to multiple output streams. In this case, setting disjoint to false allows for data to be processed for the entire company as well as for individual departments. This means that a single row of data can be sent to multiple output streams, allowing for processing at both the department and company level.

upvoted 2 times

...

markpumc

2 years, 3 months ago

disjoint = true if you want all; if disjoint = false, nothing goes to the ALL split

upvoted 4 times

...

DPMishra

2 years, 5 months ago

Disjoint=False

upvoted 1 times

...

DindaS

2 years, 5 months ago

disjoint=false. The example below is a conditional split transformation named SplitByYear that takes in incoming stream CleanData. This transformation has two split conditions, year < 1960 and year > 1980. disjoint is false because the data goes to the first matching condition rather than all matching conditions. Every row matching the first condition goes to output stream moviesBefore1960. All remaining rows matching the second condition go to output stream moviesAfter1980. All other rows flow through the default stream AllOtherMovies. From https://learn.microsoft.com/en-us/azure/data-factory/data-flow-conditional-split
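
Read as a CASE-style decision, the SplitByYear routing quoted above is roughly an if / else-if chain. The sketch below is plain Python written under that assumption; `route_movie` is a hypothetical name and only the stream names and the two conditions come from the quoted example.

```python
def route_movie(year: int) -> str:
    """Return the output stream a row would reach with disjoint: false."""
    if year < 1960:
        return "moviesBefore1960"
    elif year > 1980:
        return "moviesAfter1980"
    # Neither condition matched: the row takes the default stream.
    return "AllOtherMovies"


# Invented sample years, including both boundary values.
for year in (1955, 1960, 1975, 1980, 1994):
    print(year, route_movie(year))
```

Note that 1960 and 1980 themselves satisfy neither strict comparison, so in this sketch they end up in AllOtherMovies.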

upvoted 4 times

...

nadahef

2 years, 6 months ago

Given answer correct

upvoted 2 times

...

Maddhy

2 years, 7 months ago

The given answer is 100000% correct; don't get confused by the others

upvoted 2 times

...

Aslam208

2 years, 10 months ago

Given answer is 100% correct

upvoted 6 times

...

kiranSargar

3 years ago

Everyone is discussing disjoint. But if disjoint is true, then no ordering of ecommerce, retail, wholesale, all is required, so we can fill the 1st option with 2 or 3 and the 3rd option with 1 or 6.

upvoted 2 times

...

nefarious_smalls

3 years, 2 months ago

I think disjoint should be True based on Microsoft's example. It states that when disjoint is false, each row will only go to the first matching condition. However, in this question I believe each row will go to its matching department plus an aggregate stream that takes in every value regardless. Hence disjoint should be true.

upvoted 1 times

...

Andushi

3 years, 2 months ago

Definitely disjoint=true, as per the Microsoft doc

upvoted 2 times

...
