Exam Certified Data Engineer Associate topic 1 question 19 discussion

Actual exam question from Databricks's Certified Data Engineer Associate
Question #: 19
Topic #: 1

A data engineer runs a statement every day to copy the previous day’s sales into the table transactions. Each day’s sales are in their own file in the location "/transactions/raw".
Today, the data engineer runs the following command to complete this task:

After running the command today, the data engineer notices that the number of records in table transactions has not changed.
Which of the following describes why the statement might not have copied any new records into the table?

  • A. The format of the files to be copied was not included with the FORMAT_OPTIONS keyword.
  • B. The names of the files to be copied were not included with the FILES keyword.
  • C. The previous day’s file has already been copied into the table.
  • D. The PARQUET file format does not support COPY INTO.
  • E. The COPY INTO statement requires the table to be refreshed to view the copied rows.
Suggested Answer: C

Comments

806e7d2
3 days ago
Selected Answer: C
In Databricks, the COPY INTO command is designed to prevent duplicate data ingestion. When files are copied into a table, Databricks keeps track of the files that have already been processed using a file log. If a file has already been copied, subsequent runs of the COPY INTO command will skip that file to avoid duplication.
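The skip-already-loaded behavior described here can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not Databricks code: the `ToyTable` class, its `copy_into` method, and the file name are hypothetical, and the real command tracks loaded files internally rather than in a Python set.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy stand-in for a table that remembers which source files it has loaded."""
    rows: list = field(default_factory=list)
    loaded_files: set = field(default_factory=set)

    def copy_into(self, source_files: dict) -> int:
        """Load rows only from files not seen before; return the number of rows added."""
        added = 0
        for name, file_rows in sorted(source_files.items()):
            if name in self.loaded_files:
                continue  # already ingested on an earlier run, so skip it
            self.rows.extend(file_rows)
            self.loaded_files.add(name)
            added += len(file_rows)
        return added


# Hypothetical contents of the source directory: one file per day.
raw = {"2024-01-01.parquet": [("a", 10), ("b", 20)]}

transactions = ToyTable()
first_run = transactions.copy_into(raw)   # loads the two rows
second_run = transactions.copy_into(raw)  # same file again: nothing new to load
print(first_run, second_run, len(transactions.rows))
```

In this model the second run adds no rows because the only file in the directory is already in the log, which matches the situation in option C.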
upvoted 1 times
...
Nika12
9 months, 4 weeks ago
Selected Answer: C
Just got 100% on the test. C was correct.
upvoted 4 times
...
SerGrey
10 months, 3 weeks ago
Selected Answer: C
Correct answer is C
upvoted 2 times
...
Garyn
10 months, 3 weeks ago
Selected Answer: C
C. The previous day's file has already been copied into the table. The COPY INTO statement loads data from files in a location into a table and keeps track of which files it has already loaded. If the data engineer runs this statement daily and the number of records has not changed after today's execution, the most likely explanation is that the previous day's file was already loaded by an earlier run, so COPY INTO skipped it. Options A, B, D, and E don't explain why the statement might not have copied new records in this scenario.
upvoted 3 times
...
awofalus
1 year ago
Selected Answer: C
C is correct
upvoted 2 times
...
kishanu
1 year, 1 month ago
If the table "transactions" is an external table, then option E; if it's an internal (managed) table, C should suffice.
upvoted 1 times
...
DavidRou
1 year, 1 month ago
Selected Answer: C
The COPY INTO statement skips files that have already been loaded.
upvoted 1 times
...
KalavathiP
1 year, 1 month ago
Selected Answer: C
C is the correct answer.
upvoted 1 times
...
ezeik
1 year, 1 month ago
Selected Answer: E
E is the correct answer, because immediately after using COPY INTO you might query the cached version of the table.
upvoted 4 times
...
AndreFR
1 year, 3 months ago
Selected Answer: C
https://docs.databricks.com/en/ingestion/copy-into/index.html The COPY INTO SQL command lets you load data from a file location into a Delta table. This is a retriable and idempotent operation; files in the source location that have already been loaded are skipped. If there are no new records, the only consistent choice is C: no new files were loaded because the already-loaded files were skipped.
upvoted 1 times
...
Atnafu
1 year, 4 months ago
C. The COPY INTO statement copies the data from the specified files into the target table. If the previous day's file has already been copied into the table, then the COPY INTO statement will not copy any new records into the table.
upvoted 1 times
...
junction
1 year, 5 months ago
Selected Answer: C
COPY INTO loads data from a file location into a Delta table. This is a retriable and idempotent operation; files in the source location that have already been loaded are skipped.
upvoted 1 times
...
testdb
1 year, 6 months ago
Selected Answer: B
Answer: B. FILES = ('f1.json', 'f2.json', 'f3.json', 'f4.json', 'f5.json') https://docs.databricks.com/ingestion/copy-into/examples.html
upvoted 1 times
[Removed]
1 year, 6 months ago
The correct answer is C. Naming specific files with the FILES keyword is optional, as the COPY INTO syntax shows: [ FILES = ( file_name [, ...] ) | PATTERN = glob_pattern ]. When FILES is not used in the statement, all files in the directory are considered, and each one is loaded only once (because the operation is idempotent).
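To make the optional file selection concrete, here is a small plain-Python sketch. It is an illustration only: `select_files` is a hypothetical helper, not a Databricks API, and `fnmatch` globbing only approximates the glob patterns that PATTERN accepts.

```python
from fnmatch import fnmatch


def select_files(available, files=None, pattern=None):
    """Pick candidate source files: an explicit list (like FILES),
    a glob (like PATTERN), or every file when neither is given."""
    if files is not None and pattern is not None:
        raise ValueError("use either files or pattern, not both")
    if files is not None:
        wanted = set(files)
        return [name for name in available if name in wanted]
    if pattern is not None:
        return [name for name in available if fnmatch(name, pattern)]
    return list(available)


# Hypothetical directory listing.
available = ["f1.json", "f2.json", "notes.txt"]

everything = select_files(available)                     # no clause: all files
only_named = select_files(available, files=["f2.json"])  # explicit list
by_pattern = select_files(available, pattern="*.json")   # glob match
print(everything, only_named, by_pattern)
```

Leaving out both arguments selects the whole directory, so omitting FILES does not by itself explain why no records were copied.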
upvoted 2 times
...
...
Varma_Saraswathula
1 year, 7 months ago
C. https://docs.databricks.com/ingestion/copy-into/tutorial-notebook.html Because this action is idempotent, you can run it multiple times but data will only be loaded once.
upvoted 1 times
...
XiltroX
1 year, 7 months ago
Selected Answer: C
Option C is the correct answer.
upvoted 3 times
...
mimzzz
1 year, 7 months ago
I am not sure whether C is the correct answer, but A is definitely not right.
upvoted 1 times
...
sdas1
1 year, 7 months ago
option C
upvoted 1 times
...