Welcome to ExamTopics
ExamTopics Logo
- Expert Verified, Online, Free.
exam questions

Exam AWS Certified Data Engineer - Associate DEA-C01 All Questions

View all questions & answers for the AWS Certified Data Engineer - Associate DEA-C01 exam

Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 48 discussion

A data engineer is using Amazon Athena to analyze sales data that is in Amazon S3. The data engineer writes a query to retrieve sales amounts for 2023 for several products from a table named sales_data. However, the query does not return results for all of the products that are in the sales_data table. The data engineer needs to troubleshoot the query to resolve the issue.
The data engineer's original query is as follows:
SELECT product_name, sum(sales_amount)

FROM sales_data -

WHERE year = 2023 -

GROUP BY product_name -
How should the data engineer modify the Athena query to meet these requirements?

  • A. Replace sum(sales_amount) with count(*) for the aggregation.
  • B. Change WHERE year = 2023 to WHERE extract(year FROM sales_data) = 2023.
  • C. Add HAVING sum(sales_amount) > 0 after the GROUP BY clause.
  • D. Remove the GROUP BY clause.
Show Suggested Answer Hide Answer
Suggested Answer: B 🗳️

Comments

Chosen Answer:
This is a voting comment (?) , you can switch to a simple comment.
Switch to a voting comment New
GiorgioGss
Highly Voted 7 months, 4 weeks ago
Selected Answer: B
"SELECT product_name, sum(sales_amount) FROM sales_data WHERE extract(year FROM sales_date) = 2023 GROUP BY product_name;" A. This would change the query to count the number of rows instead of summing sales. C. This would filter out products with zero sales amounts. D. Removing the GROUP BY clause would result in a single sum of all sales amounts without grouping by product_name.
upvoted 10 times
...
pikuantne
Most Recent 2 weeks ago
None of these options make sense. I think the question is worded incorrectly. I understand that the problem is supposed to be: the products that did not have any sales in 2023 should also be visible in the report with sum of sales_amount = 0. So, the WHERE condition should be deleted and replaced with a CASE WHEN. That way all of the products in the table will be visible, but only sales for 2023 will be summed. Which is what I think this question is asking. None of the provided options do that.
upvoted 3 times
...
Shatheesh
1 month, 1 week ago
C, query in the question is correct you just need to get amounts grater than Zero
upvoted 2 times
...
valuedate
5 months, 3 weeks ago
Selected Answer: B
year should be the partition in s3 so its necessary to extract. its not a column
upvoted 3 times
...
VerRi
5 months, 4 weeks ago
Selected Answer: C
No need to extract the year again
upvoted 1 times
...
Just_Ninja
6 months ago
Selected Answer: C
https://docs.aws.amazon.com/kinesisanalytics/latest/sqlref/sql-reference-having-clause.html
upvoted 1 times
...
Snape
6 months, 2 weeks ago
Selected Answer: C
Wrong answers A. Replace sum(sales_amount) with count(*) for the aggregation. This option will return the count of records for each product, not the sum of sales amounts, which is the desired result. B. Change WHERE year = 2023 to WHERE extract(year FROM sales_data) = 2023. The year column likely stores the year value directly, so there's no need to extract it from a date or timestamp column. D. Remove the GROUP BY clause. Removing the GROUP BY clause will cause an error because the sum(sales_amount) aggregation function requires a GROUP BY clause to specify the grouping column (product_name in this case).
upvoted 1 times
...
khchan123
6 months, 2 weeks ago
B B. Change `WHERE year = 2023` to `WHERE extract(year FROM sales_data) = 2023`. The issue with the original query is that it assumes there is a column named `year` in the `sales_data` table. However, it's more likely that the date or timestamp information is stored in a single column, for example, a column named `sales_date`. To extract the year from a date or timestamp column, you need to use the `extract()` function in Athena SQL.
upvoted 2 times
...
chris_spencer
7 months ago
None of the answer makes senses. Option C will exclude any amount that is 0. This option would be correct if it is: Add HAVING sum(sales_amount) >= 0 after the GROUP BY clause.
upvoted 2 times
...
Christina666
7 months ago
Selected Answer: C
Gemini: C. Add HAVING sum(sales_amount) > 0 after the GROUP BY clause. Zero Sales Products: The original query is likely missing products that had zero sales amount in 2023. This modification filters the grouped results, ensuring only products with positive sales are displayed. Why Other Options Don't Address the Core Issue: A. Replace sum(sales_amount) with count(*) for the aggregation. This would show how many sales transactions a product had, but not if it generated any revenue. It wouldn't solve the issue of missing products. B. Change WHERE year = 2023 to WHERE extract(year FROM sales_data) = 2023. This is functionally equivalent to the original WHERE clause if the year column is already an integer type. It wouldn't fix missing products. D. Remove the GROUP BY clause. This would aggregate all sales for 2023 with no product breakdown, losing the granularity needed.
upvoted 2 times
...
kj07
8 months ago
Not A because the engineer wants a sum not the total count. Not C because it will filter out the data with sales_amount zero. Not D because it will return just one result and the engineer wants the sales for multiple products. B should be the right answer if the sales_data is a date field.
upvoted 4 times
DevoteamAnalytix
6 months, 1 week ago
But this is the table name...
upvoted 1 times
...
Felix_G
8 months ago
Add HAVING sum(sales_amount) > 0 this does NOT filter out the data with sales_mount zero. it can not be any negative value. The original query’s WHERE year = 2023 condition is already appropriate for filtering data by the year 2023, so that B is unnecessary.
upvoted 1 times
FuriouZ
7 months, 3 weeks ago
>0 filters out every product which is not sold. The question was about "some products are not displayed" so using the having argument can not be the right choice
upvoted 3 times
...
...
...
rralucard_
9 months, 2 weeks ago
Selected Answer: C
https://docs.aws.amazon.com/athena/latest/ug/select.html
upvoted 2 times
nyaopoko
7 months, 1 week ago
answer is C!
upvoted 1 times
...
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted
A voting comment increases the vote count for the chosen answer by one.

Upvoting a comment with a selected answer will also increase the vote count towards that answer by one. So if you see a comment that you already agree with, you can upvote it instead of posting a new comment.

SaveCancel
Loading ...