Exam AWS Certified Machine Learning - Specialty topic 1 question 61 discussion

A Machine Learning Specialist is working with a large cybersecurity company that manages security events in real time for companies around the world. The cybersecurity company wants to design a solution that will allow it to use machine learning to score malicious events as anomalies on the data as it is being ingested. The company also wants to be able to save the results in its data lake for later processing and analysis.
What is the MOST efficient way to accomplish these tasks?

  • A. Ingest the data using Amazon Kinesis Data Firehose, and use Amazon Kinesis Data Analytics Random Cut Forest (RCF) for anomaly detection. Then use Kinesis Data Firehose to stream the results to Amazon S3.
  • B. Ingest the data into Apache Spark Streaming using Amazon EMR, and use Spark MLlib with k-means to perform anomaly detection. Then store the results in an Apache Hadoop Distributed File System (HDFS) using Amazon EMR with a replication factor of three as the data lake.
  • C. Ingest the data and store it in Amazon S3. Use AWS Batch along with the AWS Deep Learning AMIs to train a k-means model using TensorFlow on the data in Amazon S3.
  • D. Ingest the data and store it in Amazon S3. Have an AWS Glue job that is triggered on demand transform the new data. Then use the built-in Random Cut Forest (RCF) model within Amazon SageMaker to detect anomalies in the data.
Suggested Answer: A

Comments

DonaldCMLIN
Highly Voted 3 years, 1 month ago
I WOULD LIKE TO CHOOSE ANSWER A. https://aws.amazon.com/tw/blogs/machine-learning/use-the-built-in-amazon-sagemaker-random-cut-forest-algorithm-for-anomaly-detection/
upvoted 60 times
hamimelon
1 year, 10 months ago
Donald, do you know your CAPS LOCK has been on the whole time?
upvoted 15 times
Nadia0012
1 year, 7 months ago
I know why his caps lock has been on :D to enter the "I am not robot" code easier :D
upvoted 4 times
ccpmad
1 year, 2 months ago
yes, but it works with minus also...
upvoted 1 times
JayK
Highly Voted 3 years ago
Answer is A. In the exam, the word "anomaly" points to Random Cut Forest, and that can be done cost-effectively using Kinesis Data Analytics.
upvoted 15 times
Shakespeare
4 months, 1 week ago
I think it would have been more accurate if the options were Kinesis Data Streams -> Kinesis Data Analytics -> Kinesis Data Firehose -> S3
upvoted 2 times
saclim
Most Recent 6 months, 1 week ago
The question says REAL TIME events. Doesn't that eliminate Data Firehose, as it is technically NEAR real time, not real time like Data Streams? Random Cut Forest does seem like the best option for anomaly detection, though. I'm torn between A and B.
upvoted 1 times
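The "near real time" point above comes from the fact that Firehose buffers incoming records by size or by time interval before delivering them. A minimal sketch of that effect follows, assuming a simplified buffer that counts records instead of bytes; `delivery_delays`, `max_records`, and `max_age_s` are hypothetical names for illustration, not Firehose parameters.

```python
def delivery_delays(arrival_times, max_records, max_age_s):
    """Return, per record, the seconds it waits in the buffer before a flush.

    Assumption: a batch is flushed when it holds max_records records or when
    its oldest record has waited max_age_s seconds, whichever comes first.
    """
    delays = []
    batch = []  # arrival times of the records currently buffered
    for t in arrival_times:
        # Timer-based flush: the pending batch expired before this arrival.
        if batch and t >= batch[0] + max_age_s:
            flush_at = batch[0] + max_age_s
            delays.extend(flush_at - a for a in batch)
            batch = []
        batch.append(t)
        # Size-based flush: the batch is full as soon as this record lands.
        if len(batch) >= max_records:
            delays.extend(t - a for a in batch)
            batch = []
    # Anything still buffered is flushed when its timer runs out.
    if batch:
        flush_at = batch[0] + max_age_s
        delays.extend(flush_at - a for a in batch)
    return delays


# Low traffic: the timer dominates, so every record waits close to a minute.
print(delivery_delays([0, 1, 2, 3], max_records=100, max_age_s=60))
# High traffic relative to the batch size: records leave almost immediately.
print(delivery_delays([0, 1, 2], max_records=2, max_age_s=60))
```

Under these assumptions the delay is bounded by the buffer interval, which is why delivery to S3 is described as near real time even when the anomaly scoring itself runs on the stream.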
vkbajoria
6 months, 3 weeks ago
Selected Answer: A
Kinesis Firehose and Data Analytics with random cut forest should do it.
upvoted 1 times
phdykd
9 months, 2 weeks ago
A. Based on these considerations, Option A is the most efficient way to accomplish the tasks. It provides a seamless, real-time data ingestion and processing pipeline, leverages machine learning for anomaly detection, and efficiently stores data in a data lake, meeting all the key requirements of the cybersecurity company.
upvoted 1 times
elvin_ml_qayiran25091992razor
11 months, 2 weeks ago
Selected Answer: A
ONLY A
upvoted 1 times
sonoluminescence
12 months ago
Selected Answer: A
B is not as efficient for real-time processing and storing results as using Kinesis services.
upvoted 2 times
DimLam
12 months ago
Selected Answer: B
At least B is a possible solution, but A will not work, as KDF doesn't support KDA as a destination service (see https://docs.aws.amazon.com/firehose/latest/dev/create-name.html ). In my opinion, KDF should always be the last Kinesis service in a streaming pipeline.
upvoted 1 times
Dun6
11 months, 1 week ago
KDF does support KDA as destination
upvoted 1 times
AmeeraM
1 year ago
Selected Answer: A
A has all the required steps
upvoted 1 times
loict
1 year, 1 month ago
Selected Answer: A
A. YES - Firehose can pipe into KDA, and KDA supports RCF
B. NO - RCF is best for anomaly detection
C. NO - no need for intermediary S3 storage
D. NO - no need for intermediary S3 storage
upvoted 1 times
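To give a feel for why random cuts work for anomaly detection, here is a minimal sketch of the underlying idea, not the Kinesis Data Analytics or SageMaker implementation. It assumes a simplified score (the average number of random cuts needed to isolate a point), whereas the production algorithm uses a more refined scoring rule; `isolation_depth` and `mean_isolation_depth` are hypothetical helper names.

```python
import random


def isolation_depth(points, target, rng, max_depth=50):
    """Count the random cuts needed to separate target from points."""
    depth = 0
    current = [p for p in points if p != target]
    dims = len(target)
    while current and depth < max_depth:
        # Bounding box of the remaining points together with the target.
        lows = [min(min(p[d] for p in current), target[d]) for d in range(dims)]
        highs = [max(max(p[d] for p in current), target[d]) for d in range(dims)]
        ranges = [h - l for l, h in zip(lows, highs)]
        # Pick the cut dimension with probability proportional to its range.
        d = rng.choices(range(dims), weights=ranges)[0]
        cut = rng.uniform(lows[d], highs[d])
        # Keep only the points that fall on the same side as the target.
        if target[d] <= cut:
            current = [p for p in current if p[d] <= cut]
        else:
            current = [p for p in current if p[d] > cut]
        depth += 1
    return depth


def mean_isolation_depth(points, target, n_trees=100, sample_size=64, seed=0):
    """Average isolation depth over several random subsamples ("trees")."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trees):
        sample = rng.sample(points, min(sample_size, len(points)))
        total += isolation_depth(sample, target, rng)
    return total / n_trees


# Toy data: a tight 8x8 grid of "normal" events plus one far-away event.
normal = [(x * 0.1, y * 0.1) for x in range(8) for y in range(8)]
outlier = (10.0, 10.0)
events = normal + [outlier]

outlier_depth = mean_isolation_depth(events, outlier)
inlier_depth = mean_isolation_depth(events, normal[27])
print(outlier_depth, inlier_depth)
```

A point that sits far from the rest is usually cut off after very few cuts, so a low average depth suggests an anomaly. Because each cut only needs the bounding box of a small sample, the approach suits streaming data.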
Mickey321
1 year, 1 month ago
Selected Answer: A
option A
upvoted 1 times
kaike_reis
1 year, 2 months ago
Selected Answer: A
A is correct. One tip for the exam: when you see data streaming, the solution should probably contain a Kinesis service. B is far too complex!
upvoted 3 times
nilmans
1 year, 4 months ago
Selected Answer: A
Makes sense to select A here.
upvoted 1 times
earthMover
1 year, 5 months ago
Selected Answer: A
I strongly believe A is the right answer. At a minimum there should be some justification provided for your answer.
upvoted 1 times
AjoseO
1 year, 8 months ago
Selected Answer: A
Amazon Kinesis Data Firehose is a fully managed service for streaming real-time data to Amazon S3 and can handle the ingestion of large amounts of data in real time. Kinesis Data Analytics is a fully managed service whose built-in Random Cut Forest (RCF) function can be used to perform anomaly detection on streaming data, making it well suited for this use case. The results of the anomaly detection can then be streamed to Amazon S3 using Kinesis Data Firehose, providing a scalable and cost-effective data lake for later processing and analysis.
upvoted 2 times
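The ingestion step described above could look roughly like the following sketch. It assumes the events are sent as newline-delimited JSON and that `client` behaves like a boto3 Firehose client (in practice, `boto3.client("firehose")`); the stream name `security-events`, the event fields, and the `FakeFirehose` stub are made up for illustration so the example runs without AWS credentials.

```python
import json


def send_event(client, stream_name, event):
    """Serialize one event as newline-delimited JSON and send it to Firehose."""
    data = (json.dumps(event) + "\n").encode("utf-8")
    # put_record takes the delivery stream name and a Record with a Data blob.
    return client.put_record(
        DeliveryStreamName=stream_name,
        Record={"Data": data},
    )


class FakeFirehose:
    """Hypothetical stub that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def put_record(self, DeliveryStreamName, Record):
        self.calls.append((DeliveryStreamName, Record))
        return {"RecordId": "fake-id"}


fake = FakeFirehose()
send_event(fake, "security-events", {"src_ip": "10.0.0.1", "bytes": 512})
print(fake.calls[0])
```

Passing the client in as a parameter keeps the function testable; swapping the stub for a real client should be the only change needed, assuming the delivery stream already exists.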
DimLam
12 months ago
The problem with A is that KDF doesn't support KDA as a destination service (see https://docs.aws.amazon.com/firehose/latest/dev/create-name.html ). In my opinion, KDF should always be the last Kinesis service in a streaming pipeline.
upvoted 1 times
OssamaAbdelatif
1 year, 11 months ago
I would select A
upvoted 1 times
ovokpus
2 years, 4 months ago
Selected Answer: A
B is too resource-intensive for that use case. I choose A, but I think the data would be better ingested using Kinesis Data Streams.
upvoted 3 times