Exam AWS Certified AI Practitioner AIF-C01 topic 1 question 6 discussion

A company uses Amazon SageMaker for its ML pipeline in a production environment. The company has large input data sizes of up to 1 GB and processing times of up to 1 hour. The company needs near real-time latency.
Which SageMaker inference option meets these requirements?

  • A. Real-time inference
  • B. Serverless inference
  • C. Asynchronous inference
  • D. Batch transform
Suggested Answer: C

Comments

jove
Highly Voted 3 months, 4 weeks ago
Selected Answer: C
  • Real-time inference: immediate responses for high-traffic, low-latency applications.
  • Asynchronous inference: near real-time for large payloads and longer processing.
  • Batch transform: large-scale, offline processing without real-time needs.
  • Serverless inference: low-latency inference for intermittent or unpredictable traffic without managing infrastructure.
upvoted 9 times
...
Amar949499
Most Recent 6 days, 4 hours ago
Selected Answer: C
Here the keyword is "near real-time latency." Asynchronous Inference queues incoming requests and processes them asynchronously. This option is ideal for requests with large payload sizes (up to 1 GB), long processing times (up to one hour), and near real-time latency requirements.
upvoted 1 times
...
Nopnov
1 week, 1 day ago
Selected Answer: C
Amazon SageMaker Asynchronous Inference is a capability in SageMaker AI that queues incoming requests and processes them asynchronously. This option is ideal for requests with large payload sizes (up to 1 GB), long processing times (up to one hour), and near real-time latency requirements. (A minimal deployment sketch follows this comment.)
upvoted 1 times
...
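To make the deployment concrete: below is a minimal sketch, assuming the SageMaker Python SDK, of deploying an asynchronous endpoint. The image URI, model artifact path, role ARN, and instance type are placeholder assumptions, not values from the question.

```python
# Minimal sketch of deploying a SageMaker asynchronous inference endpoint.
# All names and paths below are hypothetical placeholders.
from sagemaker.model import Model
from sagemaker.async_inference import AsyncInferenceConfig

model = Model(
    image_uri="<your-inference-image-uri>",     # placeholder container image
    model_data="s3://my-bucket/model.tar.gz",   # placeholder model artifact
    role="<your-sagemaker-execution-role-arn>",
)

# AsyncInferenceConfig writes results to S3 instead of returning them inline,
# which is what lets the endpoint accept payloads up to 1 GB and run for
# up to one hour per request.
async_config = AsyncInferenceConfig(
    output_path="s3://my-bucket/async-output/",  # where results are written
    max_concurrent_invocations_per_instance=4,
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",  # assumption; size to your model
    async_inference_config=async_config,
)
```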
JJwin
2 weeks, 3 days ago
Selected Answer: A
Real-time inference in Amazon SageMaker is designed for low-latency, high-throughput applications where predictions need to be made immediately after data is processed. Since the company requires near real-time latency for their ML pipeline and has processing times of up to 1 hour and input sizes up to 1 GB, real-time inference is the most suitable option. With real-time inference, you can deploy your trained models as an API endpoint and get predictions on demand, ensuring low latency. This is ideal for situations where you need immediate responses after submitting the data.
upvoted 1 times
...
Willdoit
2 weeks, 5 days ago
Selected Answer: A
The company requires near real-time latency, which means the model needs to respond quickly to inference requests. Real-time inference in Amazon SageMaker is designed for low-latency applications where predictions are needed in milliseconds to seconds. By contrast, C (asynchronous inference) is useful for large requests that take minutes or hours to process, but it is not real-time.
upvoted 2 times
...
Jessiii
2 weeks, 6 days ago
Selected Answer: C
Asynchronous inference in Amazon SageMaker is ideal when you have large input data sizes (like the 1 GB mentioned) and relatively long processing times (like up to 1 hour). While real-time inference typically offers lower latency, it may struggle with large datasets or complex models that require more processing time. In contrast, asynchronous inference can handle large inputs and longer processing times without needing immediate results. It processes the data and provides the results later, which might be acceptable if your requirement for near real-time latency can be slightly relaxed (for instance, if results can be retrieved within minutes rather than immediately).
upvoted 2 times
...
Moon
2 months ago
Selected Answer: C
C: Asynchronous inference. Explanation: Asynchronous inference in Amazon SageMaker is specifically designed to handle large payloads (up to 1 GB) and long processing times (up to 1 hour). It decouples request submission from processing, allowing the client to submit a request and receive a response later when the inference is complete. This makes it suitable for use cases where real-time responses are not strictly required, but near real-time results are needed.
upvoted 3 times
...
Aryan_10
2 months, 1 week ago
Selected Answer: C
Whenever you see "near real-time latency," think asynchronous inference.
upvoted 2 times
...
wmj
3 months ago
Selected Answer: C
C is right. Amazon SageMaker Asynchronous Inference is a capability in SageMaker that queues incoming requests and processes them asynchronously. This option is ideal for requests with large payload sizes (up to 1 GB), long processing times (up to one hour), and near real-time latency requirements. Asynchronous Inference enables you to save on costs by autoscaling the instance count to zero when there are no requests to process, so you only pay when your endpoint is processing requests. (A sketch of this scale-to-zero setup follows this comment.)
upvoted 3 times
...
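The scale-to-zero behavior described above is configured through Application Auto Scaling rather than on the endpoint itself. A hedged sketch follows, assuming boto3; the endpoint name, variant name, capacity limits, and target value are illustrative assumptions.

```python
# Sketch of scale-to-zero autoscaling for an async endpoint via
# Application Auto Scaling. Names and limits are placeholders.
import boto3

autoscaling = boto3.client("application-autoscaling")
resource_id = "endpoint/my-async-endpoint/variant/AllTraffic"  # hypothetical

# Async endpoints may set MinCapacity to 0, so instances are released
# when the request queue is empty and you pay nothing while idle.
autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=0,
    MaxCapacity=4,
)

# Scale on the backlog of queued requests rather than on live traffic.
autoscaling.put_scaling_policy(
    PolicyName="async-backlog-scaling",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 5.0,  # queued requests per instance; an assumption
        "CustomizedMetricSpecification": {
            "MetricName": "ApproximateBacklogSizePerInstance",
            "Namespace": "AWS/SageMaker",
            "Dimensions": [
                {"Name": "EndpointName", "Value": "my-async-endpoint"}
            ],
            "Statistic": "Average",
        },
        "ScaleInCooldown": 300,
        "ScaleOutCooldown": 60,
    },
)
```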
wangyang_0622
3 months ago
Selected Answer: A
I think answer A is the correct one as the customer wants to have real-time inference, right?
upvoted 2 times
...
cuzzindavid
3 months, 3 weeks ago
Key word "real-time latency"
upvoted 1 times
cuzzindavid
3 months, 3 weeks ago
After looking at this... yes, asynchronous is appropriate.
upvoted 2 times
...
...
sachin_koenig
4 months ago
Asynchronous inference: Amazon SageMaker Asynchronous Inference is a capability in SageMaker that queues incoming requests and processes them asynchronously. This option is ideal for requests with large payload sizes (up to 1 GB), long processing times (up to one hour), and near real-time latency requirements. Asynchronous Inference enables you to save on costs by autoscaling the instance count to zero when there are no requests to process, so you only pay when your endpoint is processing requests.
upvoted 3 times
...
galliaj
4 months ago
Amazon SageMaker Asynchronous Inference would be the appropriate option. Here's why (see the invocation sketch after this comment):
  • Handles large payloads: Asynchronous Inference is designed to handle large input payloads (up to 1 GB) that are typically not suited for real-time, low-latency processing.
  • Long processing times: it supports inference requests that can take minutes to an hour to complete, making it ideal for models that require significant processing time.
  • Near real-time response: while it does not provide millisecond-level latency like real-time endpoints, it offers a more scalable and efficient solution for near real-time use cases where the response time can range from seconds to minutes.
upvoted 2 times
...
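To make the "submit now, fetch results later" flow concrete, here is a minimal sketch of invoking an asynchronous endpoint with boto3. The endpoint name and S3 paths are placeholders; the key point is that the input is passed by S3 reference and the call returns immediately with the S3 location where the result will appear.

```python
# Sketch of the request/poll flow for an async endpoint. The input is
# uploaded to S3 first; InvokeEndpointAsync returns before inference
# finishes. Names and paths are hypothetical placeholders.
import boto3

runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint_async(
    EndpointName="my-async-endpoint",                   # placeholder endpoint
    InputLocation="s3://my-bucket/inputs/payload.csv",  # payload up to 1 GB
    ContentType="text/csv",
)

# The client then polls for the object at OutputLocation (or subscribes to
# an SNS notification) to retrieve the result once processing completes.
print(response["InferenceId"], response["OutputLocation"])
```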
Community vote distribution: A (35%), C (25%), B (20%), Other