
Exam AWS Certified Solutions Architect - Professional SAP-C02 topic 1 question 138 discussion

A life sciences company is using a combination of open source tools to manage data analysis workflows and Docker containers running on servers in its on-premises data center to process genomics data. Sequencing data is generated and stored on a local storage area network (SAN), and then the data is processed. The research and development teams are running into capacity issues and have decided to re-architect their genomics analysis platform on AWS to scale based on workload demands and reduce the turnaround time from weeks to days.

The company has a high-speed AWS Direct Connect connection. Sequencers will generate around 200 GB of data for each genome, and individual jobs can take several hours to process the data with ideal compute capacity. The end result will be stored in Amazon S3. The company is expecting 10-15 job requests each day.

Which solution meets these requirements?

  • A. Use regularly scheduled AWS Snowball Edge devices to transfer the sequencing data into AWS. When AWS receives the Snowball Edge device and the data is loaded into Amazon S3, use S3 events to trigger an AWS Lambda function to process the data.
  • B. Use AWS Data Pipeline to transfer the sequencing data to Amazon S3. Use S3 events to trigger an Amazon EC2 Auto Scaling group to launch custom-AMI EC2 instances running the Docker containers to process the data.
  • C. Use AWS DataSync to transfer the sequencing data to Amazon S3. Use S3 events to trigger an AWS Lambda function that starts an AWS Step Functions workflow. Store the Docker images in Amazon Elastic Container Registry (Amazon ECR) and trigger AWS Batch to run the container and process the sequencing data.
  • D. Use an AWS Storage Gateway file gateway to transfer the sequencing data to Amazon S3. Use S3 events to trigger an AWS Batch job that executes on Amazon EC2 instances running the Docker containers to process the data.
Suggested Answer: C

Comments

dev112233xx
Highly Voted 2 years ago
Selected Answer: C
Almost voted D because of the Storage Gateway + SAN combination, but it's not correct: S3 events cannot trigger Batch jobs directly, you need a Lambda function! S3 event destinations can only be Lambda, SNS, or SQS.
upvoted 23 times
Kampton
2 years ago
Agree - The Lambda function acts as a bridge between the S3 event and AWS Batch, allowing you to trigger AWS Batch jobs in response to S3 events.
upvoted 3 times
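To make the "bridge" concrete, here is a minimal sketch of such a Lambda handler, assuming a pre-existing Batch job queue and job definition (both names below are hypothetical):

```python
# Sketch of a Lambda "bridge": receive the S3 event, submit an AWS Batch job.
# JOB_QUEUE and JOB_DEFINITION are hypothetical names for this illustration.
import boto3

batch = boto3.client("batch")

JOB_QUEUE = "genomics-job-queue"       # hypothetical queue name
JOB_DEFINITION = "genomics-sequencer"  # hypothetical job definition


def handler(event, context):
    # One S3 event can carry multiple records; submit one job per object.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        batch.submit_job(
            jobName=f"process-{key.replace('/', '-')}",
            jobQueue=JOB_QUEUE,
            jobDefinition=JOB_DEFINITION,
            # Pass the object location to the container as environment variables.
            containerOverrides={
                "environment": [
                    {"name": "INPUT_BUCKET", "value": bucket},
                    {"name": "INPUT_KEY", "value": key},
                ]
            },
        )
```

The handler only forwards the object location; the actual processing still happens inside the Batch container.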
God_Is_Love
Highly Voted 2 years, 1 month ago
Selected Answer: D
Guys, it's a tricky one between C and D, and the answer is D! (Modernization question.) Look at these two blogs: https://aws.amazon.com/blogs/storage/using-aws-storage-gateway-to-modernize-next-generation-sequencing-workflows/ and https://aws.amazon.com/blogs/storage/from-on-premises-to-aws-hybrid-cloud-architecture-for-network-file-shares/ Thanks to tinyflame, who made me do my research on this :-) Yes, SAN -> Storage Gateway only; NAS -> DataSync or Storage Gateway.
upvoted 9 times
AWSum1
6 months, 3 weeks ago
Nope, you need the S3 event to trigger Lambda. S3 events cannot trigger Batch directly.
upvoted 1 times
helloworldabc
7 months, 3 weeks ago
just C
upvoted 1 times
God_Is_Love
2 years, 1 month ago
On-premises NAS and file servers to S3 --> use a DataSync solution.
On-premises SMB or NFS file share to S3 --> use a Storage/File Gateway solution.
upvoted 4 times
titi_r
1 year, 1 month ago
@God_Is_Love, neither of the articles you've provided mentions "SAN" at all. You cannot copy data from a SAN using Storage Gateway, but you can do it with DataSync run from within a server that is connected to that SAN. Research more on what a SAN is and how it works :)
upvoted 1 times
FZA24
Most Recent 6 months, 1 week ago
Selected Answer: C
DataSync + Direct Connect
S3 => Lambda => Step Functions
Docker => ECR => Batch
upvoted 1 times
k10training02
8 months, 2 weeks ago
Lambda only runs for up to 900 seconds, so I'm going with D.
upvoted 1 times
helloworldabc
7 months, 3 weeks ago
just C
upvoted 1 times
trungtd
11 months ago
Selected Answer: C
Currently, S3 events can only push to three different types of destinations: an SNS topic, an SQS queue, or AWS Lambda. You cannot directly trigger a Batch job from an S3 event.
upvoted 1 times
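For reference, this is roughly how the S3-to-Lambda wiring would be configured with boto3; the bucket name and function ARN are made up for illustration. The notification configuration API only accepts Lambda, SNS, and SQS destinations (plus an EventBridge toggle), which is the point being made above:

```python
# Sketch: wire S3 "object created" events to a Lambda function.
# Bucket name and function ARN are hypothetical.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_notification_configuration(
    Bucket="genomics-raw-data",  # hypothetical bucket
    NotificationConfiguration={
        # Only Lambda, SNS, SQS (and an EventBridge toggle) are accepted here;
        # there is no configuration type for AWS Batch.
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012"
                                     ":function:start-genomics-workflow",
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    },
)
```

(The Lambda function also needs a resource policy allowing S3 to invoke it.)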
ninomfr64
1 year, 3 months ago
Selected Answer: C
A = 200 GB every now and then doesn't need Snowball Edge.
B = Data Pipeline is ETL and not suitable in hybrid scenarios.
C = correct (DataSync does the job; the app is already container based, and it works well with Batch, which is suited for HPC-style workloads - genomic sequencing is a typical HPC workload).
D = even though Storage Gateway does the job, you cannot directly trigger an AWS Batch job from an S3 event; you need either a Lambda in the middle, or EventBridge notifications and a rule that triggers the AWS Batch job.
upvoted 3 times
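A rough sketch of the EventBridge route mentioned above, using boto3 (all names and ARNs are hypothetical): enable EventBridge notifications on the bucket, then create a rule whose target is the Batch job queue.

```python
# Sketch of the EventBridge alternative: bucket notifications -> EventBridge
# rule -> AWS Batch target. All names and ARNs are hypothetical.
import json
import boto3

s3 = boto3.client("s3")
events = boto3.client("events")

# 1. Turn on EventBridge notifications for the bucket.
s3.put_bucket_notification_configuration(
    Bucket="genomics-raw-data",
    NotificationConfiguration={"EventBridgeConfiguration": {}},
)

# 2. Match "Object Created" events for that bucket.
events.put_rule(
    Name="genomics-object-created",
    EventPattern=json.dumps({
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {"bucket": {"name": ["genomics-raw-data"]}},
    }),
)

# 3. Point the rule at a Batch job queue/definition. EventBridge needs an
#    IAM role it can assume to submit the job on your behalf.
events.put_targets(
    Rule="genomics-object-created",
    Targets=[{
        "Id": "submit-batch-job",
        "Arn": "arn:aws:batch:us-east-1:123456789012:job-queue/genomics-job-queue",
        "RoleArn": "arn:aws:iam::123456789012:role/eventbridge-batch-role",
        "BatchParameters": {
            "JobDefinition": "genomics-sequencer",
            "JobName": "process-genome",
        },
    }],
)
```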
cox1960
1 year, 3 months ago
... "The main requirement is that the data needs to be accessible over the network in a file format like NFS that DataSync supports."
upvoted 1 times
cox1960
1 year, 3 months ago
C - Amazon Q says "While it does not directly support SAN (storage area network), you can use AWS DataSync to transfer data from files stored on a SAN volume to AWS storage services like Amazon S3."
upvoted 1 times
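As a sketch of what that looks like in practice, assuming a server attached to the SAN exports the sequencing data over NFS (hostnames, paths, and ARNs below are hypothetical):

```python
# Sketch: DataSync picks up the SAN-backed data via an NFS export from a
# server attached to the SAN, and writes it to S3. All names are hypothetical.
import boto3

datasync = boto3.client("datasync")

# Source: NFS share exported by an on-premises server mounted on the SAN.
nfs_location = datasync.create_location_nfs(
    ServerHostname="nfs.example.internal",
    Subdirectory="/exports/sequencing",
    OnPremConfig={
        "AgentArns": [
            "arn:aws:datasync:us-east-1:123456789012:agent/agent-0123456789abcdef0"
        ]
    },
)

# Destination: the S3 bucket where sequencing data should land.
s3_location = datasync.create_location_s3(
    S3BucketArn="arn:aws:s3:::genomics-raw-data",
    S3Config={"BucketAccessRoleArn": "arn:aws:iam::123456789012:role/datasync-s3-role"},
)

# Create the transfer task and kick off an execution.
task = datasync.create_task(
    SourceLocationArn=nfs_location["LocationArn"],
    DestinationLocationArn=s3_location["LocationArn"],
    Name="sequencing-to-s3",
)
datasync.start_task_execution(TaskArn=task["TaskArn"])
```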
career360guru
1 year, 4 months ago
Selected Answer: C
Option C is the better option. D is also possible, but since the jobs are already container based, C is better. The question is not clear on whether the containers used on premises are Docker-based containers.
upvoted 2 times
mosalahs
1 year, 4 months ago
Selected Answer: C
Data transfer --> DataSync
Data integration --> Storage GW
Data orchestration --> Data Pipeline
upvoted 3 times
Maygam
1 year, 4 months ago
Selected Answer: C
D doesn't seem to be correct, as AWS Batch is not a destination for Amazon S3 events. https://docs.aws.amazon.com/AmazonS3/latest/userguide/notification-how-to-event-types-and-destinations.html
upvoted 2 times
uC6rW1aB
1 year, 7 months ago
Selected Answer: C
Option C: Use AWS DataSync to transfer data to Amazon S3. DataSync is designed for fast, easy, and secure data transfer. This option also uses S3 events to trigger an AWS Lambda function, which launches an AWS Step Functions workflow and runs a Docker container using AWS Batch. This option takes into account data transfer, processing, and container management, and should be the most suitable solution.
Option D: Use AWS Storage Gateway's file gateway to transfer data to Amazon S3. Storage Gateway is suitable for hybrid cloud environments, but in this case, since the company already has a high-speed AWS Direct Connect connection, it is more efficient to use DataSync.
upvoted 2 times
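For illustration, a minimal Step Functions definition for the option C pipeline could be a single state that submits the Batch job and waits for it to finish via the .sync integration pattern (queue and definition names are hypothetical; the Lambda triggered by the S3 event would pass the bucket/key in as the execution input):

```python
# Sketch of a one-state Step Functions workflow that runs the Batch job and
# waits for completion. Queue/definition names are hypothetical.
import json

state_machine = {
    "StartAt": "ProcessGenome",
    "States": {
        "ProcessGenome": {
            "Type": "Task",
            # ".sync" makes the workflow wait until the Batch job completes.
            "Resource": "arn:aws:states:::batch:submitJob.sync",
            "Parameters": {
                "JobName": "process-genome",
                "JobQueue": "genomics-job-queue",
                "JobDefinition": "genomics-sequencer",
                "ContainerOverrides": {
                    "Environment": [
                        # Taken from the execution input supplied by Lambda.
                        {"Name": "INPUT_BUCKET", "Value.$": "$.bucket"},
                        {"Name": "INPUT_KEY", "Value.$": "$.key"},
                    ]
                },
            },
            "End": True,
        }
    },
}
print(json.dumps(state_machine, indent=2))
```

A real workflow would likely add retry/catch blocks and follow-up states (e.g. tagging or notifying on completion), which is where Step Functions earns its place over calling Batch directly.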
Ganshank
1 year, 8 months ago
C. Of the given options, C is probably the closest. Step Functions can be used to model the workflow; D does not specify this. DataSync can be used to transfer the data [https://docs.aws.amazon.com/datasync/latest/userguide/s3-cross-account-transfer.html].
upvoted 1 times
SK_Tyagi
1 year, 8 months ago
Selected Answer: D
I choose D. My rationale: 200 GB of data for one genome sequence; let's say Direct Connect is a 1 Gbps line - DataSync cannot efficiently transfer the data to get the processing done in under a day. Agree with God_Is_Love's hypothesis.
upvoted 1 times
vn_thanhtung
1 year, 7 months ago
An S3 event can't directly trigger an AWS Batch job. => C
upvoted 1 times
ninomfr64
1 year, 3 months ago
Assuming DX is 1 Gbps, it takes about 27 minutes to transfer 200 GB. Also, I don't see how Storage Gateway would speed things up. My point is that here both DataSync and Storage Gateway can do the job, but you cannot trigger a Batch job directly from an S3 object event. Thus C.
upvoted 1 times
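A quick back-of-the-envelope check of that figure, assuming the link is fully utilized:

```python
# 200 GB over a 1 Gbps link, decimal units, assuming full link utilization.
size_bits = 200 * 8 * 10**9  # 200 GB expressed in bits
link_bps = 1 * 10**9         # 1 Gbps Direct Connect
seconds = size_bits / link_bps
print(f"{seconds / 60:.0f} minutes")  # ~27 minutes
```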
RGR21
1 year, 8 months ago
Does AWS DataSync support SAN?
upvoted 1 times
ggrodskiy
1 year, 9 months ago
Correct D.
upvoted 1 times
NikkyDicky
1 year, 9 months ago
Selected Answer: C
C. D would be an option if using a volume gateway and a Lambda to trigger Batch. DataSync doesn't need to support the NAS natively; the agent can copy off an NFS or SMB mount of the NAS drive.
upvoted 1 times