
Exam AWS Certified Solutions Architect - Professional SAP-C02 topic 1 question 23 discussion

A company is running a data-intensive application on AWS. The application runs on a cluster of hundreds of Amazon EC2 instances. A shared file system also runs on several EC2 instances that store 200 TB of data. The application reads and modifies the data on the shared file system and generates a report. The job runs once monthly, reads a subset of the files from the shared file system, and takes about 72 hours to complete. The compute instances scale in an Auto Scaling group, but the instances that host the shared file system run continuously. The compute and storage instances are all in the same AWS Region.
A solutions architect needs to reduce costs by replacing the shared file system instances. The file system must provide high performance access to the needed data for the duration of the 72-hour run.
Which solution will provide the LARGEST overall cost reduction while meeting these requirements?

  • A. Migrate the data from the existing shared file system to an Amazon S3 bucket that uses the S3 Intelligent-Tiering storage class. Before the job runs each month, use Amazon FSx for Lustre to create a new file system with the data from Amazon S3 by using lazy loading. Use the new file system as the shared storage for the duration of the job. Delete the file system when the job is complete.
  • B. Migrate the data from the existing shared file system to a large Amazon Elastic Block Store (Amazon EBS) volume with Multi-Attach enabled. Attach the EBS volume to each of the instances by using a user data script in the Auto Scaling group launch template. Use the EBS volume as the shared storage for the duration of the job. Detach the EBS volume when the job is complete.
  • C. Migrate the data from the existing shared file system to an Amazon S3 bucket that uses the S3 Standard storage class. Before the job runs each month, use Amazon FSx for Lustre to create a new file system with the data from Amazon S3 by using batch loading. Use the new file system as the shared storage for the duration of the job. Delete the file system when the job is complete.
  • D. Migrate the data from the existing shared file system to an Amazon S3 bucket. Before the job runs each month, use AWS Storage Gateway to create a file gateway with the data from Amazon S3. Use the file gateway as the shared storage for the job. Delete the file gateway when the job is complete.
Suggested Answer: A
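
For readers who want to see the lifecycle option A describes end to end, here is a minimal boto3 sketch. The bucket name, subnet ID, and storage capacity are hypothetical placeholders, not values from the question, and SCRATCH_2 is just one reasonable deployment type for a create-run-delete workload.

```python
# Minimal sketch of option A's monthly lifecycle (assumed names throughout).
import boto3

fsx = boto3.client("fsx")

# Create a scratch FSx for Lustre file system linked to the S3 bucket.
# With an ImportPath, FSx imports object metadata up front and lazy-loads
# file contents from S3 only when a file is first read.
resp = fsx.create_file_system(
    FileSystemType="LUSTRE",
    StorageCapacity=12000,  # GiB; sized for the job's working subset, not all 200 TB
    SubnetIds=["subnet-0123456789abcdef0"],  # hypothetical subnet
    LustreConfiguration={
        "DeploymentType": "SCRATCH_2",
        "ImportPath": "s3://example-report-data",  # hypothetical bucket
        "ExportPath": "s3://example-report-data",  # allows exporting changes back
    },
)
fs_id = resp["FileSystem"]["FileSystemId"]

# ... mount the file system on the compute cluster and run the 72-hour job ...

# Delete the file system when the job completes so it no longer accrues cost.
fsx.delete_file_system(FileSystemId=fs_id)
```

The point of the pattern is that the expensive high-performance layer exists for only about 72 hours a month, while the 200 TB sits in cheap S3 storage the rest of the time.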

Comments

sambb
Highly Voted 1 year, 8 months ago
Selected Answer: A
A: Lazy loading is cost-effective because only a subset of the data is used in each job.
B: There are hundreds of EC2 instances using the volume, which is not possible (one EBS Multi-Attach volume is limited to 16 Nitro instances).
C: Batch loading would load too much data.
D: Storage Gateway is used for on-premises data access. I don't know if you can install a gateway in AWS, but Amazon would never advise this.
upvoted 19 times
b3llman
1 year, 3 months ago
A file storage gateway can be installed on EC2, and it is exactly the tool for accessing S3 from EC2 as a file system.
upvoted 1 times
...
Chainshark
1 year, 1 month ago
It's used a lot; I've used it for customers to access and analyze data imported via Snowball from Windows machines.
upvoted 1 times
...
dqwsmwwvtgxwkvgcvc
1 year, 3 months ago
There is an S3 File Gateway: https://aws.amazon.com/storagegateway/file/s3/
upvoted 1 times
...
Tofu13
1 year, 1 month ago
https://aws.amazon.com/blogs/storage/new-enhancements-for-moving-data-between-amazon-fsx-for-lustre-and-amazon-s3/
upvoted 3 times
...
...
chico2023
Highly Voted 1 year, 3 months ago
Answer: D
I think the main point here is to understand what they mean by "The file system must provide high performance access to the needed data" while also providing "the LARGEST overall cost reduction".
For answer A, we have to remember that lazy loading is SLOW the first time you access a file (as it is being fetched from S3), BUT, as we are talking about hundreds of instances, it might be OK. S3 Intelligent-Tiering doesn't seem to fit at first, but the statement "The job runs once monthly, reads a subset of the files from the shared file system" indicates that at least part of the 200 TB of data won't be accessed, which argues against answer C, for example.
My only issue with answer D is that Storage Gateway can be slower than FSx for Lustre. HOWEVER, what cost vs. performance trade-off are they seeking here? We can guess that cost trumps maximum performance: "Which solution will provide the LARGEST overall cost reduction". And since Storage Gateway is far cheaper than FSx for Lustre per TB, it's safe to say that D is the most correct answer.
upvoted 13 times
...
0b43291
Most Recent 3 days, 7 hours ago
Selected Answer: A
By choosing Option A, the company can leverage the cost-effectiveness of Amazon S3 Intelligent-Tiering for storage and the high performance of Amazon FSx for Lustre for temporary file access, while minimizing overall cost by creating and deleting the file system only when needed.
Option B (using Amazon EBS Multi-Attach) is not ideal because EBS volumes are designed for persistent storage, and attaching and detaching a large volume to multiple instances can be time-consuming and potentially disruptive.
Option C (using Amazon FSx for Lustre with batch loading) is less cost-effective than Option A because batch loading requires loading the entire 200 TB of data into the file system, which can be expensive and time-consuming.
Option D (using AWS Storage Gateway File Gateway) is not the most cost-effective solution because the File Gateway is designed for on-premises file storage integration and may not provide the same level of performance as FSx for Lustre for this data-intensive workload.
upvoted 1 times
...
amministrazione
2 months, 3 weeks ago
A. Migrate the data from the existing shared file system to an Amazon S3 bucket that uses the S3 Intelligent-Tiering storage class. Before the job runs each month, use Amazon FSx for Lustre to create a new file system with the data from Amazon S3 by using lazy loading. Use the new file system as the shared storage for the duration of the job. Delete the file system when the job is complete.
upvoted 1 times
...
MAZIADI
3 months, 1 week ago
A or D: confusing. I wish they would provide an explanation for their answers when the suggested answer is not the most voted one.
upvoted 1 times
...
Helpnosense
5 months, 1 week ago
I vote D instead of A because the question requires that the application "modifies the data on the shared file system". FSx imports data from S3 and loses the relationship to S3 after the import is done; without explicitly copying back to S3, the changes stay on the shared file system only. The solution in answer A doesn't provide a step to copy the modifications back to S3 (though see the export-task sketch after this comment). Storage Gateway presents S3 to the OS as a shared file system, so any modification on the shared file system is automatically saved to S3.
upvoted 2 times
...
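The concern above (changes staying only on the file system) can be addressed within option A by running an export data repository task before deleting the file system. A hedged boto3 sketch, assuming the file system was created with an ExportPath and using hypothetical IDs and paths:

```python
# Write modified files back to the linked S3 bucket before deleting the
# FSx for Lustre file system. IDs and paths are hypothetical.
import boto3

fsx = boto3.client("fsx")

task = fsx.create_data_repository_task(
    FileSystemId="fs-0123456789abcdef0",
    Type="EXPORT_TO_REPOSITORY",  # copies new and changed files back to S3
    Paths=["reports/"],           # optionally restrict the export to one directory
    Report={"Enabled": False},    # skip the completion report for brevity
)
print(task["DataRepositoryTask"]["Lifecycle"])  # e.g. PENDING, then EXECUTING
```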
gofavad926
8 months, 1 week ago
Selected Answer: A
A: Lazy loading is cost-effective because only a subset of data is used at every job
upvoted 1 times
...
kz407
8 months, 1 week ago
Selected Answer: A
The problem with D is that AWS Storage Gateway and File Gateway are solutions for integrating on-premises storage with AWS storage services, particularly (but not limited to) S3. https://aws.amazon.com/storagegateway/ https://aws.amazon.com/storagegateway/file The compute resources reside in AWS, so adding a Storage Gateway or File Gateway won't solve a thing. As far as option B is concerned, it comes down to the limitations of EBS (such as the maximum volume size and the maximum number of instances that can be attached). Attaching and detaching the EBS volumes also seems a bit complicated. On top of that, EBS does not offer the cost optimizations of S3 Intelligent-Tiering. The question clearly mentions that only a subset of the data will be used, and Intelligent-Tiering ensures substantial cost optimization over time. Hence, the answer should be A.
upvoted 3 times
...
kspendli
8 months, 1 week ago
Option D, migrating the data to an Amazon S3 bucket and using AWS Storage Gateway, seems to provide the largest overall cost reduction while meeting the requirements of high-performance access during the job run and minimizing costs when the storage is not actively being used. Therefore, Option D is the most suitable choice.
upvoted 1 times
...
anubha.agrahari
8 months, 2 weeks ago
Selected Answer: A
https://aws.amazon.com/blogs/storage/new-enhancements-for-moving-data-between-amazon-fsx-for-lustre-and-amazon-s3/
upvoted 2 times
...
atirado
11 months, 1 week ago
Selected Answer: A
Option A - This option might work. However, AWS FSx for Lustre does not have a feature literally called "lazy loading"; its default behavior is to load a file from S3 when it is first accessed (restore). It can provide high performance as needed, though the question says nothing about whether a slow initial load due to restore operations would be an issue. S3 Intelligent-Tiering minimizes storage costs.
Option B - This option would provide high-performance storage. However, EBS storage is expensive compared to other AWS storage services.
Option C - This option might work. However, AWS FSx for Lustre does not have a feature called "batch loading"; files can be pre-loaded by issuing an hsm_restore command. S3 Standard is cheap, yet not the cheapest storage option in S3.
Option D - This option does not work as described.
upvoted 2 times
AimarLeo
9 months, 3 weeks ago
Actually, AWS FSx for Lustre does not have a feature literally called "lazy loading", but that is what it supports: Amazon FSx imports the objects in the S3 bucket as files and "lazy-loads" the file contents from S3 when the files are first accessed. Any data processing job on Lustre with S3 as an input data source can be started without Lustre doing a full download of the dataset first. Data is lazy loaded: only the data that is actually processed is loaded, meaning you can decrease your costs and latency.
upvoted 1 times
...
...
ninomfr64
11 months, 1 week ago
Not B, because using EBS still involves EC2 instances, which are expensive (the instances that host the shared file system run continuously). Also, Multi-Attach is supported only for the io1/io2 EBS volume types, which are expensive.
Not C, as "batch loading" does not exist in the docs/console; I think they might mean pre-populating the data with the lfs hsm_restore command, as mentioned here: https://docs.aws.amazon.com/fsx/latest/LustreGuide/preload-file-contents-hsm-dra.html (see the preload sketch after this comment). This would be a more expensive option.
Not D, as Storage Gateway provides less performance than FSx for Lustre, and it requires at least one EC2 instance, which introduces additional cost.
A is a viable option: S3 is cheaper storage, and FSx for Lustre provides the performance. Lazy loading moves into the file system only the data that is actually used, and Intelligent-Tiering makes sure files that are not used are moved to less expensive S3 storage tiers.
upvoted 1 times
...
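The "batch loading" option C alludes to is typically done by preloading file contents with the Lustre lfs hsm_restore command, per the doc linked above. A rough Python sketch of that preload step, assuming the file system is mounted at a hypothetical /fsx path:

```python
# Preload (hydrate) file contents from S3 into FSx for Lustre by issuing
# lfs hsm_restore for every file under the mount point. Paths are assumptions.
import subprocess
from pathlib import Path

MOUNT = Path("/fsx/monthly-input")  # hypothetical mount point

files = [str(p) for p in MOUNT.rglob("*") if p.is_file()]

# Restore in batches so a single command line doesn't grow too long and the
# metadata server isn't hit with one giant request.
BATCH = 50
for i in range(0, len(files), BATCH):
    subprocess.run(["lfs", "hsm_restore", *files[i:i + BATCH]], check=True)
```

This also illustrates why batch loading is the costlier path: it pulls every file's contents whether or not the job will read them.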
subbupro
11 months, 3 weeks ago
Intelligent-Tiering is not required: the job runs every month, so there is no purpose in Intelligent-Tiering. Cost impact is also one of the considerations in the question. So go with option D.
upvoted 1 times
e4bc18e
8 months, 2 weeks ago
"Only a subset of data is accessed each run" So that means after 30 days data can tier down so yes there is cost savings in using INT
upvoted 1 times
...
...
Japanese1
1 year ago
Selected Answer: D
Functional requirements should be met before non-functional requirements. In the first place, only option D allows the application to change the data in the shared file system during the monthly job execution. With options A and C, data changes made during the job are discarded after the job runs. On top of that, although D is inferior to A in performance, it meets the requirements because it is the cheapest.
upvoted 5 times
kz407
8 months, 1 week ago
You can configure a DRA for automatic import only, for automatic export only, or for both. "A data repository association configured with both automatic import and automatic export propagates data in both directions between the file system and the linked S3 bucket. As you make changes to data in your S3 data repository, FSx for Lustre detects the changes and then automatically imports the changes to your file system. As you create, modify, or delete files, FSx for Lustre automatically exports the changes to Amazon S3 asynchronously once your application finishes modifying the file." https://docs.aws.amazon.com/fsx/latest/LustreGuide/create-dra-linked-data-repo.html
upvoted 1 times
...
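The behavior quoted above is configured through a data repository association. A boto3 sketch, noting that DRAs require a PERSISTENT_2 deployment type (so this differs slightly from the scratch-file-system reading of option A); all IDs, paths, and the bucket are hypothetical:

```python
# Link a directory of a PERSISTENT_2 FSx for Lustre file system to an S3
# prefix with automatic import and export. All identifiers are hypothetical.
import boto3

fsx = boto3.client("fsx")

fsx.create_data_repository_association(
    FileSystemId="fs-0123456789abcdef0",
    FileSystemPath="/shared",  # directory inside the Lustre file system
    DataRepositoryPath="s3://example-report-data/shared",  # linked S3 prefix
    S3={
        # Changes made in S3 are imported into the file system...
        "AutoImportPolicy": {"Events": ["NEW", "CHANGED", "DELETED"]},
        # ...and changes made in the file system are exported back to S3.
        "AutoExportPolicy": {"Events": ["NEW", "CHANGED", "DELETED"]},
    },
)
```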
grire974
10 months, 2 weeks ago
Oh yeah, of course: if you delete the FSx file system, the changes are lost.
upvoted 1 times
...
...
bur4an
1 year, 2 months ago
Selected Answer: A
A. Migrate the data from the existing shared file system to an Amazon S3 bucket that uses the S3 Intelligent-Tiering storage class. Before the job runs each month, use Amazon FSx for Lustre to create a new file system with the data from Amazon S3 by using lazy loading. Use the new file system as the shared storage for the duration of the job. Delete the file system when the job is complete.
Option B (using Amazon EBS) would result in higher costs due to the continuous usage of large EBS volumes. Similarly, option C involves creating a new FSx for Lustre file system with batch loading, which may not be as cost-effective as lazy loading. Option D (using AWS Storage Gateway) would involve additional complexity and potentially higher costs compared to directly utilizing S3 and FSx for Lustre.
upvoted 1 times
...
dqwsmwwvtgxwkvgcvc
1 year, 3 months ago
Selected Answer: D
@chico2023 already explained the logic behind this; @sambb chose A because the S3 File Gateway wasn't clear to him.
upvoted 1 times
...
chiajy
1 year, 3 months ago
The question mentions "The file system must provide high performance access to the needed data for the duration of the 72-hour run." Assuming S3 Intelligent-Tiering doesn't move data into the Archive Access tiers (which are optional) [Ref: docs.aws.amazon.com/AmazonS3/latest/userguide/intelligent-tiering-overview.html], the data always stays in storage tiers that provide "low latency and high throughput performance" (see the configuration check after this comment). Since S3 Intelligent-Tiering delivers automatic storage cost savings, answer A can be the right answer.
upvoted 1 times
...
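The check described above — confirming that the bucket has no opt-in Archive Access tiering configured, so objects stay in the low-latency access tiers — can be sketched with boto3 (bucket name hypothetical):

```python
# List any Intelligent-Tiering archive configurations on the bucket. If none
# exist, objects remain in the Frequent/Infrequent Access tiers, which keep
# millisecond access latency. Bucket name is a placeholder.
import boto3

s3 = boto3.client("s3")

resp = s3.list_bucket_intelligent_tiering_configurations(Bucket="example-report-data")
configs = resp.get("IntelligentTieringConfigurationList", [])

if not configs:
    print("No archive tiering configured; all objects stay in access tiers.")
else:
    for cfg in configs:
        print(cfg["Id"], cfg["Status"], cfg["Tierings"])
```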
Community vote distribution: A (35%), C (25%), B (20%), Other