Exam Professional Cloud Architect All Questions

Exam Professional Cloud Architect topic 1 question 21 discussion

Actual exam question from Google's Professional Cloud Architect

Question #: 21
Topic #: 1

Your company's user-feedback portal comprises a standard LAMP stack replicated across two zones. It is deployed in the us-central1 region and uses autoscaled managed instance groups on all layers, except the database. Currently, only a small group of select customers have access to the portal. The portal meets a 99.99% availability SLA under these conditions. However, next quarter, your company will be making the portal available to all users, including unauthenticated users. You need to develop a resiliency testing strategy to ensure the system maintains the SLA once they introduce additional user load.
What should you do?

A. Capture existing users input, and replay captured user load until autoscale is triggered on all layers. At the same time, terminate all resources in one of the zones
B. Create synthetic random user input, replay synthetic load until autoscale logic is triggered on at least one layer, and introduce "chaos" to the system by terminating random resources on both zones
C. Expose the new system to a larger group of users, and increase group size each day until autoscale logic is triggered on all layers. At the same time, terminate random resources on both zones
D. Capture existing users input, and replay captured user load until resource utilization crosses 80%. Also, derive estimated number of users based on existing user's usage of the app, and deploy enough resources to handle 200% of expected load

Suggested Answer: B

by KouShikyou at Oct. 24, 2019, 1:54 p.m.

Comments

jcmoranp

Highly Voted 5 years, 8 months ago

A resilience test is not about load; it is about terminating resources and checking that the service is not affected. I think it's B. The best approach for resilience is to introduce chaos into the infrastructure.

upvoted 95 times

rockstar9622

5 years, 5 months ago

I agree with @jcmoranp; B is correct. For more info: https://cloud.google.com/solutions/scalable-and-resilient-apps#test_your_resilience

upvoted 20 times

...

AWSPro24

3 years, 7 months ago

Isn't A superior in one way? It would demonstrate that the app is regionally redundant by showing it can survive the loss of an entire zone. B only demonstrates that the app is zonally redundant and can lose a random instance here and there within individual zones, which is not that resilient. Thoughts?

upvoted 8 times

0xE8D4A51000

2 years, 8 months ago

No. It is only terminating the service in ONE zone. B caters for terminating the service in both zones randomly. You want to be able to test resiliency when either zone has an outage.

upvoted 7 times

...

OSNG

Highly Voted 4 years, 7 months ago

Will go with A. Reasons:
1. The SLA in the question is about availability ("The portal meets a 99.99% availability SLA under these conditions."), therefore maintaining the SLA means maintaining availability.
2. It's a user-feedback portal, and the type of user input is going to be similar or the same (A captures the user input and replays it).
Why not B: The infrastructure uses MIGs (instances created from templates), most likely with health checks, so killing random VMs cannot test availability (nor affect it, as the health check will immediately kill the affected instances and create replacements).
Why not D: The SLA is about availability, not reliability or scaling. (They all go hand in hand, but the major focus should still be on availability.)
--- IF AGREE PLEASE UP VOTE TO MAKE IT CLEAR FOR THE OTHERS --- Thank you.

upvoted 62 times
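
As a rough illustration of what the 99.99% availability figure discussed above implies, the following Python sketch converts an availability percentage into an allowed-downtime budget. The function name and the 30-day default window are assumptions made for this example; the question does not state the period over which the SLA is measured.

```python
def downtime_budget_minutes(availability_pct: float, period_days: float = 30.0) -> float:
    """Return the minutes of downtime permitted in a period at a given availability.

    availability_pct: availability target as a percentage, e.g. 99.99
    period_days: length of the measurement window in days (assumed, not from the question)
    """
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    # The unavailable fraction is whatever is left after the availability target.
    return total_minutes * (1 - availability_pct / 100.0)


if __name__ == "__main__":
    # Roughly 4.32 minutes per 30 days and 52.56 minutes per 365 days at 99.99%.
    print(f"{downtime_budget_minutes(99.99):.2f} minutes per 30 days")
    print(f"{downtime_budget_minutes(99.99, period_days=365):.2f} minutes per 365 days")
```

With a budget this small, any test that terminates resources has to confirm that recovery completes within minutes, which is why the thread focuses on failure injection rather than load alone.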

RitwickKumar

2 years, 10 months ago

Only problem with A is that it says "replay captured user load". We are not testing for the incoming unpredictable load due to the inclusion of unauthenticated users and something that we haven't captured earlier. Option B covers breadth and depth for the desired SLA.

upvoted 10 times

jay9114

2 years, 10 months ago

What does "replay captured user load" mean?

upvoted 3 times
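
One possible reading of "replay captured user load" is: record real requests with their timestamps, then send the same requests again later with the same relative timing, optionally compressed to raise the load. The Python sketch below only computes such a replay schedule; it sends nothing. The function name, the `(timestamp, path)` record shape, and the `speedup` parameter are all hypothetical choices for this illustration.

```python
from typing import List, Tuple


def replay_schedule(
    captured: List[Tuple[float, str]], speedup: float = 1.0
) -> List[Tuple[float, str]]:
    """Turn captured (timestamp_seconds, path) records into (delay_seconds, path) pairs.

    Delays are measured from the start of the replay. A speedup above 1.0
    compresses the timeline, so the same requests arrive faster than they
    did when they were captured.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    if not captured:
        return []
    ordered = sorted(captured)   # replay in the order the requests originally arrived
    start = ordered[0][0]        # the first captured request defines time zero
    return [((ts - start) / speedup, path) for ts, path in ordered]


if __name__ == "__main__":
    # Hypothetical capture: three requests over four seconds.
    capture = [(100.0, "/feedback"), (104.0, "/submit"), (102.0, "/feedback")]
    for delay, path in replay_schedule(capture, speedup=2.0):
        print(delay, path)
```

A replay like this reproduces only traffic patterns that were already observed, which is the limitation raised above about unpredictable load from unauthenticated users.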

...

bolu

4 years, 6 months ago

valuable input in terms of 'availability'. did you select this answer in exam too?

upvoted 1 times

...

amxexam

3 years, 10 months ago

We are talking about resilience testing, whereas the SLA is a property of the system.

upvoted 1 times

amxexam

3 years, 10 months ago

And resilience means the capacity to recover from failure.

upvoted 1 times

...

AWSPro24

3 years, 7 months ago

A ensures the app can withstand the loss of a whole Zone which I think is important as well.

upvoted 1 times

...

...

scialappa

Most Recent 3 months, 1 week ago

Selected Answer: C

I'm a bit puzzled that no one selected C. Why shouldn't C be the best option? Aren't A, B, and D only partially fulfilling the requirements, whilst C meets them all? It uses real users to test (most realistic), allows stress testing by increasing load gradually, and randomly shuts down resources, so it includes chaos engineering. Any feedback here, please?

upvoted 1 times

...

david_tay

4 months, 2 weeks ago

Selected Answer: A

Answer is likely A as Gemini said so, and some others here agree on the logic.

upvoted 1 times

...

FabPanda

5 months, 2 weeks ago

Selected Answer: B

To perform both resilience testing and autoscale testing, B is the correct choice.

upvoted 1 times

...

JonathanSJ

6 months, 1 week ago

Selected Answer: B

I will go for B

upvoted 1 times

...

Ekramy_Elnaggar

7 months, 4 weeks ago

Selected Answer: B

1. Synthetic Load Generation: Creating synthetic user input allows you to simulate a wide range of user behavior and load patterns, including spikes and sustained high traffic. This helps you test the system's ability to scale and handle unexpected loads.
2. Autoscaling Validation: By replaying the synthetic load, you can verify that the autoscaling logic is working correctly across all layers of the LAMP stack. This ensures that the system can dynamically adjust resources to meet demand.
3. Chaos Engineering: Introducing chaos by terminating random resources simulates real-world failures and helps you test the system's resilience to unexpected disruptions. This is crucial for maintaining the 99.99% availability SLA.
4. Controlled Environment: This approach allows you to conduct testing in a controlled environment without impacting real users. You can gradually increase the load and introduce chaos in a measured way to identify weaknesses and improve resilience.

upvoted 2 times
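
The "terminate random resources on both zones" step can be sketched as a selection routine like the one below. This is a minimal Python illustration under stated assumptions, not a real chaos tool: it only chooses which instances to terminate and calls no cloud API. The zone and instance names are hypothetical, and leaving at least one survivor per zone is a design choice of this sketch, not something the question requires.

```python
import random
from typing import Dict, List, Optional


def pick_victims(
    inventory: Dict[str, List[str]], fraction: float, seed: Optional[int] = None
) -> Dict[str, List[str]]:
    """Choose random instances to terminate in every zone.

    inventory: mapping of zone name to the instance names running there
    fraction: share of each zone's instances to terminate (0 < fraction <= 1)
    seed: optional seed so a test run can be reproduced
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)
    victims: Dict[str, List[str]] = {}
    for zone, instances in inventory.items():
        count = max(1, int(len(instances) * fraction))
        # Assumption of this sketch: always leave at least one instance per zone.
        count = min(count, max(len(instances) - 1, 0))
        victims[zone] = rng.sample(instances, count)
    return victims


if __name__ == "__main__":
    # Hypothetical inventory for the two zones in the scenario.
    zones = {
        "us-central1-a": ["web-a1", "web-a2", "app-a1", "app-a2"],
        "us-central1-b": ["web-b1", "web-b2", "app-b1", "app-b2"],
    }
    print(pick_victims(zones, fraction=0.5, seed=42))
```

Running the selection while synthetic load is applied, and then checking that the managed instance groups replace the terminated instances, is one way to combine the load and chaos parts of option B.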

...

potorange

10 months, 2 weeks ago

Selected Answer: A

A: the requirements state "all layers" and "resiliency testing"

upvoted 1 times

...

Robert0

1 year, 1 month ago

Selected Answer: A

I would go with A. This solution tests the autoscale policy of each layer (not only one, as option B states). It also proposes shutting down an entire zone, which is a very good test that is commonly requested if your application is geo-redundant. In contrast, option B proposes random termination of resources, which is not a bad practice but is a little vague and can be implemented terribly wrong (for example, you might not kill the interesting services, or you might kill the same service in both zones, causing a blackout).

upvoted 1 times

...

lisabisa

1 year, 4 months ago

Selected Answer: B

We need to do 1. load testing and 2. reliability testing (failover redundancy).
B does both.
A only tests one zone.
C impacts real user experience.
D: 200% is not necessary.

upvoted 4 times

...

AdityaGupta

1 year, 9 months ago

Selected Answer: A

You need to develop a resiliency testing strategy to ensure the system maintains the SLA once they introduce additional user load. Maintaining an SLA of 99.99% means multiple zones, and resilience means fault tolerance. Terminating all resources in one zone also creates chaos.

upvoted 1 times

...

heretolearnazure

1 year, 10 months ago

B is correct

upvoted 1 times

...

VaraSrinvas

2 years ago

Selected Answer: D

Option D is the best resiliency testing strategy in this scenario as it ensures that the system is tested with actual user data, takes into account the expected increase in user load, and ensures that the system is adequately scaled to handle the anticipated load.

upvoted 2 times

didek1986

1 year, 10 months ago

Do not agree. B is 100% correct

upvoted 1 times

...

jrisl1991

1 year, 9 months ago

But this would assume that the user load will not change; plus, the current application is visible only to a small group of select customers - this is the current production setup. The deployment should be prepared for all existing users plus unauthenticated users, and the load increase is unknown, so testing for 200% of "expected load" is very ambiguous.

upvoted 1 times

...

JC0926

2 years, 2 months ago

Selected Answer: B

B. Create synthetic random user input, replay synthetic load until autoscale logic is triggered on at least one layer, and introduce "chaos" to the system by terminating random resources on both zones. By creating synthetic random user input and replaying the load, you can simulate the expected increased user traffic and trigger the autoscale logic on different layers of the application. Introducing chaos to the system by terminating random resources in both zones helps test the resiliency and redundancy of the system under stress. This strategy will help ensure that the system can maintain the 99.99% availability SLA when subjected to additional user load.

upvoted 5 times

...