Your team is running microservices in Google Kubernetes Engine (GKE). You want to detect consumption of an error budget to protect customers and define release policies. What should you do?
A. Create SLIs from metrics. Enable Alert Policies if the services do not pass.
B. Use the metrics from Anthos Service Mesh to measure the health of the microservices.
C. Create a SLO. Create an Alert Policy on select_slo_burn_rate.
D. Create a SLO and configure uptime checks for your services. Enable Alert Policies if the services do not pass.
The best answer is C. Create a SLO. Create an Alert Policy on select_slo_burn_rate. Here's why:
SLOs (Service Level Objectives): SLOs are crucial for defining the acceptable performance levels of your microservices. They help you set clear targets for things like latency, availability, and error rates.
Error Budget: An error budget is the amount of unreliability an SLO tolerates over its compliance period: 100% minus the SLO target. A 99.9% availability SLO, for example, leaves a 0.1% error budget. It allows for some flexibility while still ensuring overall service health.
Alerting on Burn Rate: The select_slo_burn_rate time-series selector in Cloud Monitoring lets you track how quickly your error budget is being consumed over a lookback window. By creating an alert policy based on this selector, you are notified when the burn rate exceeds a predefined threshold, which indicates a risk of exhausting the error budget before the compliance period ends.
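To make the relationship between an SLO target, its error budget, and the burn rate concrete, here is a minimal Python sketch. The function names and the sample numbers (a 99.9% target, 50 failed requests out of 10,000) are illustrative assumptions, not values from the question, and the calculation is the generic burn-rate definition rather than anything Cloud Monitoring exposes as code.

```python
def error_budget(slo_target: float) -> float:
    """Fraction of requests that may fail while still meeting the SLO."""
    return 1.0 - slo_target


def burn_rate(bad: int, total: int, slo_target: float) -> float:
    """Observed error ratio divided by the error ratio the SLO allows.

    A value of 1.0 means the budget is being spent exactly as fast as the
    SLO permits; values above 1.0 mean it will run out early.
    """
    if total == 0:
        return 0.0  # no traffic, so no budget was consumed
    return (bad / total) / error_budget(slo_target)


# Hypothetical sample: 99.9% SLO, 50 bad requests out of 10,000 in the window.
budget = error_budget(0.999)
rate = burn_rate(bad=50, total=10_000, slo_target=0.999)
print(f"error budget: {budget:.4%}, burn rate: {rate:.1f}x")
```

With these sample numbers the observed error ratio is 0.5% against an allowed 0.1%, so the budget is being consumed roughly five times faster than the SLO permits.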
Why other options are less suitable:
A. Create SLIs from metrics. Enable Alert Policies if the services do not pass: While creating SLIs is a good first step, it doesn't directly address the error budget consumption. Alerting on individual SLIs might not be sufficient to protect against exceeding the overall error budget.
B. Use the metrics from Anthos Service Mesh to measure the health of the microservices: Anthos Service Mesh provides valuable metrics, but it doesn't inherently handle error budget management. You'll still need to define SLOs and create alerts based on the burn rate.
D. Create a SLO and configure uptime checks for your services. Enable Alert Policies if the services do not pass: Uptime checks are important for availability, but they don't directly monitor error budget consumption. You need a mechanism to track the burn rate of your error budget, which is best achieved through SLOs and the select_slo_burn_rate time-series selector.
This approach involves defining specific SLOs for your services, which are quantitative measures of the desired reliability of a service. Once you have these SLOs, you can set up Alert Policies based on the rate at which your error budget is consumed (burn rate).
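As a rough illustration of what such an Alert Policy could look like, the sketch below builds a policy body as a plain Python dictionary that follows the shape of the Cloud Monitoring AlertPolicy resource. The project, service, and SLO IDs are placeholders, and the 60-minute lookback and threshold of 10 are illustrative assumptions rather than recommended values. The helper names are hypothetical, and the code only assembles the JSON; it does not call any API.

```python
import json


def burn_rate_filter(project_id: str, service_id: str, slo_id: str,
                     lookback: str = "60m") -> str:
    """Build a select_slo_burn_rate filter for one SLO and lookback window."""
    slo_name = (f"projects/{project_id}/services/{service_id}"
                f"/serviceLevelObjectives/{slo_id}")
    return f'select_slo_burn_rate("{slo_name}", "{lookback}")'


def burn_rate_alert_policy(project_id: str, service_id: str, slo_id: str,
                           threshold: float = 10.0) -> dict:
    """Assemble an alert policy body that fires when the burn rate exceeds threshold."""
    return {
        "displayName": f"SLO burn rate above {threshold}x",
        "combiner": "OR",
        "conditions": [{
            "displayName": "Error budget burning too fast",
            "conditionThreshold": {
                "filter": burn_rate_filter(project_id, service_id, slo_id),
                "comparison": "COMPARISON_GT",
                "thresholdValue": threshold,
                "duration": "0s",
            },
        }],
    }


# Placeholder identifiers; substitute real ones before using the output.
policy = burn_rate_alert_policy("my-project", "my-service", "my-slo")
print(json.dumps(policy, indent=2))
```

The printed JSON could then be saved to a file and supplied to whatever tooling the team uses to create alert policies; notification channels are left out here and would need to be added.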
Both options C and D are effective in detecting consumption of error budget, but they have different strengths and weaknesses.
Creating an SLO and configuring uptime checks is a good way to get a high-level view of the health of your services. It can also help you to identify trends over time. However, it can be difficult to configure uptime checks for complex services, and it may not be possible to detect all types of errors.
Using select_slo_burn_rate is a more granular way to detect consumption of error budget. It can be used to monitor individual SLOs and to identify specific types of errors. However, it can be more difficult to set up and to interpret the results.
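One way to make a burn-rate value easier to interpret is to convert it into the time left before the error budget runs out. The Python sketch below does that under simple assumptions: a constant burn rate, a 30-day compliance period, and a hypothetical helper name. It is an interpretation aid, not part of Cloud Monitoring.

```python
def hours_to_exhaustion(burn_rate: float, compliance_period_days: int,
                        budget_remaining: float = 1.0) -> float:
    """Hours until the error budget is gone if the burn rate stays constant.

    budget_remaining is the fraction of the budget still unspent (1.0 = all).
    At a burn rate of 1.0 a full budget lasts exactly the compliance period.
    """
    if burn_rate <= 0:
        return float("inf")  # nothing is being consumed
    return budget_remaining * compliance_period_days * 24 / burn_rate


# Illustrative values: 30-day period, burn rate of 10, full budget remaining.
print(hours_to_exhaustion(10.0, 30))  # 72.0
```

Read this way, a burn rate of 10 on a 30-day SLO means a full budget would be spent in three days, which is the kind of signal a release policy can act on.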
Option B relies on metrics from Anthos Service Mesh, which can be helpful for monitoring, but it lacks the explicit focus on SLOs, uptime checks, and Alert Policies for managing error budgets and protecting customers.
The correct answer is D. Create a SLO and configure uptime checks for your services. Enable Alert Policies if the services do not pass.