Product Team Performance

Reliability and Speed (Observability SLOs)

Availability

New development 0%

Target:

99.9%

Actual: Nov'25

TBA%

Objective: The API gateway is up 99.9% of the time each month, which translates to at most roughly 44 minutes of unplanned downtime per month (0.1% of a 31-day month).

Method: Uptime is measured via synthetic checks run every minute from multiple regions.
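The availability math above can be sketched as a small helper; this is illustrative only, and the per-minute check results (`checks`) are an assumed input shape, not a real monitoring API.

```python
# Minimal sketch: derive availability from per-minute synthetic checks,
# and the downtime allowance implied by the SLO. Names are illustrative.

def availability(checks):
    """Uptime as a fraction: passed checks / total checks."""
    return sum(checks) / len(checks)

def allowed_downtime_minutes(slo=0.999, days_in_month=31):
    """Unplanned downtime permitted by the SLO, in minutes."""
    return (1 - slo) * days_in_month * 24 * 60

# A 31-day month at a 99.9% SLO allows about 44.6 minutes of downtime.
```

With one check per minute, a single failed check in a 1,000-check window already reads as exactly 99.9% availability, which is why multi-region checks matter for filtering out false negatives.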

Speed P95 Latency

High speed 100%

Target:

250 ms

Actual: Nov'25

2 mins

Objective: 95% of requests finish in under 250 ms (P95 latency), meaning almost all user interactions feel instant and only the slowest 5% may take longer.

Method: Latency and errors are measured from request traces and logs at the gateway (and per domain service).
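For reference, P95 over a batch of trace durations can be sketched with the nearest-rank method; the input list of millisecond latencies is an assumed shape, not the team's actual tracing pipeline.

```python
# Minimal sketch: nearest-rank P95 over request latencies in milliseconds.
import math

def p95(latencies_ms):
    """Value that 95% of requests fall at or under (nearest-rank method)."""
    ordered = sorted(latencies_ms)
    rank = math.ceil(0.95 * len(ordered)) - 1  # 0-indexed nearest rank
    return ordered[rank]
```

Nearest-rank is the simplest percentile definition; real tracing backends often interpolate or use sketches (e.g. histograms), so numbers may differ slightly from this.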

Error Budget

Over budget 100%

Target:

39 mins

Actual: Nov'25

300 mins

Objective: We “spend” at most 0.1% of the month on errors and outages. If reliability slips past this allowance, we pause new launches and fix stability first. The goal is to build customer and team trust, reduce churn, and avoid support escalations during demos or new feature releases.

Method: Recorded manually at first.
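Since recording is manual for now, the budget check reduces to simple arithmetic; this sketch assumes a 31-day month and that `downtime_minutes` comes from the incident log.

```python
# Minimal sketch: error-budget burn for a 99.9% SLO. Manual input for now.

def error_budget_minutes(slo=0.999, days_in_month=31):
    """Total monthly budget of downtime minutes implied by the SLO."""
    return (1 - slo) * days_in_month * 24 * 60

def budget_burn(downtime_minutes, slo=0.999, days_in_month=31):
    """Fraction of the budget consumed; values over 1.0 trigger a launch pause."""
    return downtime_minutes / error_budget_minutes(slo, days_in_month)
```

At the November actual of 300 minutes, burn is several multiples of the budget, which is what the "pause launches, fix stability" rule is meant to catch.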

Cost Guardrails (Cost KPIs)

Cloud spend vs. budget

New development 0%

Target:

TBA

Actual: Nov'25

TBA

Objective: Track total monthly cloud spend against budget (by environment and by product/brand); the simple measure is staying within 100% of budget.
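The per-environment breakdown could be sketched as a ratio table; the environment names and dict inputs here are illustrative assumptions, not a real billing API.

```python
# Minimal sketch: spend as a fraction of budget, broken down by environment.

def spend_vs_budget(spend_by_env, budget_by_env):
    """Return {env: spend / budget}; values over 1.0 exceed 100% of budget."""
    return {env: spend_by_env.get(env, 0.0) / budget
            for env, budget in budget_by_env.items()}
```

The same shape works for a product/brand breakdown by swapping the dict keys.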

Monthly Spend

Overall spend 10%

Target:

$18,300

Actual: Dec'25

TBA

Objective: Keep team costs within budget.