Description:
The Role: Ensure our services are properly instrumented, resilient against outages, responsive to appropriate mediations, and monitored at various scopes to track uptime and stability. Partner with the engineering, product, and data teams to ensure our systems are appropriately provisioned to meet the growing needs of a rapidly expanding company. Define, measure, and meet SLAs/SLOs focusing on availability, performance, incidents, and chronic quality issues. Gain deeper insights into application performance …