Site Reliability Engineer (R98Y9R)
South Geeks engages top-performing software developers from Latin America to join our clients’ teams worldwide. We build amazing products and sustain long-term relationships with our counterparts. We pride ourselves on being a socially responsible company. The results show in the performance of our teams and the bond we hold with each of our clients.
About The Company
Our customer is one of the leading communication platforms in education. It helps educators reach students and parents where they are: on their phones. With over 30 million active users, it is one of the fastest-growing companies in education technology, but it has its sights set on something bigger: giving every student the opportunity to succeed.
About This Role
The Engineering Team collaborates to deliver features for users and customers while setting and maintaining SLAs to ensure reliable system performance.
"We prefer strongly typed languages over dynamic ones for critical business systems, and leverage both relational and non-relational data structures as needed, supporting tens of thousands of requests per second. We bias towards using the right tool for the job, including TypeScript, Python, Go, Ruby, Twirp, GraphQL, and many AWS services (Aurora, Lambda, DynamoDB, SQS, Kinesis)."
As an SRE, you'll collaborate with product engineering and cross-functional teams to maximize site availability, performance, and uptime, and build systems and features that enable engineers to ship more quickly and confidently.
- You have consistently shipped high-quality code to production as part of a team
- You collaborate effectively with engineers and product managers to build systems to increase the leverage of product engineering teams and improve the security, stability, and efficiency of production systems
- You write clean code and have significant experience with one or more programming languages
- You understand the value of an appropriately defined SLA/SLO for both internal and external systems and services and have experience building highly available systems and services which scale and perform in accordance with such an SLA/SLO
- Others enjoy working with you because of your positive attitude and technical competence
What You'll Do
- Increase the overall availability and performance of the customer's distributed services
- Support uptime through participation in the customer's eng-wide on-call rotation
- Help establish, conform to, and audit our SLAs/SLOs so that the performance of the website exceeds the expectations of students, parents, and educators in even the largest and most demanding school districts
- Improve our deployment process to make it as fast and predictable as possible
- Debug production issues across services and levels of the stack alongside product engineering teams
- Bring an ops perspective to engineering, and an engineering perspective to ops
- Partner with product engineering teams to ensure the security, stability, performance, and cost-efficiency of Remind’s services
- Ensure infrastructure priorities are reflected in our engineering roadmap
- Maintain open-source infrastructure projects