DevOps Team Lead, Full Remote - X959VW9R
The Team
The run team is responsible for all operational aspects of the managed services platform and runs what the engineering team builds. The run team is also expected to contribute to improvements of the platform, especially where this solves operational issues, fixes monitoring, and so on.
Day to day:
- Resolving incidents
- Deploying new versions of software
- Database upgrades
- Improving monitoring dashboards, alerting, etc.
- Daily, weekly and monthly checklists
- Onboarding new customers
- Disaster recovery testing
- High availability testing
- Cost optimisations
Observability:
- Implement and Own the observability platform, in collaboration with the engineering team.
- Create & contribute to monitoring dashboards
- Create & contribute to monitoring capability
Additionally:
- Contribute to improvements of the platform
- Enhance or fix automation
Tech Stack: AWS, Kubernetes, Oracle & PostgreSQL, Python, Java Applications
Each team will consist of:
- 1 x Team Lead
- 2 x DevOps/SRE (1 mid, 1 senior)
- 2 x Database Engineers (1 mid, 1 senior)
The Person
- The team lead will be responsible for owning and implementing the operating model for run, working closely with the Head of Platform and the Engineering Team Lead.
- The person should be technically very strong and also a strong leader, able to shape the strategy, processes and operating model for running our platform.
- Lead a team of 4 people (5 including the team lead).
- The person should normally be 50% hands-on and 50% focused on leading; this split might change depending on the period.
- This person should lead both technically and on people and process.
Roles & Responsibilities
- Design, contribute to, implement and own operational processes.
- Assign tasks to engineers as required
- Line management of the team, including hiring, performance reviews, etc.
- Work with the Engineering Team Lead to improve the platform
- Be part of an on-call group (team lead, escalation level)
- Work with customers as required to resolve their issues
- Collaborate with internal development teams as required
Experience/Skills
Must Have:
- Extensive experience leading operational teams
- Extensive understanding of AWS Cloud
- Extensive experience with IaC (Terraform)
- Extensive experience with Kubernetes
- Extensive experience with Git version control
- Good level of scripting experience (ideally Python)
- Good understanding of databases, including Oracle and PostgreSQL
Nice to Have:
- Karpenter
- FluxCD
- Ideally, experience with Liquibase