G23 - DevOps Engineer
About the job
We are seeking a skilled and passionate DevOps Engineer to design, build, and operate a secure, scalable, and developer-friendly platform across GitLab, AWS, Kubernetes (EKS), and Terraform.
This role requires strong software engineering fundamentals, systems thinking, and a passion for automation and platform excellence.
Responsibilities
Platform Design, Build & Operations
- Architect, implement, and operate platform components across GitLab, AWS, Kubernetes (EKS), and Terraform-backed infrastructure.
- Build reliable, scalable, and secure services that power the CStack and Airbase platform.
- Continuously improve platform performance, cost efficiency, and operational robustness.
Developer Experience & Productivity
- Design and build workflows, tooling, and paved paths that enable tenants to deploy and operate applications quickly, safely, and consistently.
- Champion automation and self-service capabilities to minimise friction for application teams.
Automation & Toil Elimination
- Identify operational bottlenecks and repetitive tasks; build automation via CI/CD, Infrastructure-as-Code, and controllers/operators.
- Partner with other engineers in the team to enable highly automated, low-touch operations.
Observability & System Health
- Implement and maintain comprehensive observability (logs, metrics, traces, alerts) aligned with the Four Golden Signals.
- Build automation for proactive health checks, anomaly detection, and self-healing where practical.
Production Support & Incident Management
- Participate in on-call rotations; respond to incidents to minimise MTTR.
- Lead or contribute to post-incident reviews, drive actionable follow-ups, and improve platform resilience.
Security & Compliance
- Embed security into every layer of the platform through secure defaults, policy enforcement, vulnerability scanning, and AWS/K8s best practices.
- Collaborate with security teams to meet regulatory and compliance requirements.
Performance, Optimisation & Reliability
- Diagnose and resolve performance bottlenecks across the stack (AWS, Kubernetes, workloads, networking).
- Define, measure, and improve KPIs such as MTTR, error rates, SLO compliance, and cost efficiency.
Strategic Tenant Engagement
- Act as a technical advisor for tenants on containerisation, CI/CD, and cloud-native deployment best practices.
- Participate in architecture reviews and guide tenants toward stable, secure, and scalable patterns.
Knowledge Sharing & Documentation
- Maintain high-quality documentation, runbooks, playbooks, and architecture diagrams.
- Ensure operational and onboarding processes are well-understood across the team.
Continuous Learning & Innovation
- Stay current with the latest in cloud-native, Kubernetes, GitLab, and AWS ecosystems.
- Propose new approaches, technologies, and improvements that enhance platform capabilities.
Requirements
- Degree or diploma in Computer Science or Engineering, or relevant experience.
- Proven experience as a DevOps / Platform / SRE engineer working in cloud-native environments.
- Strong understanding of AWS, Kubernetes (especially EKS), container orchestration, and cloud operations.
- Proficiency with IaC tools (Terraform preferred) and cloud integrations (e.g., AWS Load Balancers, Secrets Manager).
- Ability to troubleshoot complex issues across distributed systems, containers, and underlying infrastructure.
- Hands-on experience with Kubernetes tooling: Helm, Kustomize, kubectl, and ecosystem APIs.
- Experience managing databases (e.g., PostgreSQL or cloud-managed equivalents), including backup/restore processes, snapshot automation, recovery testing, and data durability.
- Good software engineering fundamentals, with the ability to write production-grade code in Go, TypeScript, or Python.
- Understanding of how web applications work (Go, Python, React/Next.js).
- Familiarity with automated testing (unit, integration, end-to-end) for tools and automation you build.
- Experience with GitOps and CI/CD tools (GitLab CI/CD, Argo CD).
- Experience with observability systems (Prometheus, Grafana, ELK/Elastic Stack) and SLO design.
- Strong understanding of networking, security, and storage in both AWS and Kubernetes contexts.
Bonus Skills
- Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD) certification.
- Experience building Kubernetes operators/controllers in Go.
- Experience with service mesh technologies.
- Experience with chaos engineering or large-scale reliability testing.