DevOps Engineer

About the job

Responsibilities

  • Design, implement, and support scalable, reliable infrastructure to power production and development environments.
  • Manage and enhance our container orchestration systems, with a focus on Kubernetes (EKS), alongside other critical AWS services such as EC2, ALB, IAM, and VPC networking.
  • Build and maintain automation for application and infrastructure deployment, scaling, and lifecycle management.
  • Partner with software engineering teams to improve build, release, and deployment processes across CI/CD pipelines.
  • Monitor and improve system availability, latency, and performance across the full stack, from cloud infrastructure to backend services.
  • Develop internal tools and scripts to enhance operational efficiency, resilience, and security.
  • Play a key role in incident response efforts, including root cause analysis and long-term remediation.
  • Participate in architecture reviews and help guide decisions on infrastructure design, resilience, and observability.
  • Stay informed on industry trends in reliability engineering, cloud-native tooling, and DevOps practices, and integrate improvements into our operational playbook.
  • Champion security, scalability, and cost-efficiency in all infrastructure decisions.

Requirements

  • 5+ years of experience in a DevOps, SRE, or infrastructure engineering role supporting production systems at scale.
  • Hands-on experience managing containerized applications using Kubernetes, preferably AWS EKS, with an understanding of the broader infrastructure ecosystem.
  • Strong knowledge of AWS services and how they integrate to support modern cloud architectures.
  • Proficiency with Infrastructure as Code (IaC) tools such as Terraform, as well as configuration management tools.
  • Experience designing and supporting CI/CD pipelines (e.g., Jenkins, GitHub Actions, ArgoCD).
  • Scripting or programming skills in Python, Go, or similar languages, used for automation and tooling.
  • Deep understanding of systems observability, including logging, metrics, and tracing (e.g., Prometheus, Grafana, CloudWatch).
  • Ability to diagnose and troubleshoot complex issues across distributed systems, including performance bottlenecks and availability challenges.
  • Familiarity with security best practices for cloud and containerized environments.
  • Clear and proactive communicator, comfortable working cross-functionally in a fast-paced environment.

Work setup: Remote

Shift: Night Shift

By applying, you consent to the collection, storage, and/or processing of your personal and/or sensitive information for the purposes of recruitment and employment, whether within Cobden & Carter International and/or for its clients.