Sr. DevOps Engineer
About the Role
We are looking for a Senior DevOps Engineer to help scale the cloud architecture, deployment systems, and operational reliability behind AI-driven platforms and computer vision infrastructure. This role is ideal for someone with strong cloud infrastructure and automation experience who can operate across distributed systems, edge-server environments, and scalable production platforms. The ideal candidate is ownership-oriented, pragmatic, and comfortable working in fast-moving environments with evolving operational needs. This position requires strong collaboration with Backend, Frontend, Product, and Computer Vision teams to ensure stable, scalable, secure, and reliable systems.
Key Responsibilities
Design, manage, and improve scalable cloud infrastructure.
Ensure production systems are reliable, secure, highly available, and operationally visible.
Support infrastructure powering web applications, backend systems, analytics platforms, and AI-driven systems.
Build and maintain Infrastructure as Code using Terraform.
Create reusable, modular infrastructure components and automate provisioning workflows.
Build and improve CI/CD pipelines, deployment systems, rollback strategies, and release workflows.
Implement monitoring, observability, and alerting systems.
Investigate and resolve production incidents and infrastructure issues.
Participate in root cause analysis and long-term operational improvements.
Support distributed systems, edge-server integrations, and scalable backend services.
Collaborate with teams working on Python-based computer vision pipelines.
Improve infrastructure security, access controls, scalability, performance, and cost efficiency.
Must-have Requirements
5+ years of DevOps or Cloud Infrastructure experience.
Strong experience with Google Cloud Platform (GCP) infrastructure.
Strong hands-on experience with Terraform and Infrastructure as Code.
Experience supporting scalable production systems and distributed architectures.
Experience building and maintaining CI/CD pipelines.
Deep understanding of cloud architecture and distributed systems.
Track record of designing modular, reusable Terraform components.
Experience with monitoring, observability, and incident management.
Strong troubleshooting and operational debugging skills.
Experience managing production deployments and release workflows.
Understanding of networking, scalability, and infrastructure security best practices.
Ability to automate operational processes and improve platform reliability.
Experience working in fast-paced product or startup environments.
Strong communication skills in cross-functional environments.
Ability to work autonomously in remote environments.
Ownership mentality and proactive operational mindset.
Fluent English.
Nice-to-have Requirements
Experience with AI-driven or computer vision platforms.
Experience supporting edge or serverless architectures.
Familiarity with Python-based systems.
Experience with real-time systems and analytics platforms.
Experience with Kubernetes or container orchestration systems.
Experience with Docker.
Experience with GitHub Actions or similar CI/CD tooling.
Experience with monitoring and observability tools such as Datadog, CloudWatch, or Grafana.
Familiarity with Linux-based systems administration.
Familiarity with TypeScript, Node.js, PostgreSQL, or Next.js-based platforms, including Node.js and TypeScript-based edge-server integrations.