MLOps / Cloud Engineer
About the job
We are looking for an experienced MLOps / Cloud Engineer with a strong background in building and operating cloud-based AI/ML platforms in production environments. The role focuses on designing scalable infrastructure, enabling end-to-end ML workflows, and supporting modern GenAI/LLM solutions.
Start Date: ASAP
Location: Remote (EU-based)
Language: English
Contract Type: B2B
Responsibilities:
- Design, build, and operate cloud-based AI/ML platforms in production environments
- Develop and maintain scalable MLOps pipelines for end-to-end ML workflows
- Implement and optimize CI/CD pipelines for ML and software delivery (e.g., GitHub Actions)
- Manage and provision infrastructure using Infrastructure as Code (Terraform)
- Deploy, manage, and optimize containerized applications using Docker and Kubernetes (EKS)
- Work with AWS and Azure services, including ML services (e.g., SageMaker, Bedrock)
- Implement monitoring, logging, and alerting solutions (Prometheus, Grafana, Loki, ELK)
- Ensure security best practices across cloud infrastructure and CI/CD pipelines
- Support model lifecycle management, including model registry, performance monitoring, and data quality tracking
- Collaborate with cross-functional teams to deliver robust and scalable AI/ML solutions
- Analyze existing codebases and suggest improvements and refactoring where needed
Requirements:
- Hands-on experience with AWS and/or Azure cloud platforms
- Proven experience with Kubernetes and Docker in production environments
- Strong knowledge of Terraform (Infrastructure as Code)
- Experience with CI/CD pipelines (e.g., GitHub Actions)
- Proficiency in Python and a solid understanding of software engineering principles and architecture
- Experience with LLM / GenAI solutions and ML platforms (e.g., SageMaker, Bedrock)
- Strong understanding of ML concepts and algorithms, with practical implementation experience
- Experience with MLOps tooling and architecture (e.g., Kubeflow, model registry, monitoring)
- Knowledge of monitoring and logging tools (Prometheus, Grafana, Loki, ELK)
- Understanding of security best practices in cloud and DevOps environments
Nice to Have:
- Experience with enterprise-scale projects and environments
- Familiarity with advanced Kubernetes features (e.g., operators)
- Experience with performance optimization of Docker images
- Exposure to observability tools such as Dynatrace