MLOps Engineer / AI Infrastructure Specialist OC-29
Job Description:
Technologies: Kubernetes, AWS SageMaker, MLflow
Our partner is looking for talented professionals ready for the next step in their careers. This role offers a collaborative environment with meaningful challenges and rewarding growth opportunities.
As an MLOps Engineer / AI Infrastructure Specialist, you'll support multiple projects, collaborate with cross-functional teams, and communicate progress transparently. Ideal candidates enjoy solving complex problems, helping teams succeed, and pushing themselves to deliver high-impact infrastructure.
Job Summary
Join an advanced AI/ML team where you'll architect, automate, and scale machine learning infrastructure. This position is perfect for someone passionate about MLOps, DevOps, and production-grade AI systems.
Responsibilities
- Design, implement, and maintain scalable MLOps pipelines for training, evaluation, and deployment (see the sketch after this list).
- Automate workflows using CI/CD tools (GitLab, Jenkins, GitHub Actions).
- Manage containerized environments with Docker and orchestrate deployments via Kubernetes.
- Partner with data scientists and engineers to streamline experimentation and productionize ML models.
- Deploy, monitor, and manage models on cloud ML platforms (AWS SageMaker, Azure ML, Vertex AI).
- Ensure reliability, monitoring, versioning, and automated rollback of ML systems.
- Work reliably and punctually in a remote, distributed team environment.
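For illustration, here is a minimal sketch of the experiment-tracking and model-registration step such a pipeline typically automates, using MLflow (listed under Technologies). The tracking URI, experiment name, model, and metric values are hypothetical placeholders, not part of the actual stack.

    import mlflow
    import mlflow.sklearn
    from sklearn.linear_model import LogisticRegression

    mlflow.set_tracking_uri("http://mlflow.example.internal:5000")  # placeholder tracking server
    mlflow.set_experiment("demo-churn-model")                       # placeholder experiment name

    with mlflow.start_run() as run:
        # Stand-in for a real training job; a production pipeline would train on versioned data.
        model = LogisticRegression().fit([[0.0], [1.0]], [0, 1])
        mlflow.log_param("C", model.C)
        mlflow.log_metric("val_auc", 0.91)  # placeholder evaluation metric
        mlflow.sklearn.log_model(model, artifact_path="model")

    # Registering the run's model makes it visible to downstream deployment and rollback jobs.
    mlflow.register_model(f"runs:/{run.info.run_id}/model", "demo-churn-model")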
Requirements
- English proficiency B2+ (written and spoken).
- 8+ years of experience as an MLOps Engineer / AI Infrastructure Specialist.
- Strong punctuality and reliability for meetings.
- Proficient in Python, with experience deploying models using TensorFlow and/or PyTorch.
- Hands-on experience with Docker and Kubernetes.
- Strong background in CI/CD pipeline implementation.
- Proven experience with cloud ML platforms (AWS SageMaker, Azure ML, or Vertex AI).
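As a concrete example of the cloud ML platform experience above, the sketch below deploys an already-trained PyTorch model artifact to a real-time endpoint with the AWS SageMaker Python SDK. The S3 path, IAM role, endpoint name, and instance type are hypothetical placeholders.

    import sagemaker
    from sagemaker.pytorch import PyTorchModel

    session = sagemaker.Session()

    # All identifiers below are placeholders; real values would come from the training pipeline.
    model = PyTorchModel(
        model_data="s3://example-bucket/models/churn/model.tar.gz",
        role="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
        entry_point="inference.py",          # custom inference handler shipped with the model
        framework_version="2.1",
        py_version="py310",
        sagemaker_session=session,
    )

    # Creates a managed real-time HTTPS endpoint serving the model.
    predictor = model.deploy(
        initial_instance_count=1,
        instance_type="ml.m5.large",
        endpoint_name="churn-model-endpoint",
    )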
Nice to Have
- Experience with workflow orchestration tools (Kubeflow, Airflow) or platforms like Databricks.
- Familiarity with monitoring and IaC tools (Prometheus, Grafana, Terraform).
- Experience with data versioning tools such as DVC or LakeFS.
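To illustrate the data-versioning item, here is a short sketch of how a pipeline can read a pinned dataset revision through DVC's Python API. The repository URL, file path, and revision tag are hypothetical placeholders.

    import dvc.api

    # Stream a specific, tagged version of a dataset from a DVC-tracked repository.
    with dvc.api.open(
        "data/train.csv",
        repo="https://github.com/example-org/example-data-registry",  # placeholder repo
        rev="v1.2.0",                                                  # placeholder data tag
        mode="r",
    ) as f:
        print(f.readline())  # e.g. inspect the header row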
Position Details
- Type: Full-time consultancy
- Hours: Up to 40 hrs/week
- Location: 100% remote (LATAM)
- Schedule: Flexible core hours
Required Skills:
Grafana, PyTorch, TensorFlow, Pipelines, CI/CD, Azure, GitLab, DevOps, Reliability, AWS, Machine Learning, Infrastructure, Kubernetes, Jenkins, GitHub, Docker, Design, Python, English, Training