ML/MLOps Engineer (Mid-Senior)
Function: Data and AI Delivery
Reports to: Head of Data and AI Delivery
Type: Contract (12 months with possibility of extension)
Location: Cape Town, Northern Suburbs (Hybrid)
COMPANY:
Vito Solutions is a Data and AI Intelligence firm founded in 2013, with offices in Cape Town and New York. We build production-grade Data and AI systems for clients across South Africa, the UK, Europe, and the USA. Our team works on revenue-generating use cases including fraud detection, churn modelling, real-time analytics, AI agents, and full data platform builds. Data and AI Intelligence is what we do, and it is how we deliver on every client engagement.
THE ROLE:
We are hiring a Senior ML/MLOps Engineer to take technical ownership of the ML infrastructure underpinning our client engagements. This is a platform and systems role, not a research role. We want a software engineer who has chosen to specialise in ML systems, not a data scientist who has drifted into infrastructure.
You will be the person clients depend on to keep their ML platforms stable, scalable, and cost-efficient in production. You will set the engineering bar for how Vito delivers ML systems, and you will lift the delivery teams around you to hit that bar.
WHAT YOU WILL DO:
- Own the architecture of ML platforms on client engagements, including API design, deployment topology, and cloud infrastructure on a major hyperscaler (GCP)
- Automate provisioning and environment management using Infrastructure as Code (Terraform), and ship code through modern CI/CD pipelines
- Design and build internal frameworks and tooling that let client data scientists and engineers move models into production safely and quickly
- Hold the line on production reliability, security, and scale across every ML system Vito delivers
- Run architectural reviews on client work, enforcing clean code, SOLID principles, and pragmatic engineering trade-offs
- Identify and remove cost waste across ML, AI, and data workloads, and report savings back to the client as a measurable outcome
WHAT YOU NEED (MUST HAVE):
- Bachelor's degree in Computer Science, Software Engineering, or a closely related field
- At least 4 years in production software engineering, MLOps, or platform engineering
- Strong hands-on architecture experience on at least one major cloud (GCP, AWS, or Azure), specifically running ML workloads in production
- Production track record on a managed ML platform (Vertex AI, SageMaker, Bedrock, Azure ML, or Databricks)
- Practical experience with managed container or serverless compute services (Cloud Run, AWS Fargate, ECS, Azure Container Apps, or equivalent)
- Working knowledge of a cloud data warehouse (BigQuery, Snowflake, Redshift, Databricks SQL, MS Fabric, or Azure Synapse)
- Solid CI/CD experience with one or more of GitLab CI, GitHub Actions, Jenkins, Azure DevOps, CircleCI, or Harness
- Strong Infrastructure as Code experience with Terraform, Pulumi, CloudFormation, or Bicep
- Advanced Python for backend and platform work, plus advanced SQL for data work
- Production experience with Docker and Kubernetes
- Track record of building and operating ML pipelines and observability tooling (Prometheus, Grafana, Datadog, OpenTelemetry, or similar)
BONUS POINTS FOR:
- Direct production experience on Google Cloud Platform
- Specific exposure to Vertex AI, Cloud Run, and BigQuery in production
- CI/CD work with GitLab and Harness
- Consulting or client-facing delivery background
- Domain exposure to retail, banking, insurance, or asset management data
WHAT WE LOOK FOR:
- Systems engineer by instinct. You think about ML in terms of contracts, interfaces, failure modes, and observability, not notebooks
- Mentor by default. You raise the level of the engineers around you, especially through documentation and shared tooling
- Calm under complexity. You can untangle messy production pipelines across unfamiliar cloud setups and explain what is broken in plain language
- Tool-agnostic. You have a preferred stack, but you can read a client's environment and adapt
Please note: If you have not heard from us within 2 weeks, please consider your application unsuccessful.