Infrastructure Architect

Job Description:

Purpose of the Role

Design, implement, and maintain secure, scalable, and cost-effective cloud infrastructure. This role ensures long-term cloud sustainability through FinOps, cost optimization, automation, and resilient architectures that support business growth, reliability, and operational efficiency.

Key Responsibilities

Design and implement scalable, secure, cost-efficient cloud infrastructure.
Lead cloud cost optimization using FinOps principles and long-term commitment discounts such as Reserved Instances and Savings Plans.
Architect cloud solutions for sustainability and economies of scale.
Configure and manage compute, networking, storage, and monitoring tools.
Automate provisioning, deployment, and maintenance using IaC.
Work closely with DevOps and Engineering to ensure performance and high availability.
Monitor infrastructure health, optimize resource usage, and resolve performance issues.
Implement strong cloud security, encryption, and compliance standards.
Evaluate and recommend new cloud services and technologies.

Minimum Requirements

Bachelor's degree in Computer Science, Information Technology, or a related field.
7+ years of experience in infrastructure engineering or similar roles.
3–5+ years of hands‑on experience designing and managing secure, scalable cloud environments (AWS, Azure, or GCP).
Strong understanding of cloud architecture, networking, security, and FinOps.
Experience with Infrastructure as Code (e.g., Terraform, CloudFormation, ARM/Bicep).
Relevant certifications are beneficial (e.g., AWS Solutions Architect, Azure Architect, FinOps Certified Practitioner).
Strong analytical, problem‑solving, communication, and collaboration skills.

Key Performance Measures

Cloud Cost Efficiency: Savings via right‑sizing, Reserved Instances/Savings Plans, and FinOps reporting.
Scalability & Elasticity: Ability to scale environments with minimal manual intervention.
Security & Compliance: Effective implementation of security controls and audit readiness.
System Uptime: Meeting or exceeding cloud uptime SLAs.
Incident Response (MTTR): Speed and effectiveness in detecting and resolving incidents.
Automation: Level of automation improving deployment velocity and reducing manual tasks.
Resource Utilization: Efficient CPU, memory, storage, and network usage.
Disaster Recovery Readiness: Achieving target RPO/RTO and successful DR test results.
360° Internal Feedback: Collaboration and stakeholder satisfaction across teams.

Work Location:

Bryanston
