M02 - Operations Support Engineer
Responsibilities
- Design, build, and maintain cloud infrastructure across development, staging, and production environments.
- Manage compute, storage, networking, containerisation, virtualisation, DNS, and monitoring platforms.
- Implement monitoring and observability solutions using tools such as AWS CloudWatch, Prometheus, Grafana, and ELK.
- Ensure compliance with security and government platform standards, including access control and system hardening.
- Automate infrastructure using Infrastructure-as-Code tools such as Terraform, Ansible, and AWS CloudFormation.
- Improve system reliability through SRE practices, including monitoring SLIs, SLOs, and error budgets.
- Manage AWS services (EC2, ECS, S3, RDS, Lambda, VPC, IAM) and container platforms (Docker/Kubernetes).
- Maintain networking services (TCP/IP, DNS, DHCP, VPN) and perform system patching and maintenance.
- Support backup, disaster recovery, and high availability solutions.
- Collaborate with application teams and maintain documentation, runbooks, and operational procedures.
Requirements
- Experience with virtualisation platforms (VMware vSphere, Hyper-V).
- Strong Linux and Windows Server administration skills.
- Experience with cloud platforms (AWS preferred).
- Familiarity with container technologies (Docker, Kubernetes, ECS).
- Knowledge of monitoring and observability tools (CloudWatch, Prometheus, Grafana).
- Experience with Infrastructure-as-Code tools (Terraform, Ansible, CloudFormation).
- Strong understanding of networking concepts and scripting (Python, PowerShell, Bash).
- Experience with CI/CD pipelines and Git-based workflows.
Qualifications
- Bachelor's degree in Computer Science, Information Technology, or a related field, with experience in infrastructure or platform operations.
Preferred Certifications
- AWS Certified Solutions Architect / SysOps Administrator
- VMware Certified Professional (VCP)
- Red Hat Certified Engineer (RHCE)
- Microsoft Windows Server certifications