I20 - Software Engineer - SRE (057)
About the job
Responsibilities
- Provide operations and project support as a Reliability Engineer, working closely with IT teams and stakeholders.
- Monitor system, application, and infrastructure health through observability and proactive monitoring practices.
- Install, configure, and manage monitoring tools and solutions.
- Implement, enhance, and integrate monitoring solutions to improve operational visibility and business processes.
- Analyse operational data and build dashboards and reports that surface operational insights.
- Automate day-to-day operational activities using tools such as Ansible, Jenkins, Shell Scripting, PowerShell, and Python.
- Support reliability engineering service requests and operational activities.
- Provide support for production incidents and project cutovers, including outside office hours when required.
- Participate in 24x7 operational support when necessary to ensure system availability and stability.
Requirements
- Minimum 3 years of experience with IT operations automation and monitoring solutions.
- At least 2 years of scripting experience, preferably with Ansible, Shell Scripting, Python, or PowerShell.
- Familiarity with platforms and technologies such as Windows, Linux, Unix, AWS Cloud, databases, and middleware.
- Experience with monitoring, observability, and automation tools.
- Strong communication and stakeholder management skills to work effectively with internal and external partners.
- A Site Reliability Engineering (SRE) certification or equivalent is preferred.