Job Openings
Site Reliability Engineer (DevOps/Linux)
About the job
Responsibilities:
- Ensure high availability and performance of systems
- Analyze performance metrics and resolve incidents (P0-P3)
- Participate in system design and set reliability goals
- Continuously optimize and innovate for better user experience
- Improve and maintain the full lifecycle of services, from development to deployment
- Provide observability, monitoring, and troubleshooting for distributed cloud systems
- Debug and automate tasks across OS, networking, databases, and applications
Requirements:
- Programming in Java, Python, or Go; scripting with Shell, Terraform, Ansible, Chef, or Puppet
- Strong understanding of Linux/Unix, containers, VMs, and cloud platforms
- Experience with DevOps processes and automation using SaltStack, Spinnaker, or StackStorm
- Experience with big data, chaos engineering, and auto-scaling container platforms
- Background in data science or cybersecurity (SIEM, threat modeling)
- Performance tuning for cloud networks, middleware, RDBMS, NoSQL, etc.
- Bachelor's or higher in Computer Science or Electronics & Communication
- Strong analytical and communication skills, with quick adaptability and problem-solving abilities
- Passion for continuous learning and staying updated with tech trends
Notes:
- Malaysia roles: 1-10 years of relevant experience
- India roles: 5+ years of relevant experience, WFH, EU shift