Hong Kong, Hong Kong SAR, Hong Kong
Lead SRE (Observability & Automation)
Job Description:
A typical day in this role:
- Design, implement, and own end-to-end observability solutions, using tools that ensure comprehensive system visibility, to improve reliability and architect highly resilient systems.
- Advocate for observability best practices across engineering teams and integrate monitoring into infrastructure and applications.
- Develop automation for infrastructure to reduce manual toil, ensure reliability, and optimize resource utilization through performance analysis, AI-driven anomaly detection, and dynamic adjustments.
- Mentor the observability team and foster a culture of continuous improvement and innovation.
- Work with technical partners to explore tools and features through PoCs, manage licenses, and conduct training sessions.
This job is a good fit for you if:
- You are a PROBLEM SOLVER. You make decisions based on evidence.
- You are a CHANGE CHAMPION. You love imagining what could be and don't hesitate to challenge the status quo. You are good at producing original ideas and are very comfortable with ambiguity.
- You are a COMMUNICATOR. You have an ability to pick up on people's underlying motivations, and these insights make you persuasive and inspiring.
- You are an EXPERT. You have in-depth knowledge of a key area and seek possible solutions through study and research.
Success will depend on:
- Solid working experience in SRE, DevOps, or systems architecture roles, with proven success in project deployment and rollout.
- Hands-on experience with observability tools (e.g., Dynatrace, Prometheus, Grafana, ELK Stack) and automation frameworks (e.g., Ansible, Jenkins).
- Scripting/programming skills for automation and tool development.
- Knowledge of AI/ML-driven observability for predictive analytics and anomaly detection.
- Problem-solving skills and a data-driven mindset, plus communication skills to bridge technical and non-technical stakeholders.
- Good command of spoken and written Cantonese and English.
Required Skills:
Automation