Helpdesk Support Engineer
About the job
Key Responsibilities:
Incident & Application Support
- Provide second-line (L2) support for production and staging systems, handling escalations from L1 Support.
- Investigate application errors, system alerts, performance degradation, and integration issues.
- Restore services within agreed SLA/OLA timelines and ensure proper incident closure.
Troubleshooting & Root Cause Analysis
- Perform in-depth troubleshooting using logs, metrics, and monitoring tools.
- Conduct root cause analysis (RCA) for recurring or high-impact incidents.
- Propose and implement corrective and preventive actions to reduce incident recurrence.
Collaboration & Escalation
- Work closely with L3 engineers, DevOps, and vendors to resolve complex technical issues.
- Provide clear technical findings, logs, and evidence when escalating issues.
- Participate in incident bridges, post-incident reviews, and operational discussions.
Operational Excellence
- Monitor system health, alerts, dashboards, and logs to proactively identify issues.
- Execute approved configuration changes, patches, and operational fixes.
- Support deployment, release, and maintenance activities when required.
Automation & Continuous Improvement
- Contribute to automation of operational tasks, monitoring, and alerting where applicable.
- Identify gaps in runbooks, SOPs, and operational processes and drive improvements.
Documentation
- Maintain and update runbooks, troubleshooting guides, and knowledge base articles.
- Document incident resolutions and operational procedures clearly and accurately.
Security & Compliance
- Adhere to security, access control, and compliance requirements.
- Handle sensitive information in logs, tickets, and systems appropriately.
- Support audits, vulnerability remediation, and compliance checks when required.
Key Experiences and Qualifications:
Educational Background:
- Diploma or higher in Computer Science, Information Technology, or a related field.
Professional Experience:
- 3–5+ years of relevant experience in application support, systems support, or operations roles.
- Experience supporting production systems in a high-availability or mission-critical environment.
Technical Expertise:
- Strong hands-on experience with application log analysis and monitoring tools (e.g. AWS CloudWatch, Grafana, ELK, Google Analytics) and with Linux/Unix environments.
- Working knowledge of cloud platforms (e.g. AWS services such as ECS, Lambda, S3, RDS).
- Basic database knowledge (MySQL, PostgreSQL) for health checks and simple queries.
- Basic knowledge of REST APIs, system integrations, and authentication design.
- Understanding of incident, problem, and change management processes.
Problem-Solving Skills:
- Strong analytical and troubleshooting skills.
- Ability to break down complex incidents into clear, actionable steps.
- Calm and methodical approach when handling production issues under pressure.
Operational Practices:
- Familiarity with ticketing and incident management tools (e.g. Jira, PagerDuty).
- Experience working with runbooks, SOPs, and on-call support rotations (if applicable).
Additional Skills:
- Experience supporting cloud-native or microservices-based systems.
- Basic scripting skills (e.g. Bash, Python) for automation.
- Experience working in government, regulated, or large-scale enterprise environments.
- Knowledge of disaster recovery and business continuity planning.
Character Traits We Look Out For:
- Team player with a collaborative mindset.
- Strong sense of ownership and accountability for system reliability.
- Proactive in identifying and addressing operational issues.
- Willingness and ability to learn and adapt to new systems and tools.
- Openness to sharing knowledge and improving team capability.
- Clear verbal and written communication skills, including incident reporting.