Monitoring Platform Integrator
Job Description
Key responsibilities include:
· Help design, build, deploy and configure the new monitoring infrastructure that will enable us to work faster and smarter.
· Work with the tech leads of system migrations to ensure they correctly monitor their new platforms, and help create alerting rules and escalation paths.
· Ensure that the monitoring system is itself monitored and that redundant escalation paths exist to detect when parts of it have failed.
· Develop and maintain any codebase required to deliver solutions and customer-specific configuration.
· Ensure the platform is configured as automatically as possible, using technologies such as service discovery, Ansible and Git to reduce manual configuration.
· Help tech leads and system owners build dashboards in Grafana and other dashboarding tools.
· Work with our NOC teams and system owners to gather requirements for monitoring and alerting and ensure these critical functions are maintained during system transitions.
· Help transition custom monitoring scripts from Nagios to either the Prometheus or Icinga 2 platform.
· Integrate existing monitoring systems into the new design and help transition away from legacy systems as required.
Qualifications:
Basic degree or diploma in IT. Certifications from Microsoft and in Enterprise Linux, Cloud Foundations, AWS Cloud Practitioner or similar, plus DevOps-centered training and qualifications.
Experience
More than 5 years of experience in a systems administration role implementing, developing and maintaining enterprise-level platforms, preferably in the media industry.
In-depth knowledge of design and implementation in the following areas is crucial:
· Management of Docker and/or Kubernetes Platforms
· Docker container build processes
· Red Hat/Oracle Linux/CentOS system administration
· AWS Cloud toolsets
· Monitoring technologies: Prometheus, InfluxDB, Icinga 2, Nagios, SNMP, Grafana
· Logging technologies: Kibana, Elasticsearch, CloudWatch
· Orchestration management with a focus on one of the following: Ansible, CloudFormation, Terraform, Puppet, Chef
· Python development
· JSON and API integration
· NetBox
Knowledge of the following may be advantageous:
· Go