Site Reliability Engineer (RCV Team)

About the job

UpSkill is a recruitment agency that goes the extra mile to help candidates find the best possible job opportunity. Our team of experts is experienced in consulting and in providing long-term HR support.

We believe that being friendly is the best policy, which is why we are eager to help you through the whole recruitment lifecycle. Our team has 15 years of recruitment experience. At any given moment, we can offer multiple opportunities from different companies in need of a wide variety of talent.

If you are interested in starting a new job, we will present you with multiple opportunities, answer all your questions, help you prepare for interviews and tests, and provide essential feedback. We will guide and support you through the recruitment process, all the way up to the first day at your new job.

Our current client is a $2 billion global leader in cloud-based communications and collaboration software.

On their behalf, we are looking for a Site Reliability Engineer.

Responsibilities:

      • Manage geo-distributed cloud infrastructure on AWS and EKS, using IaC (Terraform) and GitOps (FluxCD) to ensure scalability;
      • Participate in an on-call rotation of 2 weeks on (12 hours per day, in primary/backup roles) followed by 3 weeks off, to ensure continuous production support and timely response to operational needs;
      • Participate in service capacity planning, software performance analysis, and system configuration;
      • Design, consult on, re-platform, and refactor the observability of the current cloud infrastructure (Prometheus, Grafana, VictoriaMetrics, centralized logging and alerting);
      • Participate in release management, working closely with development teams to implement GitOps principles in release processes and manage CI/CD pipelines using GitLab CI;
      • Conduct blameless post-mortems to learn from incidents and prevent their recurrence;
      • Develop and test disaster recovery plans and runbooks to ensure business continuity;
      • Implement security best practices and controls in the infrastructure to meet compliance standards and prepare for audits.

Requirements:

      • Cloud & Infrastructure: AWS production environments - read and write Terraform manifests, understand IaC principles;
      • Kubernetes: Manage Kubernetes clusters - troubleshoot pod failures, set resource limits, work with scaling, understand networking;
      • CI/CD: Create and maintain CI/CD pipelines (GitLab CI is preferable);
      • Observability: Manage monitoring stacks (Prometheus, Grafana) - write PromQL queries, create dashboards, configure effective alerts;
      • Troubleshooting: Debug performance issues in distributed systems - analyze network traces, read application logs for root cause analysis;
      • Performance: Identify and eliminate bottlenecks - interpret metrics, optimize resource allocation and costs;
      • Incident Management: Participate in incident response - quickly localize problems, coordinate with other teams through war rooms/incident channels, document event timelines.

The company offers:

      • Well-coordinated professional team.
      • Cutting-edge technologies, interesting and challenging tasks, a dynamic project, and great opportunities for self-realization and professional and career growth.
      • Additional Health and Life Insurance Package.
      • Employee Assistance Program.
      • 25 vacation days.
      • Digital Food Vouchers worth 102,26 EUR/200 BGN.
      • Working Expenses Allowance of 61,36 EUR/120 BGN gross, paid as part of the salary.

If you meet the above-mentioned criteria, don't hesitate to apply!

We welcome the opportunity to learn more about you!

Please send your CV in English.

Please note that only short-listed candidates will be contacted.

License 2826. We will treat your application with full confidentiality!