Infrastructure & Reliability Engineer (LatAm)

About the job

We are looking for an Infrastructure & Reliability Engineer with strong cloud and automation expertise to join our team. This role requires an experienced engineer (3+ years) who can maintain scalable infrastructure, ensure system reliability, and take ownership of production environments end to end.

Full-Time | Remote | 8am - 5pm EST

Key Responsibilities

Ensure platform reliability and system health:

Monitor production infrastructure and respond to incidents.
Set up monitoring dashboards, alerting systems, and reliability metrics.
Investigate production issues and perform root cause analysis.
Participate in incident response and post-mortem processes.

Own integration pipelines and data infrastructure:

Deploy and maintain ETL pipelines and data synchronization processes.
Monitor integration health and proactively detect failures or anomalies.
Troubleshoot data and integration issues across external systems.
Support scaling of data infrastructure as volumes and integrations grow.

Manage and scale cloud infrastructure:

Configure and manage cloud environments (AWS or equivalent).
Implement auto-scaling, load balancing, and high-availability solutions.
Perform database optimization, maintenance, and capacity planning.
Manage infrastructure as code using tools like Terraform or CloudFormation.

Build automation and internal tooling:

Develop scripts and tools to automate operational workflows (Python, Bash).
Maintain and improve CI/CD pipelines.
Implement deployment strategies (rolling, blue-green, canary).
Create internal tools that improve developer productivity.

Support security and compliance:

Configure access control, IAM policies, and network security.
Implement encryption and secure secrets management.
Contribute to compliance processes (e.g., SOC 2) and security audits.
Maintain documentation for infrastructure and security practices.

Key Requirements

3+ years of experience in infrastructure, DevOps, or reliability engineering.
Strong Linux system administration skills.
Experience with cloud platforms (AWS preferred).
Hands-on experience with containers (Docker) and containerized environments.
Proficiency in scripting (Python and/or Bash).
Experience with CI/CD tools (GitHub Actions, CircleCI, or similar).
Solid understanding of monitoring, alerting, and incident response.
Experience with Infrastructure as Code (Terraform, CloudFormation).
Strong understanding of scalability, reliability, and performance optimization.
Ability to work in collaborative engineering environments.

Nice to Have

Experience in SaaS or fintech environments.
Knowledge of ETL pipelines and data integrations.
Experience with Kubernetes or container orchestration.
Familiarity with security compliance frameworks (SOC 2).
Experience with database optimization and performance tuning.
Background in building internal tools or developer platforms.

You Will Thrive Here If You

Take full ownership of infrastructure and production systems.
Focus on reliability, not just delivery.
Automate repetitive work and improve processes continuously.
Stay calm under pressure and handle incidents effectively.
Communicate clearly with both technical and non-technical stakeholders.

Why Join Us

High impact: your work directly affects platform reliability and customer trust.
Ownership: full responsibility over infrastructure and system performance.
Growth: opportunity to scale systems in a growing environment.
Autonomy: minimal bureaucracy, strong ownership culture.
Learning: work on complex systems with modern infrastructure practices.

Benefits

100% remote
Competitive salary in USD
International team and experience
