Infrastructure & Reliability Engineer (LatAm)
We are looking for an Infrastructure & Reliability Engineer with strong cloud and automation expertise to join our team. This role requires an experienced engineer (3+ years) who can maintain scalable infrastructure, ensure system reliability, and take ownership of production environments end to end.
Full-Time | Remote | 8am - 5pm EST
Key Responsibilities
Ensure platform reliability and system health:
- Monitor production infrastructure and respond to incidents.
- Set up monitoring dashboards, alerting systems, and reliability metrics.
- Investigate production issues and perform root cause analysis.
- Participate in incident response and post-mortem processes.
Own integration pipelines and data infrastructure:
- Deploy and maintain ETL pipelines and data synchronization processes.
- Monitor integration health and proactively detect failures or anomalies.
- Troubleshoot data and integration issues across external systems.
- Support scaling of data infrastructure as volumes and integrations grow.
Manage and scale cloud infrastructure:
- Configure and manage cloud environments (AWS or equivalent).
- Implement auto-scaling, load balancing, and high-availability solutions.
- Perform database optimization, maintenance, and capacity planning.
- Manage infrastructure as code using tools like Terraform or CloudFormation.
Build automation and internal tooling:
- Develop scripts and tools to automate operational workflows (Python, Bash).
- Maintain and improve CI/CD pipelines.
- Implement deployment strategies (rolling, blue-green, canary).
- Create internal tools that improve developer productivity.
Support security and compliance:
- Configure access control, IAM policies, and network security.
- Implement encryption and secure secrets management.
- Contribute to compliance processes (e.g., SOC 2) and security audits.
- Maintain documentation for infrastructure and security practices.
Key Requirements
- 3+ years of experience in infrastructure, DevOps, or reliability engineering.
- Strong Linux system administration skills.
- Experience with cloud platforms (AWS preferred).
- Hands-on experience with containers (Docker) and containerized environments.
- Proficiency in scripting (Python and/or Bash).
- Experience with CI/CD tools (GitHub Actions, CircleCI, or similar).
- Solid understanding of monitoring, alerting, and incident response.
- Experience with Infrastructure as Code (Terraform, CloudFormation).
- Strong understanding of scalability, reliability, and performance optimization.
- Ability to work in collaborative engineering environments.
Nice to Have
- Experience in SaaS or fintech environments.
- Knowledge of ETL pipelines and data integrations.
- Experience with Kubernetes or container orchestration.
- Familiarity with security compliance frameworks (SOC 2).
- Experience with database optimization and performance tuning.
- Background in building internal tools or developer platforms.
You Will Thrive Here If You
- Take full ownership of infrastructure and production systems.
- Focus on reliability, not just delivery.
- Automate repetitive work and improve processes continuously.
- Stay calm under pressure and handle incidents effectively.
- Communicate clearly with both technical and non-technical stakeholders.
Why Join Us
- High impact: your work directly affects platform reliability and customer trust.
- Ownership: full responsibility for infrastructure and system performance.
- Growth: opportunity to scale systems in a growing environment.
- Autonomy: minimal bureaucracy, strong ownership culture.
- Learning: work on complex systems with modern infrastructure practices.
Benefits
- 100% remote
- Competitive salary in USD
- International team and experience