About the job
Senior Reliability Engineer (BE focused)
About hireworks
hireworks is building a community of top talent in key international markets by unlocking unparalleled access to positions at leading U.S.-based companies. As your employer, hireworks will ensure you have a seamless interview, onboarding, and employee experience, providing ongoing support and resources along the way. Established in 2023, hireworks is forging corp-to-corp relationships with leading U.S.-based organizations looking to grow their teams with best-in-class talent around the world. Working with hireworks means unlocking access to a network of local peers and mentors, as well as career opportunities through our client network.
About our client
This technology firm specializes in enhancing client retention and fostering modern loyalty programs directly within the transaction process. It is built around a proprietary, branded digital savings solution that allows businesses to manage their own financial flows for sales and returns. The core platform enables companies to bypass conventional payment processors, strengthen customer rewards, and gain greater oversight of their money management. Optimize your operational efficiency and cultivate a more resilient, engaged customer following with this specialized system.
About the Position
We're looking for a senior backend engineer who builds reliability through elegant, production-ready code architecture. You'll have significant authority to rearchitect critical systems, replacing homegrown solutions with industry-standard tooling and patterns that handle 10k+ requests/second.
This is primarily a programming role focused on building robust, observable systems through code. You'll spend most of your time architecting and implementing reliability improvements, not managing infrastructure.
What makes this role unique:
Architectural Authority: Drive decisions on adopting technologies like Temporal.io for durable execution vs. maintaining custom retry logic
Production Scale: Design systems that handle high-throughput payment and loyalty processing with strict SLA requirements
Code-First Reliability: Improve system reliability by writing better application code, not just adding monitoring
Industry Standards Over NIH: Replace internal implementations with proven, production-ready solutions
What You’ll Do
System Architecture & Reliability Engineering
Rearchitect core reliability patterns: Replace custom retry mechanisms with durable execution engines like Temporal.io
Implement robust event processing: Migrate direct webhook handling to reliable delivery systems like Hookdeck with proper delivery semantics
Build behavioral monitoring: Integrate time-series databases to detect and alert on changes in system behavior
Eliminate technical debt: Systematically replace "not invented here" solutions with industry-standard, battle-tested alternatives
High-Scale Backend Development
Design and implement systems that maintain performance and reliability at 10k+ requests/second
Write production-grade code for payment processing, wallet operations, and loyalty program mechanics
Build comprehensive error handling, circuit breakers, and graceful degradation patterns
Implement distributed system patterns for fault tolerance and observability
Production Excellence
Instrument deep observability into application code using existing frameworks (Datadog)
Design monitoring that provides actionable insights into system behavior and business metrics
Build alerting that proactively identifies reliability issues before they impact users
Lead incident response with a focus on permanent architectural fixes rather than band-aid solutions
Technical Leadership
Evaluate and recommend new technologies and architectural patterns for production readiness
Collaborate with product engineering teams to embed reliability patterns into new feature development
Drive technical decisions around system architecture, scaling, and reliability patterns
Mentor engineers on production best practices and reliable system design
About You
Required
5+ years of backend engineering experience building high-throughput production systems (10k+ req/sec)
Strong programming skills in modern languages; our stack uses TypeScript, but we value polyglot engineers
Production architecture experience with distributed systems, microservices, and reliability patterns
Systems thinking: Ability to identify when to build vs. buy vs. adopt existing solutions
Cloud-native development with AWS services (ECS, RDS, ELB) and modern deployment patterns
Technical leadership: Experience making architectural decisions and driving technical improvements independently
Highly Valued
Experience with durable execution systems (Temporal.io, Step Functions, etc.)
Background in fintech, payments, or high-reliability systems
Knowledge of event-driven architectures and reliable message processing
Experience with time-series databases and behavioral analytics
Track record of replacing legacy systems with modern, scalable alternatives
Startup or high-growth experience where you've scaled systems through rapid growth
Tech Stack & Scale
Backend: TypeScript/Node.js, REST APIs, high-throughput transaction processing
Infrastructure: AWS (ECS, RDS, ELB), Cloudflare
Observability: Datadog (existing), custom instrumentation and analytics
Scale: 10k+ requests/second, real-time payment and loyalty processing
Architecture: Distributed microservices, event-driven systems
Types of Challenges You'd Tackle
Identifying and replacing fragile custom implementations with industry-standard solutions
Architecting reliable event processing where current approaches show brittleness
Building proactive monitoring for behavioral changes that current systems miss
Designing fault-tolerant patterns for high-throughput transaction processing
Evaluating and implementing durable execution patterns for complex workflow reliability
Creating robust delivery semantics for webhook and event-driven architectures
Benefits:
hireworks is cultivating a growing community of top talent across Colombia. In addition to unlocking access to positions at top-tier U.S.-based companies, we offer a variety of benefits to enhance your experience:
Competitive Pay: compensation that reflects your experience and accomplishments
Remote Flexibility: work from anywhere in Colombia
Paid Time Off: ample vacation days to rest and recharge
Public Holidays: all Colombian national public holidays are fully paid days off