Define and drive the vision for resilience engineering at Affirm, with a focus on production load testing and chaos engineering as first-class engineering practices.
Lead and mentor a team of engineers building platforms and tooling for safe production experimentation.
Partner with infrastructure, product, and security leadership to embed resilience validation into the software development lifecycle.
Establish best practices for safely testing system limits and failure scenarios in production.
Systems & Operations
Own the design and evolution of platforms that enable safe, controlled production load testing and fault injection.
Ensure strong safeguards are in place to protect real users, including isolation boundaries, approval workflows, and automated rollback mechanisms.