Shape the future of operational excellence as an SRE & Resiliency Engineer with Intact. This hybrid role focuses on deploying advanced tooling and enhancing production reliability across cloud environments.
As part of the Intelligent Operations Department, you'll manage high-severity incidents, drive continuous improvement, and enforce error budgets. Your expertise in observability will be crucial for implementing resilient systems and guiding teams towards effective incident management strategies.
Key Responsibilities:
• Lead investigations by collaborating with incident management teams
• Implement observability solutions to enhance system health
• Architect auto-healing solutions and reliability policies
• Craft detailed reliability reports for continuous performance improvement
• Coach incident teams to foster a culture of resilience
Requirements:
• 8+ years of SRE or Infrastructure Engineering experience