The SRE will focus on ensuring reliable, resilient systems through task automation, observability, incident response, and problem elimination, while also participating in production-side operations and on-call rotations.
KEY RESPONSIBILITIES
Deliver improvements that maximize system availability and performance by optimizing and automating operational tasks.
Collaborate on the development of operational tools, problem management, and architecture reviews.
Troubleshoot ServiceNow issues and, occasionally, on-premises capabilities in a Linux environment.
Explore and implement observability practices, including metrics, logging, tracing, and alerting, to measure product reliability.
Participate in an on-call rotation with global team members, ensuring responsiveness during agreed hours.
Contribute to documentation of ServiceNow instances and related dependencies.