Service & Infrastructure Management: Oversee and manage core platform web services, including API and database servers, to ensure optimal performance and health.
System Monitoring & Emergency Response: Proactively monitor application and infrastructure health using tools like Grafana, ELK, and Sentry.
Participate in a compensated 24/7 on-call rotation that is professionally managed, structured for fairness, and handled remotely (no need to be on-site). A senior engineer will back you up with immediate support, troubleshooting, and swift emergency resolution.
Automate recurring operational tasks, system deployments, backups, and maintenance procedures to improve efficiency.
Partner with the Software Development team to provide guidance and embed modern DevOps practices direct...