On-site
PUNE
India
3-6 months
Time and material
$18-20/hr
Description
RTH-Y
Job Title: Kubernetes Platform Engineer – SRE

- Provide L1-L3 support for Payments services deployed on IKP clusters.
- Monitor platform health using Datadog, Splunk, Prometheus, and Grafana; act on alerts to maintain SLOs and help reduce the overall number of alerts.
- Troubleshoot and resolve production incidents, ensuring timely communication and documentation.
- Participate in on-call rotations and provide support during planned upgrades or regulatory releases.
- Develop and implement automation scripts and self-healing solutions to reduce incident recurrence.
- Support CI/CD pipelines and enhance deployment reliability.
- Maintain up-to-date runbooks, incident retrospectives, and SOPs.

Required Skills
- Strong hands-on experience with Kubernetes (preferably GKE/Anthos).
- Experience with monitoring/observability tools: Datadog, Splunk, Prometheus, Grafana.
- Strong understanding of incident management processes, RCA, and SLO-based alerting.
- Ability to handle on-call duties and perform in high-pressure production environments.
- GCP certification is preferred.
- Experience working in regulated environments (e.g., banking or financial services) is preferred.

U2XJQJ
Skills:
Grafana, on-call duties, Datadog, Splunk, Prometheus, Kubernetes, incident management, CI/CD
