Website: Tata Consultancy Services (TCS)
Experience Range – 5 to 15 years
Job Location – PAN India
- Identify data elements of the application that have relation with service failure, performance, stability, and scalability issues.
- Design the SRE roadmap, onboard the SRE team, run the SRE pilot, and support the SRE run phase.
- Develop metrics, monitoring, and alerting to observe the health of the production system.
- Support data scientists in identifying the data elements required to predict system outages or service failures.
- Identify components of the application susceptible to service failure, performance, stability, and scalability issues.
- Identify automation opportunities and reliability issues
- Be proactive in anticipating production issues – outages, slowness, processing delays, errors, and failures
- Implement the SRE framework and assess tool effectiveness.
- Develop or implement visual tools for technical and business teams to observe system health.
- Implement SRE best practices.
- Facilitate discussion and agreement on error budgets (EB), SLIs, and SLOs for applications.
- Technical/Solution Architecture Design
- SRE Framework and Reliability Engineering
- Any one application and infrastructure monitoring tool (AppDynamics, Dynatrace, New Relic, etc.)
- Experience in AIOps implementation, application performance, feature engineering, chaos engineering, and reliability engineering.
- Strong understanding of the SRE framework, reliability engineering, Google SRE practices, and building an SLO/SLI matrix.
- Any RPA/BI/AI/ML Tools
- Any one container or orchestration platform (Docker, Swarm, Kubernetes, etc.)
- Communication, power messaging, and presentation skills
Qualification & Experience:
- Tool effectiveness assessment and enterprise architecture knowledge.
- Experience with enterprise and non-enterprise applications, cloud technologies, APM, BI tools, RPA tools, AI/ML, etc.
Company: Tata Consultancy Services (TCS)
Vacancy Type: Full Time
Job Location: Indore, Madhya Pradesh, IN
Application Deadline: N/A