CMS - Site Reliability Engineer - Senior Associate
PwC
- Bangalore, Karnataka
- Permanent
- Full-time
- Use feedback and reflection to develop self-awareness and personal strengths, and to address development areas.
- Delegate to others to provide stretch opportunities, coaching them to deliver results.
- Demonstrate critical thinking and the ability to bring order to unstructured problems.
- Use a broad range of tools and techniques to extract insights from current industry or sector trends.
- Review your work and that of others for quality, accuracy and relevance.
- Know how and when to use the tools available for a given situation, and be able to explain the reasons for the choice.
- Seek and embrace opportunities which give exposure to different situations, environments and perspectives.
- Use straightforward communication, in a structured way, when influencing and connecting with others.
- Read situations and modify behavior to build quality relationships.
- Uphold the firm's code of ethics and business conduct.
- Manage IT efforts in artificial intelligence and machine learning to simplify IT operations management and to accelerate and automate problem resolution in complex, modern IT environments
- System and process automation support for 24x7 monitoring of the IT environment, which provides critical business and engineering infrastructure and application support services
- Selection, integration, and operation of enterprise management tools and applications that provide actionable service event notification and communication, to ensure the right information reaches the right people at the right time
- Selection, integration, automation, and operational support of systems and services that enable IT to provide critical business application and infrastructure services to different clients
- Critical Incident Management and communication
- Collaborate and coordinate with all IT infrastructure and application groups to ensure all critical infrastructure components are monitored for availability, performance, and capacity, using available tools and techniques
- Ensure all incidents are addressed in a timely manner, using a sophisticated notification and escalation system and process
- Deploy and enhance system monitoring solutions that improve the performance and reliability of the managed infrastructure, both on premises and in different cloud environments
- Ensure the configuration and accuracy of the monitoring systems for the existing infrastructure (physical servers, virtual machines, network and hardware)
- Work with members of the infrastructure and development teams to gather requirements and design solutions to meet monitoring needs of new and existing products and services
- Develop key business dashboards based on monitoring data
- Implement proactive health checks and internal audits to ensure comprehensive and reliable monitoring coverage
- Collaborate with cross-functional stakeholders to deliver monitoring and proactive alerting as part of our standard service delivery
- Bachelor's degree in Computer Science, IT, or equivalent
- Experience with, and knowledge of, best practices for monitoring technologies
- 3 to 7 years of experience with monitoring tools for performance and availability monitoring, alerting, and resolution
- Ability to learn quickly and incorporate new knowledge in the rapidly evolving landscape
- Ability to work in a fast-paced, multi-site organization
- Experience with: Red Hat Linux, SNMP, Synthetic Transactions, Splunk scripting and analytics, Python and similar scripting languages.
- Certification in one or more related technologies is preferred.