Senior Site Reliability Engineer
Epsilon
- Dallas, TX
- Permanent
- Full-time
- Experience in developing and maintaining a deep technical understanding of systems, product lines and technologies.
- A successful candidate will be expected to further DevOps strategies and skillsets within the team in support of stakeholder application and infrastructure enablement.
- Exposure to working on large, complex technology initiatives of strategic importance to the organization, involving cross-functional teams.
- Must be able to integrate quickly into the team and work independently toward team goals.
- Proven track record in developing automation workflows against cloud services and cloud native environments.
- You have experience with observability, logging, metrics, and monitoring services, and with making the data usable and meaningful to improve operational health and efficiency.
- As part of the Epsilon Service Delivery Team, you will work at a pace that matches the fast-evolving demands of Fortune 500 clients across the globe.
- As part of an innovative team that's not afraid to take risks, you will see your ideas come to life in IT process automation that enhances and strengthens internal tooling.
- An open and transparent environment that values innovation and efficiency.
- Work closely with internal stakeholders to understand their requirements and deliver best-practice design and guidance in support of IT process automation that enhances and strengthens internal tooling, while evangelizing new use cases among existing internal customers and stakeholders.
- Fulfil the responsibilities of a DevOps and automation engineer working on cloud-native technologies.
- Research the collection, parsing, and analysis of infrastructure data from various devices or services while developing/enhancing tool/service automation in support of greater data accuracy and consistency.
- Evangelize and develop DevOps strategies among team members to further team and process evolution supporting hybrid cloud business objectives.
- Develop and deliver custom, complex configuration and training to mature internal customers who want to better leverage our internal IT tools and services.
- Work closely with Product Management and Engineering teams to improve Epsilon's experience with our internal IT tools and services.
- Work with the IT Service Management leadership, IT service owners, process owners and various service delivery groups to develop business intelligence solutions that satisfy the business needs of the Shared Technology Services organization and other departments seeking to use ITOM and ITSM tools and data.
- Partner with multiple cross-functional departments and teams including Support and Operations to help meet business outcomes.
- Automate and apply development principles to operational practices for overall SaaS service management.
- Can develop scripts or code leveraging REST APIs, preferably with Python or PowerShell
- Administrative experience with AWS, Azure, or Google Cloud services
- Experience with Infrastructure-as-Code such as Terraform or CloudFormation
- Experience with other automation tools such as Ansible, and with CI/CD tools
- Systems administrator experience with Linux or Windows operating systems
- Experience with event logs, metrics, and the services needed to collect & use that data
- Can assess, recommend, and optimize manual steps and replace with automation
- Can clearly document and articulate solutions, patterns, and processes
- Working knowledge of DevOps tooling such as Bitbucket, Jira, and Terraform is a plus
- Our Culture: https://www.epsilon.com/us/about-us/our-culture-epsilon
- Life at Epsilon: https://www.epsilon.com/us/about-us/epic-blog
- DE&I: https://www.epsilon.com/us/about-us/diversity-equity-inclusion
- CSR: https://www.epsilon.com/us/about-us/corporate-social-responsibility