
Technical Senior Manager of Site Reliability Engineering
- USA
- $94,000-$163,000 per year
- Permanent
- Full-time
- Allocate approximately 70% of time to hands-on engineering tasks, such as developing new deployments, tooling, and automation scripts to address client needs
- Dedicate around 30% of time to leadership duties, including mentoring junior engineers, ensuring quality deliverables, and managing escalations
- Act as the primary escalation contact for complex technical issues, resolving them promptly to maintain high levels of client satisfaction
- Monitor and uphold quality standards for engineering work, confirming alignment with internal protocols, compliance regulations, and project milestones
- Identify and mitigate risks in partnership with consulting and solutions architecture teams, ensuring regulatory requirements and client expectations are fully addressed
- Coordinate day-to-day engineering activities, tracking progress and adjusting resources to meet project goals on schedule
- Help create and implement solutions that improve the practice
- 9+ years in Systems Engineering and Architecture: Requirements definition, architecture development, systems integration, and testing
- 9+ years in Cloud Computing: Designing, implementing, operating, and automating environments within AWS, Azure, or GCP
- 9+ years with Infrastructure-as-Code: Hands-on proficiency in Terraform and Ansible for orchestration and automation
- SLA and Issue Management: Proven track record of meeting SLAs (particularly for availability, response times, and service posture) through effective collaboration and escalation processes
- Operational Excellence: Demonstrated success driving continuous improvement via KPIs and best practices for operational support
- Governance and Compliance: Experience guiding the creation of Infrastructure-as-Code solutions, governance models, and alignment with standards such as FedRAMP or other security frameworks
- Team Leadership: Proven track record of managing teams (6-8 contributors), focusing on career development, goal setting, project oversight, and daily guidance
- Regulatory Audit Prep: Prepared and coached teams for client-facing compliance audits with third-party auditors
- Project Definition and Documentation: Led the definition, planning, and documentation of key Managed Services projects and initiatives; tracked outcomes against established goals
- Managed Services Expertise: Familiarity with ticket management systems and meeting SLA requirements in a managed services environment
- Cloud & Automation: Extensive experience with AWS, Azure, or GCP; deep knowledge of Terraform, Ansible, GitLab, and CI/CD technologies
- Technical Collaboration: Proven ability to collaborate with Site Reliability Engineers and cross-functional teams, facilitating team problem-solving and performance improvements
- Soft Skills: Strong interpersonal, organizational, and problem-solving skills; effective at building client trust
- Documentation & Communication: Capable of creating technical diagrams and comprehensive written documentation; able to convey complex ideas clearly
- Professionalism & Autonomy: Demonstrated ability to work both independently and as part of a team with a professional attitude and demeanor
- Security Mindset: Critical thinker capable of balancing stringent security and compliance requirements with mission objectives
- Consulting Experience: Previous roles in technical consulting for external clients
- High-Availability Environments: Exposure to 24x7 operational settings or large-scale and high-availability system support
- Encryption and Hardening: Demonstrated expertise implementing SSL, PKI, FIPS 140-2, and enforcing security baselines such as CIS Benchmarks and DISA STIG
- Further Cloud and Security Specialization: Additional hands-on work with container orchestration (Kubernetes), advanced threat detection, or enterprise endpoint security