Site Reliability Manager

Crown Agents Bank

  • London
  • Permanent
  • Full-time
  • 24 days ago
Company Description
Crown Agents Bank is a fast-growing, regulated UK bank that connects emerging and frontier markets to the rest of the world, using FX and payments technology. We are transforming the way payments and FX move through emerging markets, reducing friction so that more money gets to those who need it. Emerging markets payments are usually challenging, expensive, unreliable and opaque. Our solutions help fix these pain points. Ultimately, we connect traditionally hard-to-reach regions to global financial infrastructure, giving access to the best prices and the fastest, most reliable settlement.
FX and cross-border payments are often complex and expensive, especially when operating in emerging markets. Crown Agents Bank (CAB) wraps its deep and trusted relationships and strength of network around innovative digital capabilities and cross-border transaction banking solutions to enable fintechs, corporates, governments, development organisations and banks to move money to, from, and across often hard-to-reach markets.
Job Description
As part of the team keeping the CAB technology platforms available 24/7/365, this role is responsible for availability (uptime), monitoring, performance, reliability, change management, incident response, and capacity planning for a number of core CAB Production services and technologies.
  • Part of the Platform Operations team, with a heavy focus on platform and system operations in Production
  • Work with the Architecture & Engineering, Product, Application Support, Service Management, Testing and Security teams to uphold good operational practices and ensure that appropriate attention is given to the reliability of production systems from our customers' point of view.
  • Collaborate with Client Services, Application Support, Product, Engineering and Business Operations teams to ensure that uptime, latency, response time and availability targets are met for key services.
  • Focus on operations automation, system currency, and simplification to allow CAB to scale its portfolio of services sustainably.
  • Practice and improve incident management processes and provide on-call support.
  • Take ownership of complex Problem Records related to performance, reliability, and scalability, and lead the coordination across technology to resolve them; lead the SRE team during major incidents.
  • Build and maintain a good understanding of CAB products and the platforms on which they are implemented.
  • Build and line manage the SRE team, ensure appropriate technology and skill coverage, and manage the on-call schedule.
Qualifications
  • 5+ years of platform operations engineering, SRE, DevOps or similar relevant experience in a B2B environment
  • Experience of, and passion for, configuring and using enterprise-grade application performance monitoring tooling (such as Dynatrace, Datadog, Prometheus/Grafana, ELK, etc.)
  • Experience of deployment, configuration and migration to cloud providers via IaC, ideally AWS and Terraform; good grasp of multi-AZ and cross-region resilience challenges
  • Innovative and intuitive with a love of collaborative problem-solving.
  • Demonstrable expertise in supporting a large-scale, heterogeneous technology stack consisting of a mixture of in-house developed monoliths, microservices and serverless functions, batch, external SaaS services, and integration technology (MuleSoft Anypoint/BizTalk)
  • Experience in supporting production messaging and streaming technology such as Kafka or MQ
  • Multiple years of AWS usage (management or hands-on)
  • Applies knowledge to tactical and strategic decisions
  • Imparts technical knowledge to other team members
  • Strong understanding of DevOps practices and highly available hosting design
  • Strong interest in cloud computing, with experience of IaaS, PaaS and SaaS on multi-cloud, and of IaC using Terraform and Ansible
  • High level of understanding of Platform Security
  • Good knowledge of network infrastructure design and public cloud architecture
Additional Information
  • Hybrid working
  • Contributory personal pension plan: minimum Employee 2% and Employer 7%. Employer matches contributions in 1% increments to a maximum of Employee 5% and Employer 10%
  • Life Assurance – 4 times annual salary
  • Group Income Protection
  • Private Medical Insurance – this may include cover for partner and/or children at company cost. Cover includes Optical, Dental and Audiology
  • Discretionary Bonus
  • Competitive Annual Leave
  • 2 Volunteering Days
  • Benefit Hut
