Staff SRE Engineer

Circles.Life

India
Permanent
Full-time

25 days ago

Job Title: Staff Software Engineer, DevSecOps
Role: Staff SRE Engineer
Location: Bangalore, India

About Us

Founded in 2014, Circles is a global technology company reimagining the telco industry with its SaaS platform, Circles X, which helps telco operators launch and operate successful digital brands.

Having pioneered a successful blueprint for disrupting the telco space in Singapore, Circles has since launched its own digital telco, Circles.Life, in Singapore, Taiwan and Australia. Circles has also partnered with other telco operators to launch digital services, enabling our partners to accelerate growth and capture market share within a short period of time.

Today, Circles is partnering with operators in 14 countries to deliver delightful digital experiences to millions of people through our businesses.

We are backed by global investors such as Sequoia, Warburg Pincus, EDBI and Founders Fund, renowned backers of industry-shaking innovators.

Role Description

Site Reliability Engineering (SRE) is a horizontal function spanning the entire company, meaning you'll work with multiple teams across various products and platforms to ensure their software features are reliable for the build, launch and operations teams.

As a Staff SRE Engineer you will work as a hands-on individual contributor (IC) who proposes, designs, implements and troubleshoots robust CI/CD pipelines for multi-tier applications deployed on virtual machines and containerized platforms running on public cloud infrastructure.

You will help the organization embark on a continuous delivery and deployment journey using cutting-edge technologies for release orchestration. You will also help accelerate the SaaS journey by implementing robust automation, strategic solutions and reusable, loosely coupled architecture patterns.

This is a fantastic opportunity to join a talented team where you'll be able to add real value to our organization, shape the SRE/DevOps practice and upskill your technical and business skill set.

Your Responsibilities

Own infrastructure architecture and non-functional requirements, ensuring they fit into a cohesive vision aligned with the rest of the platform's technology roadmap for launch.
Propagate Site Reliability Engineering culture across the organization by sharing industry best practices, standards, approaches, documentation, and code with other engineering teams.
Design, test and troubleshoot CI/CD pipelines for containerized applications, from build through deployment.
Set up a continuous delivery and deployment pipeline integrated with the release workflow to support release orchestration.
Troubleshoot multi-layer and containerized applications deployed in cloud infrastructure.
Apply automation and software to manual, mechanical tasks and to any part of the system that would benefit from it.
Troubleshoot complicated, cross-platform issues spanning OS, networking and databases in a cloud-based SaaS environment; handle live production incidents; debug application and infrastructure issues; and follow and implement SRE best practices.
Conduct system discovery and analysis, and develop improvements for system software performance, availability and reliability.
Design, write, ship, and motivate the implementation of solutions that increase observability, product reliability and organizational efficiency.
Collaborate closely with software engineers and testers to ensure the system meets non-functional requirements such as performance, security, and availability.
Document system knowledge as you acquire it over time, create runbooks, and ensure critical system information is readily available to those who need it.
Maintain and monitor the deployment and orchestration of servers, Docker containers, Kubernetes, and general backend infrastructure.
Keep up to date with security developments and proactively identify, diagnose, and solve complex security issues.
Participate in the on-call roster to provide weekend support.

Required Technical Skills

10+ years of working experience in infrastructure support and CI/CD platforms, leveraging DevOps, SRE and Agile methodologies.
5+ years of experience designing, testing and implementing CI/CD pipelines to automate build, deployment and code promotion.
5+ years of experience writing automation scripts and CI/CD pipelines, and automating routine tasks using Groovy or Python to eliminate human dependencies.
Prior experience troubleshooting CI/CD pipeline issues for containerized and multi-layer applications deployed in GCP or AWS.
Ability to dive deep into a problem statement, apply structured troubleshooting to identify the root cause, and apply strategic solutions.
Experience with CI/CD in cloud environments, with container technologies (Docker, Kubernetes, Docker Swarm, Helm) and with DevOps tooling (Git + CI/CD pipelines).
Experience as a Linux systems administrator (e.g. Ubuntu, Red Hat) and with command-line administration tools such as Bash, Vim and SSH.
Experience in monitoring and analyzing infrastructure performance using standard performance monitoring tools such as Grafana/Prometheus, Datadog, Nagios and New Relic.
Extended expertise in core infrastructure components: storage, systems and/or networking.

Required Business Skills

Adaptable to change and able to work independently with a one-team attitude.
Ability to communicate clearly with different stakeholders.
Strong presentation skills, including preparing PowerPoint presentations and architecture diagrams.
Capable of delivering multiple initiatives concurrently while maintaining a high level of attention to detail.
Manage and prioritize work effectively with minimal supervision.
Provide timely and relevant stakeholder updates, project status and vital data points.
Ability to learn new technologies as needed to provide the best solutions.
Strong problem-analysis skills to dive deep, understand root causes and provide strategic or interim solutions.
Sound analytical skills to come up with supporting data points.
Solid mathematical skills to validate results programmatically.

Additional Technical Skills (nice to have)

Understanding of TCP/IP networking, including familiarity with concepts such as the OSI model.
Understanding of Internet protocols and applications such as SMTP, DNS, HTTP, SSH, SNMP etc.
Understanding of ELK, Redis, RabbitMQ, Kafka and etcd.
Hands-on experience in writing infrastructure as code (IaC), configuration management as code (CMaC) and policy as code (PoaC) is a plus.

Certification Requirements (nice to have)

Kubernetes CKA or CKAD certification.
AWS or GCP DevOps-related certifications.
GCP or AWS cloud architecture certification (associate or professional).

To all recruitment agencies: Circles will only acknowledge resumes shared by recruitment agencies if selected in our preferred supplier partnership program. Please do not forward resumes to our jobs alias, Circles employees or any other company location. Circles will not be held accountable for any fees related to unsolicited resumes not uploaded via our ATS.

Circles is committed to a diverse and inclusive workplace. We are an equal opportunity employer and do not discriminate on the basis of race, national origin, gender, disability or age.
