Site Reliability Engineer - Security Infrastructure
Palantir Technologies
- Washington, DC
- Permanent
- Full-time
- Architecting and operating multiple geographically distributed Kubernetes clusters supporting our mission software. These clusters run in both cloud and air-gapped environments.
- Operating multi-petabyte, distributed security information and event management systems that handle dozens of terabytes of security telemetry daily.
- Operating security orchestration, automation, and response (SOAR) clusters.
- Operating the associated data and telemetry pipelines. We ingest data from hundreds of discrete sources to arm our network defenders. Keeping these data pipelines lean, healthy, secure, and timely is essential to our detection and investigation workflows.
- Work closely with the Infrastructure team to maintain, operate, and evolve multiple highly performant, large-scale Kubernetes and application clusters.
- Collaborate with other InfoSec teams to ensure the data and telemetry collected are accurate, actionable, and of significant security value.
- Develop, deploy, and monitor data pipelines to provide timely, complete, and accurate security data for network defenders.
- Directly support detection and investigation workflows through query development, dashboard creation, training, and new capability development.
- Ingest, enrich, transform, and analyze data in Palantir Foundry to provide meaningful security insights and improvements.
- A highly analytical mindset and eagerness to solve technical problems with distributed computing, code development, data pipelining tools, data health and monitoring frameworks, and other technologies.
- Ability to independently own projects and balance competing priorities, whilst still effectively collaborating with colleagues.
- Experience with public cloud service providers (e.g., AWS, Google Cloud, Microsoft Azure) and modern deployment technologies (e.g., CI/CD, Kubernetes, Docker).
- Proficiency in a modern scripting or programming language, such as Python (preferred) or a similar scripting language (e.g., shell).
- 5+ years of infrastructure engineering experience. Multi-cloud or Kubernetes operational experience at scale is a plus.
- 3+ years of extensive security experience running, administering, or operating a complex Splunk Enterprise cluster.
- Active US security clearance, or eligibility and willingness to obtain one.