Our client, a new, profitable Silicon Valley-based B2C product startup building innovative mobile solutions for the planet, is looking for an experienced Site Reliability Engineer to help build reliable, scalable, and observable systems.
Location: Poland
Type: Remote, Full-time
Start date: ASAP
About project and position:
Based in Silicon Valley and backed by top-tier VCs, the company is a new mobile innovator delivering exciting products to consumers across the planet.
The company's flagship VPN application has over 1B downloads and protects users' online privacy and anonymity by creating a private network from a public internet connection.
We are looking for a senior-level Site Reliability Engineer who can not only deliver reliable and scalable systems, but also act as a technical leader for the SRE function.
This role goes beyond automation and tooling: it requires strong technical depth in Linux, security, and infrastructure automation, as well as the ability to guide others and set standards.
You will shape how we secure, deploy, and operate our infrastructure at scale.
A successful candidate will be both hands-on and a mentor, with the ability to design solutions, troubleshoot complex issues, and influence engineering culture.
Responsibilities:
- Lead the design and improvement of observability solutions (monitoring, logging, alerting, tracing) across services and infrastructure.
- Drive improvements in infrastructure automation using Terraform, Ansible, and beyond (e.g. image verification, package signing, reproducible builds).
- Own Linux systems at scale: tuning, troubleshooting, securing, and educating the team.
- Ensure security best practices in infrastructure (kernel module verification, RBAC/short-lived tokens, commit signing, hardware keys).
- Lead incident response and postmortems, setting standards for reliability and root cause analysis.
- Mentor and coach SRE/DevOps engineers, helping them grow technically.
- Work closely with developers and product teams to improve deployment, scalability, and resilience.
Requirements:
- 7+ years in software/infrastructure engineering, including at least 4 years in SRE/DevOps roles.
- Strong Linux administration and troubleshooting skills (performance, CPU/memory tuning, debugging).
- Deep understanding of infrastructure security (kernel module signing, image verification, vulnerability scanning, secure automation).
- Solid knowledge of infrastructure-as-code and config management (Terraform, Ansible) and strategies for drift management.
- Experience with cloud providers (AWS, DigitalOcean) and hybrid/self-hosted infrastructure.
- Experience leading incident response and disaster recovery.
- Familiarity with CI/CD pipelines and secure build practices (Git commit signing, reproducible builds, upstream package verification).
- Strong understanding of observability platforms (Prometheus, Grafana, ClickHouse, Datadog, etc.).
- Experience with hardware-based security or strong authentication methods (e.g. HSM, security keys).
- Fluent spoken and written English.
Nice to have:
- Experience leading a team or mentoring others in a Tech Lead capacity.
- Knowledge of advanced Linux security features (AppArmor, SELinux, seccomp, kernel hardening).
- Experience with OpenTelemetry or other distributed tracing systems.
- Contributions to open-source or security-focused infrastructure projects.