Job title:	75% remote: Site Reliability Engineer (f/m/d) focus CD Operations in a private cloud
Pay interval:	Hourly
Pay rate:	Negotiable
Location:	Frankfurt am Main, Remote
Job posted:	07-05-2026
Job ID:	74207
Name:	Niklas Machens
Phone number:	+4915119501867
Email:	niklas.machens@nemensis.de

Job description

For our client, we are looking for an SRE (f/m/d).
 
Start: 01.06.2026
Duration: until 31.12.2026 (extension possible)
Capacity: 100%
Location: 75% remote, 25% onsite in Frankfurt (occasionally Berlin)
Rotation: 1 week in Frankfurt, 3 weeks remote; up to 50% onsite during peak times
Languages: English (C1) and German (C1) are both required
Budget: max. 85 EUR net per hour remote, 97.25 EUR net per hour all-in onsite
 
Team:
The local operations team for Germany runs the production platform in Germany, which will host all production business applications for the country.
 
Tasks:
- CI/CD maintenance and operational readiness
- Monitoring, incident, problem, and change management in the specific context of providing managed Kubernetes
- Automation of operations-critical standard processes following established software development lifecycles
- Security and Compliance Enforcement
 
The contractor must be a mid-level professional with proven experience in operations management of private cloud solutions and proficiency in managing Kubernetes operations on the platform. A proven ability to structure operational topics, including CI/CD processes, GitOps, quality assurance, and incident, problem, and change management, is also required.
 
Skills (must-have):
- At least 3 years of operational experience with self-managed Kubernetes clusters, self-managed services providing Kubernetes clusters, and production applications or systems running on Kubernetes in on-premises environments
- Deep understanding of networking concepts, including protocols, load balancing, and security
- Profound knowledge of and implementation experience with CI/CD processes, tooling (e.g. GitLab, Jenkins, Tekton, Argo Workflows, and Argo CD), concepts, and the associated quality and security assurance for software delivery
- Fundamental understanding of core operations processes (incident management, change management, problem management, IT Service Management) as well as SRE concepts
- Experience in gathering operational insights from monitoring or observability, including SLI/SLA/SLO management and tracking
- Hands-on experience in documenting procedures properly and enforcing clear runbooks or playbooks
- Hands-on experience with monitoring and logging tools (e.g. Prometheus, Grafana, Datadog, Mimir, Loki)
