Frankfurt am Main, Remote
Job ID:
76732
Job posted on:
12-06-2026
Summary
For our client, we are looking for an Observability Operations Expert (f/m/d).
Start: 01.07.2026
Duration: until 31.12.2026 (extension possible)
Capacity: 100%
Location: 75% Remote, 25% Frankfurt (occasionally, sometimes Berlin)
1 week Frankfurt / 3 weeks remote in rotation, up to 50% onsite in peak times
Language: English is a must (C1), German is a must (C1)
Budget: remote 80,00 EUR net; onsite 92,25 EUR net, all-in
Team:
The local operations team for Germany runs a production platform in Germany that will host all production business applications for Germany.
Tasks:
- Monitoring, Incident, Problem and Change Management in the specific context of providing managed Kubernetes
- CI/CD Support and Operational Readiness
- Automation of operations-critical standard processes following established software development lifecycles
- Security and Compliance Enforcement
Skills (must-have):
- At least 3 years of operational experience with self-managed Kubernetes clusters, self-managed services providing Kubernetes clusters, and production applications or systems running on Kubernetes in on-premises environments.
- Hands-on experience with monitoring and logging tools (e.g., Prometheus, Grafana, Datadog, Mimir, Loki, OpenTelemetry Collector) from both a usage and an administration/operations perspective.
- Deep understanding of networking concepts, including protocols, load balancing, and security.
- In-depth knowledge of and implementation experience with CI/CD processes, tooling (e.g., GitLab, Jenkins, Tekton, Argo Workflows, and Argo CD), concepts, and the associated quality and security assurance for software delivery.
- Fundamental understanding of core operations processes (incident management, change management, problem management, IT Service Management) as well as SRE concepts.
- Experience in gathering operational insights from monitoring or observability including SLI/SLA/SLO management and tracking.
- Hands-on experience in documenting procedures properly and enforcing clear runbooks or playbooks.