For our client, we are looking for an IAM Operations Engineer (f/m/d) for a private cloud environment.
Start: 04.05.2026
Duration: until 31.12.2026, long-term extension desired
Capacity: 100%
Location: 75% remote, 25% Frankfurt or Berlin (1 week Frankfurt / 3 weeks remote in rotation), up to 50% onsite during peak times
Language: English and German are both required (C1 level)
Role:
Local Operations manages the on-premises production platform, which serves as the primary host for all mission-critical business applications.
Local Operations is responsible for the following core areas:
• Platform Stability: Ensuring the high availability and performance of the on-premises private cloud environment.
• Application Hosting: Consulting on the seamless operation of Germany-specific production business applications.
• Incident Management: Resolving technical issues within standard business hours to minimize operational downtime.
• Lifecycle Maintenance: Executing routine updates, patches, and system optimizations within the local infrastructure.
Objectives:
- Consult on CI/CD pipelines and ensure operational readiness for deployments
- Ensure operational stability and responsiveness for the managed Kubernetes platform
- Reduce operational toil and improve service reliability
- Ensure platform operations adhere to security and compliance standards
Skills (must-have):
- Proven operations experience with managed Kubernetes clusters and with production applications or systems running on Kubernetes in on-premises environments
- Operational experience with Keycloak; understanding of OIDC, SAML, OAuth 2.0, and identity and access management workflows in cloud platforms (user onboarding, role and permission management)
- Deep understanding of networking concepts, including protocols, load balancing, and security.
- In-depth knowledge of and implementation experience with CI/CD processes, tooling (e.g., GitLab, Jenkins, Tekton, Argo Workflows, Argo CD), and concepts, including the associated quality and security assurance for software delivery
- Fundamental understanding of core operations processes (incident management, change management, problem management, IT Service Management) as well as SRE concepts
- Experience in gathering operational insights from monitoring or observability data, including SLI/SLA/SLO management and tracking.
- Hands-on experience in documenting procedures properly and enforcing clear runbooks or playbooks.
- Hands-on experience with monitoring and logging tools (e.g., Prometheus, Grafana, Datadog, Mimir, Loki).