For our client we are looking for an Operations Specialist (f/m/d) Network & Security.
Duration: until 31.12.2026, extension possible
Capacity: 100%
Location: 75% Remote, 25% Frankfurt or Berlin (1 week Frankfurt / 3 weeks remote in rotation), up to 50% onsite in peak times
Language: English and German are both required (C1 level)
Local Operations manages the on-premises production platform, which serves as the primary host for all mission-critical business applications. The team is responsible for the following core areas:
- Platform Stability: Ensuring the high availability and performance of the on-premises private cloud environment.
- Application Hosting: Consulting on the seamless operation of Germany-specific production business applications.
- Incident Management: Resolving technical issues within standard business hours to minimize operational downtime.
- Lifecycle Maintenance: Executing routine updates, patches, and system optimizations within the local infrastructure.
Objectives:
- Provide Tier-3 operational ownership for Network & Security services for Local Production (DE).
- Ensure operational readiness for deployments.
- Ensure operational stability and responsiveness for the managed Network & Security services.
- Reduce operational toil and improve service reliability.
- Ensure platform operations adhere to security and compliance standards.
Skills (must-have):
- 5+ years operating enterprise networks in private cloud / data center production.
- Proven experience implementing or leading Incident, Problem, Change, and Release governance in production.
- Proven incident response and troubleshooting skills across routing, firewalling, connectivity, and service exposure patterns.
- Experience with security fundamentals and their operational enforcement in production contexts.
- Experience in Networking: WAN/LAN, routers, firewalls (enterprise operations).
- Connectivity services: Tenant private networking / connectivity patterns supporting production platforms.
- Expertise in and fundamental understanding of core operations processes (incident management, change management, problem management, IT Service Management) as well as SRE concepts.
- Experience in DNS & certificates: DNS operations and certificate lifecycle handling (issuance/renewal/rotation coordination).
- ITSM tooling: Jira Service Management (JSM), Jira, Confluence (for workflow and documentation).
- Experience in gathering operational insights from monitoring or observability data, including SLI/SLA/SLO management and tracking.
- Hands-on experience in documenting procedures properly and enforcing clear runbooks or playbooks.
- Observability: Hands-on experience with monitoring and logging tools (e.g., Prometheus, Grafana, Datadog, Mimir, Loki).
- Familiarity with enterprise DevOps toolchains is a plus (GitLab, JFrog Artifactory, Backstage, Harness).
- Strong understanding of modern platform operations (Kubernetes/containers, automation, observability), sufficient to govern specialists.
- Platform delivery concepts: GitOps and IaC awareness (Terraform/OpenTofu, ArgoCD, Helm) to govern deployment/readiness standards.
- Expertise in root cause analysis and troubleshooting.
Skills (should-have):
- Experience operating in regulated / high-availability industries (banking, telco, public sector, healthcare).
- Experience with enterprise ITSM and change governance in regulated environments.
- Experience with SRE practices (SLOs/SLIs, error budgets) and reliability management.
- IaC/GitOps: Terraform/OpenTofu; Helm/ArgoCD familiarity helpful when network/security configuration is deployed via pipelines.