VALEO IT is looking for a Senior DevOps Engineer with solid technical, automation and programming experience.
We provide our customers with a flexible, open and standards-based cloud platform for global data exchange and a development toolchain platform to facilitate collaboration between multiple parties in the automotive industry.
- Platform operation: Keep EKS clusters and integrated services stable through maintenance, monitoring, metrics and rapid response to alerts and reported errors.
- CI/CD responsibility: Design and maintain end-to-end pipelines (mainly GitHub Actions; Jenkins if required) for platform and workload deployments.
- Configuration management at scale: Manage app and platform configurations across multiple AWS accounts and regions (global presence, including China).
- Observability development: Build diagnostics, dashboards, logs/metrics/traces and actionable alerting for rapid detection and resolution of incidents.
- Security & Compliance: Implement secrets management, key rotation, policy enforcement and cloud access monitoring; set up and maintain guardrails.
- Cooperation with developers: Design operational, reliability and scalability aspects; write clear runbooks and developer manuals.
- Automation first: Replace manual tasks with code; optimize deployment paths and evolve infrastructure for scale and performance.
- Incident Response & Recovery: Analyze and contain incidents, lead root cause analyses/post-mortems and implement follow-up measures promptly.
- Capacity & performance planning: Track KPIs/SLOs, perform capacity analysis and make recommendations on hardening and upgrades.
- Cloud & Orchestration: AWS/AWS China, EKS/Kubernetes, Docker; understanding of cloud-native architectures and distributed systems.
- Infrastructure as Code & Automation: Terraform/Terragrunt; Ansible if required; Helm for packaging; Git for version management and GitOps workflows.
- CI/CD & Release Engineering: Branching/merging strategies and artifact management with GitHub Actions and/or Jenkins.
- Network & Traffic: Sound Linux/AWS network knowledge (VPC, routing, security groups/NACLs), DNS, load balancing (ALB/NLB), Kubernetes Ingress; proxies (NGINX, HAProxy).
- Observability: Prometheus, Grafana, Loki (or comparable tools); creation of meaningful dashboards and alerts.
- Databases: Administration of SQL and NoSQL databases (Postgres, MySQL, DynamoDB).
- Programming & Scripting: Python and Bash; knowledge of Java is an advantage.
- Governance: Practical experience in incident, problem and change management in production environments.
- Security: Kubernetes security (NetworkPolicy, Kyverno), secrets management and policy enforcement.
- Additional tools (nice-to-have): Experience with GitHub Enterprise, Gitea, Artifactory, Confluence, Jira, Grafana, Jenkins, Vault.
- Experience: 5+ years operating large-scale production systems (load balancing, monitoring, configuration management).
- Automation mentality: You identify recurring tasks and automate them consistently.
- DevOps/GitOps mentality: You treat infrastructure as code and prefer transparent, repeatable workflows.
- Performance focus: You proactively identify bottlenecks and performance regressions.
- Data-driven operation: You are convinced that audits, KPIs and SLOs drive continuous improvement.
- Team player: You work across teams, proactively share knowledge and give and accept feedback openly.
- Clear communication: Excellent written and verbal skills for technical and procedural documentation.
- Education: Bachelor's degree or equivalent practical experience.
VALEO IT SERVICES SRL
Strada Salcâmilor No. 14-16 | 300425 Timișoara | Romania
E-Mail: career@valeo-it.com
