DevOps Engineer – AI Infrastructure
DYNE · Doha and surrounding area
Job description
About the role
We are looking for an experienced DevOps Engineer to lead the design, deployment, and operation of AI‑focused infrastructure. The role focuses on Linux‑based, GPU‑enabled environments and the production delivery of large language model (LLM) services in secure, on‑premise settings.
Key responsibilities
- Lead deployment and management of Linux‑based AI infrastructure environments.
- Configure and maintain KVM virtualization platforms and GPU‑enabled systems.
- Deploy and manage containerized LLM serving environments in production.
- Design scalable and secure Kubernetes‑based infrastructure for AI workloads.
- Implement CI/CD pipelines, automation frameworks, and infrastructure‑as‑code practices.
- Apply security hardening, RBAC, encryption, secrets management, and audit‑ready controls.
- Monitor GPU utilization, infrastructure health, and system performance.
- Support high availability, disaster recovery, backup, and failover strategies.
- Troubleshoot infrastructure, GPU runtime, networking, and platform stability issues.
- Prepare technical documentation, architecture diagrams, and operational runbooks.
Required profile
- 5+ years of experience in DevOps, Platform Engineering, or Infrastructure Engineering.
- Strong Linux administration experience, including networking, storage, security hardening, and performance tuning.
- Hands‑on experience with NVIDIA GPUs (H100, A100, H200, or equivalent), including CUDA drivers, GPU runtimes, GPU scheduling, and AI inference optimization.
- Experience deploying production‑grade LLM serving platforms such as vLLM, TensorRT‑LLM, or Triton Inference Server.
- Proficiency with Docker, Kubernetes, and containerized AI workloads.
- Experience with infrastructure‑as‑code tools such as Ansible.
- Knowledge of DevSecOps practices, secure CI/CD pipelines, SAST integration, and secrets management.
- Experience with PostgreSQL, vector databases (e.g., Qdrant), and observability tools.
- Background working in on‑premise, air‑gapped, or regulated enterprise environments.
Required skills
- Linux administration
- KVM virtualization
- NVIDIA GPUs (H100, A100, H200)
- CUDA drivers
- GPU scheduling
- AI inference optimization
- vLLM
- TensorRT‑LLM
- Triton Inference Server
- Docker
- Kubernetes
- Ansible
- CI/CD pipelines
- DevSecOps
- SAST
- Secrets management
- PostgreSQL
- Qdrant (vector database)
- Observability tools
Published 4 hours ago
Expires 1 month from now