AI Infrastructure & Inference Engineer (GPU Systems)
We usually respond within three days
The role
• Design, build and operate GPU infrastructure for training and low-latency inference (Kubernetes, autoscaling, CI/CD).
• Implement high-throughput model serving (e.g., NVIDIA Triton, vLLM, Text Generation Inference) with caching and canary releases.
• Optimize models and runtimes for cost, latency and throughput (quantization, distillation, batching, parallelism).
• Establish observability and reliability for inference (telemetry, tracing, SLOs/alerts, capacity planning, FinOps).
• Contribute to security, governance and compliance for models, artifacts and datasets (secrets, access, audit).
What you bring
• MSc/BSc in Computer Science, Electrical/Computer Engineering or similar.
• 5+ years in systems/infra/SRE or ML platform work at scale.
• Strong experience with containers/Kubernetes and IaC (Terraform or similar).
• Hands-on experience with GPU stacks (CUDA basics, NCCL, drivers/containers) and performance tuning.
• Familiarity with model serving frameworks (Triton, vLLM, TGI/HF), queues and service meshes.
• Proficiency in Python and one systems language (Go/C++ preferred); solid CI/CD and observability skills.
• Cloud experience (AWS/Azure/GCP) and cost optimisation for GPU workloads.
Nice to have
• Experience with distributed training/inference (tensor/pipeline parallelism, MIG, RDMA).
• Experience with feature stores/vector DBs for RAG-style serving.
• On-prem GPU cluster management (Slurm, DCGM) or hybrid cloud.
• Security certifications or practical experience with regulated environments.
Department: Information Technology
Locations: Sweden, Finland
Looking for something more exciting and challenging?
If you want to take the next step in your career, our candidate marketing services at CO-WORKER Technology can improve your chances of landing a role that matches your ambitions. For guidance or more information about these services, contact our recruitment team.
About CO-WORKER TECHNOLOGY
Co-Worker Tech is a consulting and recruitment partner helping tech companies and industrial businesses grow sustainably. We combine rapid delivery with high quality and a clear people-first focus.