
AI-OPS Engineer – Infrastructure (Azure Focus)

Skills
  • Bash
  • Python
  • Linux
  • Azure (4 years)
  • Docker
  • Kubernetes
  • Networking
  • Terraform
  • AKS
  • Cloud security
  • Infrastructure as Code


Department: IT Infrastructure – Tools & Collaboration

We are opening a new position in our Infrastructure organization and establishing a dedicated AI-OPS function.

This is a foundational role with real ownership, hands-on responsibility, and close collaboration with development and AI teams.

As an AI-OPS Engineer, you will be responsible for the day-to-day operation, stability, security, and scalability of AI infrastructure across cloud and platform layers, with a strong focus on Azure-based environments.

This role sits at the intersection of Cloud Infrastructure, DevOps, Security, and AI Platforms.


Key Responsibilities

  • Operate and maintain AI infrastructure environments (primarily Azure), ensuring availability, performance, and scalability
  • Work closely with developers and AI teams to resolve infrastructure-related issues: permissions, security, networking, deployments, and access
  • Support deployment, operation, and troubleshooting of AI models and services
  • Design and operate Azure-based AI infrastructure, including end-to-end Azure Functions and APIs for AI models (Azure AI Factory)
  • Manage Kubernetes / AKS and Docker environments supporting AI workloads
  • Implement and maintain Infrastructure as Code using Terraform
  • Monitor systems, investigate incidents, and perform root-cause analysis
  • Handle cloud security configurations, access control, and compliance requirements
  • Participate in defining infrastructure methodologies and best practices for AI workloads
  • Take part in infrastructure consolidation and centralization initiatives


Required Experience

  • 4–8 years of experience in Cloud Infrastructure Operations, with a strong background in Azure - Must
  • Hands-on experience with Linux administration - Must
  • Proven experience with Terraform and Infrastructure as Code - Must
  • Experience with Kubernetes / AKS and Docker - Must
  • Solid understanding of cloud networking and cloud security - Must
  • Experience working closely with development teams and debugging cloud issues related to development and deployment - Must
  • Scripting experience with Python and/or Bash - Must
  • Strong troubleshooting and problem-solving skills
  • Certifications in Azure or Kubernetes
  • AI is a relatively new domain in the organization: hands-on experience is a plus, but a strong infrastructure background with AI exposure or familiarity is acceptable

Unilink Ltd.