We are seeking a highly skilled and motivated MLOps Engineer to join our team and play a critical role in the deployment, automation, and maintenance of our machine learning infrastructure. You will be responsible for ensuring that our ML models move seamlessly from experimentation to production with stability, security, and scalability in mind.
You will work closely with data scientists, machine learning engineers, and DevOps teams to build robust model pipelines, manage infrastructure across on-prem and cloud, and enforce best practices in version control, model deployment, and compliance.
Key Responsibilities:
- Design, implement, and maintain automated model training pipelines using tools such as MLflow, Kubeflow, Airflow, or custom orchestrators.
- Support reproducibility and consistency across model training environments.
- Implement and manage JFrog Artifactory as the artifact repository.
- Establish and maintain model registries to track versions, metadata, and lineage.
- Automate model promotion through staging, testing, and production environments.
- Build CI/CD pipelines tailored for ML use cases, integrating model training, validation, deployment, and rollback.
- Manage and scale infrastructure across cloud platforms (Azure) and on-premises environments.
- Optimize GPU/CPU resource utilization and cost efficiency.
- Implement auto-scaling and load balancing strategies for ML workloads.
- Manage version control systems (e.g., Git) and integrate with experiment tracking tools.
- Handle storage and retrieval of artifacts (e.g., Docker images, models, datasets) via artifact registries like JFrog Artifactory or AWS ECR.
Requirements and Qualifications:
- 3+ years of experience in MLOps, DevOps, or related engineering roles.
- Hands-on experience with ML pipelines and orchestration tools (e.g., MLflow, Airflow, Kubeflow).
- Proficiency with Docker, Podman, Kubernetes, and containerized deployments.
- Experience with Azure (or AWS/GCP) and hybrid infrastructure setups (on-prem + cloud).
- Strong understanding of model versioning, packaging, and deployment best practices.
- Solid knowledge of Git, CI/CD tools (e.g., Jenkins, GitHub Actions, Bitbucket Pipelines), and monitoring stacks.
- Proficiency in Python, Bash, and infrastructure tools.