Manager, Software Engineering - AIOps - NVIDIA - Raanana

No longer accepting applications

Overview

Job Type: On-site

Experience: 8 years

Job Position: AI/ML

Updated: May 15, 2026

Location: Raanana

Salary: N/A

Skills

Go
Python
CI/CD
Docker
Kubernetes
Networking
IaaS
PaaS
AutoGPT
DPUs
Ethernet switching
GPUs
LangChain
ML models
NVIDIA hardware stack

NVIDIA is at the forefront of the AI revolution, and the AIOps department is critical to ensuring our AI-driven data centers operate with unmatched efficiency. We are looking for a visionary, hands-on Software Engineering Manager to lead a team building the next generation of AI-based monitoring and operation platforms.

This role focuses on leveraging AI Agents to automate, predict, and optimize data center performance at internet scale. If you are a resilient leader who excels in fast-paced environments and has a passion for autonomous system operations, we want you on our team.

What You’ll Be Doing

Strategic Roadmap Development: Define software design and implementation roadmaps for AI-driven operations, ensuring data center availability, resiliency, and performance through autonomous agent-based monitoring.
Innovative AIOps Engineering: Lead the development of tools and proof-of-concepts focused on software-defined operations, utilizing AI agents to automate root cause analysis and proactive remediation.
Scalable Architecture: Build and scale monitoring applications that handle massive telemetry data from AI infrastructure across public, private, and hybrid cloud environments.
Agentic Frameworks: Oversee the integration of LLM-based agents into CI/CD and operational workflows to shift from reactive monitoring to predictive orchestration.
Team Leadership: Actively hire, mentor, and grow a high-performing engineering team, fostering a culture of technical excellence and creative problem-solving.
Customer Engagement: Directly contribute to internal and external customer engagements to align AIOps solutions with real-world data center challenges.

What We Need To See

BS/MS degree in Computer Science or a related technical field (or equivalent experience).
8+ years of overall software engineering experience, including at least 2 years in a management or technical lead role.
Domain Expertise: 3+ years of experience in system software engineering for large-scale production systems, with a strong background in solution design and distributed systems.
Cloud Native Mastery: Deep experience with Docker and Kubernetes orchestration, alongside PaaS or IaaS cloud platforms.
Programming Proficiency: Strong programming skills in Python (essential for AI/ML workflows) and Go.
Operational Intelligence: Extensive knowledge of CI/CD pipelines and automated software-defined operations.
Exceptional written and verbal communication skills to bridge the gap between complex AI logic and operational requirements.

Ways To Stand Out From The Crowd

AI/ML Background: Experience building or deploying AI Agents (LangChain, AutoGPT) or using ML models for anomaly detection and predictive analytics.
Infrastructure Knowledge: Familiarity with Ethernet switching, networking protocols, or NVIDIA’s hardware stack (GPUs/DPUs).
Control Systems: Experience in developing autonomous systems or closed-loop feedback monitoring tools.
SaaS Background: Proven track record of managing and scaling cloud-based SaaS applications.

JR2017429
