Description
AI Platform Software Engineer
Location: Tel Aviv
#Hybrid
DriveNets is a leader in high-scale disaggregated networking solutions. Founded in 2015, DriveNets modernizes the way service providers, cloud providers, and hyperscalers build networks. DriveNets' Network Cloud open disaggregated architecture supports the largest network in the world, with more than half of AT&T's backbone traffic running on it. Having raised $587 million in three funding rounds, DriveNets is disrupting the networking market, from high-scale architecture to AI platforms, and is bringing on board the most talented people. We are seeking people who want to make an impact on the world's leading communication networks and who are experienced in networking architecture or AI infrastructure solutions.
Job Summary
We are looking for a talented and motivated Software Engineer with hands-on experience building and operating multi-agent AI systems in production to join our DriveNets Automation Platform (DAP) team.
The team develops automation tools, orchestration capabilities, and intelligent platforms that simplify the deployment, management, troubleshooting, and optimization of large-scale network and AI infrastructure environments.
You will work on the design and development of AI-powered systems that bridge networking, automation, observability, and distributed infrastructure — running on Kubernetes at scale. Our stack includes LangGraph, Langfuse, RAG pipelines, MCP, and agent-to-agent (A2A) communication patterns.
This role combines strong software engineering with practical AI application development, with a sharp focus on production hardening, tracing, evaluation, and safety of agentic systems — not model training or research prototypes.
Requirements
- 5+ years of hands-on software engineering experience building production-grade backend services, APIs, or AI-powered systems.
- Proven production experience with multi-agent AI systems: deployment, tracing, guardrails, hardening, and incident management.
- Hands-on experience with agentic frameworks such as LangGraph, CrewAI, Google ADK, AutoGen, or equivalent.
- Experience building and running evaluation pipelines for agentic solutions, including trajectory tracing, ground truth validation, and harshness/quality scoring.
- Strong Python proficiency: comfortable building scalable backend services using gRPC and REST APIs.
- Solid understanding of distributed systems: fault tolerance, consistency models, service communication, and operational challenges at scale.
- Hands-on Kubernetes experience: deploying and operating containerized services, managing workloads, config, and scaling in production clusters.
- Practical experience with embeddings, vector databases, and semantic retrieval systems in production.
- Practical experience with RAG pipelines, LLM API integration, structured outputs, and tool calling in production environments, including building and serving MCP servers at scale.
- Working knowledge of SQL and/or NoSQL databases, schema design, and query optimization.
- Strong debugging skills across application logic, APIs, data, and AI agent behavior.
- Strong communication skills and a bias toward ownership and delivery.
Nice to Have
- Familiarity with Langfuse or Arize Phoenix.
- Familiarity with the A2A and A2UI protocols.
- Experience with network automation, orchestration, or configuration management (Ansible, Terraform, NETCONF, gNMI, or similar).
DriveNets is on a mission to build a special company comprised of individuals with diverse backgrounds, perspectives, and experiences.
DriveNets is an equal opportunity employer. We do not discriminate based on race, religion, national origin, sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.