DevJobs

Senior AI Engineer

Overview
Skills
  • Python
  • PyTorch
  • Kafka
  • Elasticsearch
  • PostgreSQL
  • CI/CD
  • AWS
  • Kubernetes
  • Production LLM apps ꞏ 2y
  • Prompt engineering
  • Model serving
  • Monitoring
  • Multi-agent systems
  • Observability
  • OpenAI API
  • Orchestration
  • LangSmith
  • RAG
  • Task delegation
  • Tool-use
  • Transformers
  • Vector DBs
  • MLOps
  • Agent observability
  • LangGraph
  • LangChain
  • Fine-tuning
  • Failure handling
  • Failure detection
  • Experiment tracking
  • Embeddings
  • Drift detection
  • Cloud
  • AWS Bedrock
  • Anthropic API
  • Agents
  • AgentCore
  • Model eval
Description

Tel Aviv

  • Hybrid | Full-Time | Cybersecurity

What You'll Do

  • Design and develop LLM-powered security features and internal AI tools — RAG pipelines, multi-agent workflows, prompt-engineered systems for cybersecurity
  • Architect and operate multi-agent systems in production — orchestration, inter-agent communication, task delegation, failure handling at scale
  • Build agent monitoring and observability pipelines — tracing, drift/failure detection, alerting, reliability SLAs
  • Build and maintain scalable MLOps infrastructure — model serving, eval frameworks, experiment tracking, CI/CD for ML
  • Fine-tune and adapt foundation models on internal datasets (network telemetry, security logs, threat intel)
  • Establish best practices for model observability, safety, and responsible AI deployment
  • Stay current with the LLM/GenAI ecosystem; drive updates to the AI SDLC and AI Research cycle

Requirements

Must-Have

  • 5–8 years of software engineering experience (2–3 in AI/ML)
  • Production LLM apps (RAG/agents/tool-use/fine-tuning)
  • Production multi-agent systems
  • Agent observability
  • LangChain/LangGraph/Bedrock AgentCore
  • Strong Python
  • MLOps pipelines
  • Transformers/embeddings/vector DBs
  • Cloud + K8s

Nice-to-Have

  • Cybersecurity background (significant plus)
  • Networking (SDN/BGP)
  • Model eval (LLM-as-judge/RAGAS)
  • MCP
  • Telecom/enterprise SaaS
  • Publications/OSS in GenAI

Stack

Python, PyTorch, OpenAI/Anthropic APIs, LangChain, LangGraph, AWS Bedrock AgentCore, LangSmith, Kubernetes, Kafka, Elasticsearch, AWS, PostgreSQL
DriveNets