AI Architect — Office of the CTO
Location: Tel Aviv / Ramat Gan (Hybrid) or Remote | Department: Technology, Office of the CTO | Reports to: Chief Technology Officer
⸻
Why Kubiya AI?
Kubiya is building the agentic operating system for engineering, cloud and DevOps teams.
Our platform stitches together open-source LLMs, a Mixture-of-Experts runtime, infrastructure-native tooling (Kubernetes, Terraform, OTEL, JetStream) and a policy-first security layer to turn a single prompt into a fully auditable, production-ready workflow. Fortune 100 design partners already collapse 300-step runbooks into “spin up an environment” conversations: no glue code, no pager fatigue.
⸻
The Opportunity
As AI Architect in the CTO’s office, you will be the technical owner of everything model-powered inside Kubiya:
• Architecture – Design the end-to-end pipeline that ingests org context, routes to the right expert model, executes code in sandboxed containers, and feeds rich telemetry back into our continuous-learning loop (a minimal sketch of this loop follows the list below).
• Model Strategy – Decide when we fine-tune open-source Llama-3 vs. hot-swap to Bedrock or Vertex; benchmark MoE routers for latency and cost; champion vLLM/Triton for GPU efficiency.
• MLOps at Scale – Own versioning, lineage, policy gating and rollback of models and inline tools. Ship deterministic, reproducible releases that DevSecOps trusts.
• Tooling & Integrations – Work with backend and platform leads to expose new model endpoints through our Model Context Protocol (MCP) so agents can compose actions across GitHub, Jira, Terraform, Prometheus and more — without one-off plugins.
• Thought Leadership – Partner with the CTO on the technical roadmap, publish internal RFCs, mentor engineers and evangelize best practices across the company and open-source community.
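To make the architecture bullet concrete, here is a minimal Python sketch of the prompt-to-workflow loop described above: route a task to an expert model, (notionally) execute it, and emit telemetry. Every name in it (Expert, ExpertRouter, run_workflow, emit_telemetry) is illustrative rather than a Kubiya API, and a production MoE router would score on latency and quality as well as cost.

```python
from dataclasses import dataclass

@dataclass
class Expert:
    name: str
    handles: set[str]            # task tags this expert model serves
    cost_per_1k_tokens: float

@dataclass
class ExpertRouter:
    experts: list[Expert]

    def route(self, task_tag: str) -> Expert:
        # Pick the cheapest expert that can handle the task tag; a real
        # MoE router would also weigh latency and expected quality.
        candidates = [e for e in self.experts if task_tag in e.handles]
        if not candidates:
            raise LookupError(f"no expert registered for {task_tag!r}")
        return min(candidates, key=lambda e: e.cost_per_1k_tokens)

def emit_telemetry(event: dict) -> None:
    # Stand-in for an OTEL span export feeding the continuous-learning loop.
    print("telemetry:", event)

def run_workflow(prompt: str, task_tag: str, router: ExpertRouter) -> dict:
    expert = router.route(task_tag)
    # In production, the plan the expert produces would run inside a
    # sandboxed container; here we only record the routing decision.
    result = {"prompt": prompt, "expert": expert.name, "status": "ok"}
    emit_telemetry(result)
    return result

router = ExpertRouter([
    Expert("llama3-devops", {"terraform", "k8s"}, 0.4),
    Expert("bedrock-general", {"terraform", "k8s", "chat"}, 1.2),
])
print(run_workflow("spin up an environment", "terraform", router))
```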
⸻
What You’ll Do
• Craft cloud-native, microservice architectures for training, fine-tuning and real-time inference (AWS/GCP/Azure, Kubernetes, JetStream).
• Define SLOs for p95 agent latency, model success rate, and telemetry coverage; instrument with OTEL, Prometheus and custom reward models (see the instrumentation sketch after this list).
• Drive our continuous-learning loop: reward modeling, ContextGraph enrichment, auto-tuning MoE routers.
• Embed least-privilege IAM and OPA/ABAC policy checks into every stage of the model lifecycle (see the policy-gating sketch after this list).
• Collaborate with product managers to translate customer pain points into roadmap items, and with design partners to validate solutions in production.
• Mentor a cross-functional squad of backend engineers, ML engineers and data scientists.
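To ground the SLO bullet, here is a minimal sketch of p95 latency instrumentation using the prometheus_client library; the metric name, label, and bucket boundaries are illustrative assumptions, not Kubiya conventions.

```python
import time
from prometheus_client import Histogram, start_http_server

AGENT_LATENCY = Histogram(
    "agent_request_seconds",                       # hypothetical metric name
    "End-to-end agent request latency in seconds",
    ["model"],
    buckets=(0.1, 0.25, 0.5, 1, 2, 5, 10),
)

def handle_request(model: str) -> None:
    start = time.perf_counter()
    try:
        ...  # route the prompt, run inference, execute the workflow
    finally:
        AGENT_LATENCY.labels(model=model).observe(time.perf_counter() - start)

if __name__ == "__main__":
    start_http_server(9000)   # exposes /metrics for Prometheus to scrape
    handle_request("llama3-devops")
    # The p95 SLO is then read in PromQL with:
    #   histogram_quantile(0.95,
    #     sum(rate(agent_request_seconds_bucket[5m])) by (le, model))
    time.sleep(60)  # keep the process alive long enough to be scraped
```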
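And a hedged sketch of the OPA policy gating mentioned above: OPA's POST /v1/data/&lt;path&gt; Data API is standard, but the policy path models/release/allow and the input fields are hypothetical.

```python
import requests

def release_allowed(model: str, stage: str,
                    opa_url: str = "http://localhost:8181") -> bool:
    # Ask OPA for a decision; the policy path below is illustrative.
    resp = requests.post(
        f"{opa_url}/v1/data/models/release/allow",
        json={"input": {"model": model, "stage": stage}},
        timeout=5,
    )
    resp.raise_for_status()
    # OPA returns {"result": <decision>}; default-deny if the key is absent.
    return bool(resp.json().get("result", False))

if release_allowed("llama3-devops-v7", "prod"):
    print("release approved")
else:
    print("release blocked by policy")
```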
⸻
What You Bring
• 8+ years in software engineering, with 3+ years architecting large-scale backend systems (Python, Go, Java or similar).
• 4+ years designing, deploying and monitoring AI/ML systems in production.
• Deep expertise in at least one of: large-language-model serving, MoE routing, RLHF, vector search, streaming inference.
• Hands-on fluency with Kubernetes, Docker, CI/CD, IaC (Terraform/Helm) and distributed data technologies (Kafka, Spark, Arrow).
• Proven MLOps track record (MLflow, Kubeflow, SageMaker, or similar) and a security-first mindset.
• Ability to turn ambiguous business goals into a crisp, scalable architecture — and to communicate that vision to both executives and engineers.
⸻
Nice-to-Haves
• PhD or publications in ML/NLP/Systems.
• Contributions to open-source LLM or MLOps projects.
• Experience pushing real-time inference to the edge or FPGA/ASIC accelerators.
• Prior leadership of cross-functional AI/ML teams in a fast-growing startup environment.
⸻
The Way We Work