Why Kubiya AI?
Kubiya is building the agentic operating system for engineering, cloud and DevOps teams.
Our platform stitches together open-source LLMs, a Mixture-of-Experts runtime, infrastructure-native tooling (Kubernetes, Terraform, OTEL, JetStream) and a policy-first security layer to turn a single prompt into a fully auditable, production-ready workflow. Fortune-100 design partners already collapse 300-step runbooks into “spin up an environment” conversations: no glue code, no pager fatigue.
⸻
The Opportunity
As AI Team Lead in the CTO’s office, you will be the technical owner of everything model-powered inside Kubiya:
• Architecture – Design the end-to-end pipeline that ingests org context, routes to the right expert model, executes code in sandboxed containers, and feeds rich telemetry back into our continuous-learning loop.
• Model Strategy – Decide when to fine-tune open-source Llama 3 and when to hot-swap to Bedrock or Vertex; benchmark MoE routers for latency and cost; champion vLLM/Triton for GPU efficiency (a toy routing sketch follows this list).
• MLOps at Scale – Own versioning, lineage, policy gating and rollback of models and inline tools. Ship deterministic, reproducible releases that DevSecOps trusts.
• Tooling & Integrations – Work with backend and platform leads to expose new model endpoints through our Model Context Protocol (MCP) so agents can compose actions across GitHub, Jira, Terraform, Prometheus and more — without one-off plugins.
• Thought Leadership – Partner with the CTO on the technical roadmap, publish internal RFCs, mentor engineers and evangelize best practices across the company and open-source community.
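To make the routing bullet concrete, here is a deliberately tiny Python sketch of a latency- and cost-aware expert router. Every name in it (Expert, EXPERTS, route, the keyword signal, the numbers) is a hypothetical illustration, not Kubiya's actual API; a real MoE router would use a learned gating network rather than keyword overlap.

```python
# Toy sketch of a cost/latency-aware expert router (hypothetical, not Kubiya's API).
from dataclasses import dataclass

@dataclass
class Expert:
    name: str
    keywords: set[str]          # crude stand-in for a learned routing signal
    cost_per_1k_tokens: float   # illustrative numbers only
    p95_latency_ms: int

EXPERTS = [
    Expert("infra-llm", {"terraform", "kubernetes", "helm"}, 0.40, 900),
    Expert("code-llm", {"python", "refactor", "test"}, 0.60, 1200),
]
FALLBACK = Expert("hosted-general", set(), 1.20, 2500)  # e.g. a Bedrock/Vertex model

def route(prompt: str, latency_budget_ms: int) -> Expert:
    """Pick the cheapest expert whose signal matches and whose p95 fits the budget."""
    tokens = set(prompt.lower().split())
    candidates = [
        e for e in EXPERTS
        if e.keywords & tokens and e.p95_latency_ms <= latency_budget_ms
    ]
    if not candidates:
        return FALLBACK  # hot-swap to a hosted model when no expert qualifies
    return min(candidates, key=lambda e: e.cost_per_1k_tokens)

print(route("spin up a kubernetes environment with terraform", 1000).name)  # infra-llm
```

The trade-off the snippet encodes (route to the cheapest expert that fits the latency budget, otherwise fall back to a hosted model) is exactly the latency-vs-cost axis the benchmarking work in this role targets.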
⸻
What You’ll Do
• Craft cloud-native microservice architectures for training, fine-tuning and real-time inference (AWS/GCP/Azure, Kubernetes, JetStream).
• Define SLOs for p95 agent latency, model success rate, and telemetry coverage; instrument with OTEL, Prometheus and custom reward models (see the OTEL sketch after this list).
• Drive our continuous-learning loop: reward modelling, ContextGraph enrichment, auto-tuning MoE routers.
• Embed least-privilege IAM and OPA/ABAC policy checks into every stage of the model lifecycle (see the policy-gate sketch after this list).
• Collaborate with product managers to translate customer pain points into roadmap items, and with design partners to validate solutions in production.
• Mentor a cross-functional squad of backend engineers, ML engineers and data scientists.
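A minimal sketch of what the SLO instrumentation side can look like, assuming the standard opentelemetry-api package for Python. The span and metric names ("agent.step", "agent.latency") and the instrumentation scope are invented for illustration; with a Prometheus exporter wired into the MeterProvider, this histogram is what a p95 latency alert would scrape.

```python
# Minimal sketch: instrumenting one agent step with opentelemetry-api.
# Names ("kubiya.agent", "agent.step", "agent.latency") are illustrative,
# not an actual Kubiya telemetry schema.
import time

from opentelemetry import metrics, trace

tracer = trace.get_tracer("kubiya.agent")
meter = metrics.get_meter("kubiya.agent")
step_latency = meter.create_histogram(
    "agent.latency", unit="ms", description="End-to-end latency of one agent step"
)

def run_step(prompt: str) -> str:
    with tracer.start_as_current_span("agent.step") as span:
        span.set_attribute("agent.prompt_chars", len(prompt))
        start = time.monotonic()
        result = "ok"  # stand-in for the model call + sandboxed tool execution
        step_latency.record((time.monotonic() - start) * 1_000,
                            attributes={"agent.phase": "inference"})
        return result
```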
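And one way a policy gate can look, sketched against OPA's standard REST data API (POST /v1/data/<path> with an input document). The policy path models/deploy/allow and the input fields here are assumptions for illustration, not a real Kubiya policy.

```python
# Minimal sketch: gating a model promotion on an OPA decision via its REST API.
# The policy path (models/deploy/allow) and input shape are hypothetical.
import requests

OPA_URL = "http://localhost:8181/v1/data/models/deploy/allow"

def may_deploy(model_id: str, actor: str, stage: str) -> bool:
    resp = requests.post(
        OPA_URL,
        json={"input": {"model_id": model_id, "actor": actor, "stage": stage}},
        timeout=2,
    )
    resp.raise_for_status()
    # OPA returns {"result": true/false}; a missing result means the rule is
    # undefined, so we deny by default (least privilege).
    return resp.json().get("result", False)

if may_deploy("llama3-ft-2024-06", actor="ci-bot", stage="prod"):
    print("promote model")
else:
    print("blocked by policy")
```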
⸻
What You Bring
• 5+ years in software engineering, with 3+ years architecting large-scale backend systems (Python, Go, Java or similar).
• 4+ years designing, deploying and monitoring AI/ML systems in production.
• Deep expertise in at least one of: large-language-model serving, MoE routing, RLHF, vector search, streaming inference.
• Hands-on fluency with Kubernetes, Docker, CI/CD, IaC (Terraform/Helm) and distributed data technologies (Kafka, Spark, Arrow).
• Proven MLOps track record (MLflow, Kubeflow, SageMaker, or similar) and a security-first mindset.
• Ability to turn ambiguous business goals into a crisp, scalable architecture — and to communicate that vision to both executives and engineers.
⸻
Nice-to-Haves
• PhD or publications in ML/NLP/Systems.
• Contributions to open-source LLM or MLOps projects.
• Experience pushing real-time inference to the edge or FPGA/ASIC accelerators.
• Prior leadership of cross-functional AI/ML teams in a fast-growing startup environment.
⸻
The Way We Work
We value clarity, ownership, and velocity. You’ll have direct access to the CTO, autonomy to choose the right tech, and a front-row seat as we redefine how enterprises move “from prompt to production.” If building the Kubernetes of AI-driven operations excites you, let’s talk.