Tech Lead, Core Platform & ML Infrastructure - JFrog ML

Overview
Skills
  • Java
  • Go
  • Kubernetes
  • Distributed Systems (7 years)
  • Cloud-Native Infrastructure
  • Container Orchestration
  • Model Versioning
  • Optimization
  • Performance Debugging
  • Vector DBs
  • GPU Orchestration
  • ML Model Serving

At JFrog, we’re reinventing DevOps and MLOps to help the world’s greatest companies innovate, and we want you along for the ride. Thousands of customers, including most of the Fortune 100, trust JFrog to manage their software supply chains, a concept we call Liquid Software.

We are looking for a hands-on Tech Lead to join the Core Platform team within JFrog ML. Our engineering teams build the foundational systems behind global artifact storage, replication, and distribution, and these systems increasingly power the next generation of AI/ML operations and governance.

Our platform is the backbone for ML workloads: managing model binaries, versioning, and scalable runtime environments for ML and AI applications. This role combines deep distributed systems with modern ML infrastructure challenges such as high-throughput inference, safe model rollouts, and multi-cloud GPU efficiency. You will also help evolve core libraries and developer-facing tools, including logging, observability, and visibility components.

As a senior technical leader, you will influence architecture across squads, lead complex development efforts, and remain heavily hands-on.

As a Tech Lead in Core Platform at JFrog, you will…

  • Design and evolve components for managing and distributing ML/AI models and artifacts at scale
  • Extend the platform to support reliable, high-performance inference and training workflows
  • Lead cross-team technical initiatives and serve as a reference for distributed systems and ML infra design
  • Write maintainable, high-quality code in performance-critical areas
  • Mentor engineers and drive strong engineering practices
  • Collaborate with adjacent teams to ensure seamless end-to-end ML platform behavior
  • Improve the reliability, efficiency, and observability of core services

To be a Tech Lead in Core Platform at JFrog, you need…

  • 7+ years building large-scale backend or distributed systems
  • Strong foundation in distributed systems (consistency, replication, concurrency, fault tolerance)
  • Proficiency in Java, Go, or similar languages
  • Hands-on experience with high-performance, scalable, and reliable systems
  • Ability to lead design discussions and influence technical direction across teams
  • Curiosity and willingness to work with ML systems and workload patterns
  • Experience with Kubernetes, container orchestration, or cloud-native infrastructure
  • Ability to thrive in a collaborative, ownership-driven engineering culture

Bonus Points

  • Experience with ML model serving, vector DBs, model versioning, or GPU orchestration
  • Background in secure software supply chain workflows
  • Strong performance debugging and optimization skills