Site Reliability Engineer | AI Infrastructure

Skills
  • TypeScript
  • Python
  • Linux
  • CI/CD
  • Azure
  • AWS
  • Kubernetes
  • Docker
  • Networking
  • Splunk
  • Terraform
  • Incident management
  • Datadog
  • CloudWatch
  • CloudFormation
  • Security
  • CDK
  • Evaluation pipelines
  • LLM API integration
  • Vector databases

Location: Tel Aviv, Israel (Hybrid, 3 days in-office)

Reports to: Head of AI Solutions and Innovations, JLL Technologies

About Us

JLL (NYSE: JLL) is a Fortune 500 commercial real estate company. JLL Technologies is its tech arm. The Tel Aviv office runs AI, data science, and ML work that feeds directly into how the business operates. Our teams build and run production AI systems used in decision-making across the company.

The Role

We're hiring for a new AI Engineering team in Tel Aviv, and you would be its first infrastructure hire. You will own the platform layer for the AI agents the team builds: deployment architecture, observability, and production reliability.

The team's first two projects: an agent that automates internal governance processes (vendor reviews, security questionnaires, tool provisioning), and an agent that helps engineering teams prepare for architecture reviews. Both integrate with external APIs (LLM providers, OneTrust, ServiceNow), handle structured decision logic, and manage sensitive data flows with audit requirements.

Highlights

  • Greenfield, but with real constraints. You're building on Azure/AWS with enterprise security requirements. The challenge is designing deployment and observability for LLM-backed services, where you need to track output quality, cost per invocation, and model drift.
  • Enterprise complexity, startup autonomy. You get the ownership and greenfield environment of a startup, with the integration challenges of a Fortune 200 company: connecting AI services to real enterprise systems.
  • More than infrastructure. Your core is SRE, but you'll also write agent code in TypeScript and Python, work with data pipelines, and ship features alongside the team.

What the Work Looks Like

AI Service Infrastructure - Design and maintain deployment and release infrastructure for AI agents. The stack is cloud-native (Azure/AWS), with services that call LLM APIs, connect to enterprise systems, and handle structured data.

Observability & Reliability - Build monitoring and observability for AI services. Make sure model response quality doesn't degrade silently: track errors, log cost spikes, and monitor upstream API changes.

Security & Compliance - These agents handle sensitive workflows with elevated security requirements. You will work with JLL's security team on standards, but you own how they're implemented in the infrastructure.

Developer Experience - Create tooling that makes it easy for the team to build, test, and deploy. The patterns you set become the team's defaults.

What We're Looking For

Required:

  • 5+ years in SRE, platform engineering, DevOps, or infrastructure roles, with experience owning infrastructure end-to-end
  • Strong experience with cloud platforms (Azure or AWS), containerization (Docker, Kubernetes), and CI/CD pipelines
  • Infrastructure-as-code experience (Terraform, CDK, or CloudFormation)
  • Monitoring and observability experience (Datadog, Splunk, CloudWatch, or similar)
  • Infrastructure fundamentals: Linux, networking, security
  • Incident management experience: on-call, production incidents, post-mortems
  • Comfortable working independently with broad ownership and high accountability
  • Strong written and verbal English for async collaboration with distributed teams

Preferred:

  • Experience with AI/ML infrastructure: model serving, LLM API integration, vector databases, or evaluation pipelines
  • Comfortable writing production code in TypeScript or Python, not just scripts
  • Experience building self-service developer tooling or internal platforms
  • Experience with cost optimization for cloud and API-based workloads
  • Security engineering experience, especially in enterprise or compliance-heavy environments

Working Here

  • Hybrid: 3 days in-office. ~30 engineers and data scientists in the TLV office.
  • Team: Distributed across Israel, Czechia, the US, India, and Singapore.
  • Language: The team operates in English. Hebrew is the office language but not required for the role.

Compensation and Process

Compensation includes base salary, RSUs (JLL is NYSE-listed, standard 4-year vest), keren hishtalmut (study fund), and an annual bonus. We share range details early in the process.

Process: An initial conversation with the hiring manager, a technical deep-dive (system design + live troubleshooting, no leetcode), and a team conversation. Typically 2-3 weeks end to end.