Applied RL Scientist

Overview

Job Type: Hybrid

Experience: unknown

Job Position: AI/ML

Updated: Jun 17, 2026

Location: Israel

Salary: N/A

Skills

Python
PyTorch
Linear Algebra
Optimization
Probability
Reinforcement Learning
Statistics
Distributed Training
DPO
GRPO
OpenRLHF
PPO
RLHF
RLVR
TRL
verl

About the Job

Innodata's Frontier AI teams are pushing the boundaries of reinforcement learning applications, in particular RLVR (Reinforcement Learning with Verifiable Rewards) and RL Gyms, to train, evaluate, and stress-test the world's most advanced AI models and agents. We're hiring an Applied RL Scientist to join our leading researchers, chief scientist, and VP for AI in designing the algorithmic core of these systems and the implementation frameworks of RL environments, and in turning cutting-edge research ideas into shipped pipelines on short timescales.

You will work side by side with our research team to design reward models, training objectives, data-generation strategies, and evaluation methodologies. You'll prototype them in code, run rigorous experiments, and collaborate with engineers to deploy what works into production. This is an applied, research-heavy role for someone who can read a paper on Thursday and have a working implementation by Sunday.

What You'll Do

Help steer the algorithmic direction of our RL training environments, evaluation, and data-generation workflows.
Translate research ideas into working code—both internal prototypes and production-grade pipelines.
Design reward models, verifiers, and evaluation harnesses with defensible properties.
Run experiments, rigorously analyze results, and use findings to drive the next iteration.
Partner with engineers to operationalize the right algorithms at scale.
Stay current on the RL, post-training, and evaluation literature, and bring the most useful ideas into production quickly.

What You'll Bring

PhD (preferred) or MSc in Computer Science, Mathematics, Statistics, Machine Learning, or a related field.
Strong research background in reinforcement learning, ideally including exposure to RLHF, RLVR, DPO, or other post-training methods.
Hands-on experience implementing RL algorithms from scratch (PPO, GRPO, DPO, or similar).
Strong Python and PyTorch skills—comfortable writing custom training loops, not just using high-level wrappers.
Solid mathematical foundations: probability, statistics, optimization, linear algebra.
A track record of taking research from ideas to working code quickly.
Excellent English communication—you can explain a method clearly to engineers and a result clearly to partners.
Creativity and strong problem-solving skills.

Bonus Points

Publications at top ML venues (NeurIPS, ICML, ICLR, ACL, EMNLP).
Experience designing reward models, verifiers, or evaluation methodologies for LLMs.
Familiarity with distributed training infrastructure and large-scale experiments.
Open-source contributions to RL or LLM post-training libraries (TRL, OpenRLHF, verl, etc.).
Experience working closely with engineering teams to ship research into production.

Innodata
