Large Scale Dataset Engineer

No longer accepting applications

Overview

Job Type: Hybrid

Experience: 6 years

Job Position: AI/ML

Updated: Sep 01, 2024

Location: Jerusalem District

Salary: N/A

Skills

Python · 6 years
PyTorch
TensorFlow
Kubernetes
Clustering
Computer Vision
Infrastructure-as-Code
ML Models
Statistics
JAX

Lightricks, an AI-first company, is revolutionizing how visual content is created. With a mission to bridge the gap between imagination and creation, Lightricks is dedicated to bringing cutting-edge technology to the creative and business spaces. Our AI photo and video generation models, which power our apps and platforms including Facetune, Photoleap, Videoleap, and LTX Studio, allow creators and brands to leverage the latest research breakthroughs, offering endless control over their creative potential. Our influencer marketing platform, Popular Pays, provides creators the ability to monetize their work and offers brands opportunities to scale their content through tailored creator partnerships.

The Core Generative AI team at Lightricks Research is a unified group of researchers and engineers dedicated to developing the generative foundational models that serve LTX Studio, our AI-based video creation platform. Our focus is on creating a controllable video generative model by merging cutting-edge algorithms with exceptional engineering. This involves enhancing the machine learning components of our internal training framework, which is crucial for developing advanced models. We specialize in both research and engineering that enable efficient and scalable training and inference, allowing us to deliver state-of-the-art AI-generated video models.

About The Role

As a Large Scale Dataset Engineer, you will play a key role in improving training efficiency by increasing both the quantity and quality of training data. This role demands excellent engineering skills for designing, implementing, and optimizing advanced data pipelines, alongside implementing robust machine learning and computer vision algorithms for data processing. Your expertise in optimizing the performance of distributed systems, understanding statistics, and eliminating bugs will be crucial, as our video training sets consist of extensive data volumes processed across numerous virtual machines.

This role is designed for individuals who are not only technically proficient but also deeply passionate about pushing the boundaries of AI and machine learning through innovative engineering and collaborative research.

What You Will Be Doing

Own and lead engineering projects focused on data acquisition, processing, clustering, evaluation and filtering.
Design algorithms for balancing and filtering training sets.
Develop high-performance and scalable distributed systems capable of handling petabytes of data.
Collaborate with researchers and product stakeholders to iteratively improve training sets based on model performance.

Your Skills And Experience

6+ years of experience with small- to large-scale ML experiments and multi-modal ML pipelines.
Strong software engineering skills, proficient in Python and experienced with Kubernetes and Infrastructure-as-Code.
Ability to develop and fine-tune computer vision and ML models for data evaluation and filtering.
Understanding of relevant topics in statistics and clustering.
Enjoys delving into system implementations to enhance performance and maintainability.
Background in PyTorch, JAX, TensorFlow, or similar technologies is a plus.

Lightricks
