
Senior ML Research Engineer

Overview
Skills
  • Python
  • PyTorch
  • TensorFlow
  • Spark
  • CI/CD
  • Airflow
  • Transformer architectures
  • Code review
  • Tokenization
  • Testing
  • Version control
  • Distributed training
  • Positional encodings
  • Experiment tracking
  • Attention mechanisms
  • Job orchestration
  • Weights & Biases
  • Vector databases
  • Quantization
  • Retrieval-augmented generation
  • RLHF
  • SageMaker
  • LoRA
  • PEFT
  • MLflow
  • Kubeflow
  • FSDP
  • Embedding training
  • Distillation
  • Dense retrieval
  • DeepSpeed
  • Dask
  • Caching
  • BERT
  • Beam
  • Argo
Why Join Us?

Join Check Point’s AI research group, a cross-functional team of ML engineers, researchers, and security experts building the next generation of AI-powered security capabilities. Our mission is to leverage large language models to understand code, configuration, and human language at scale, and to turn that understanding into AI capabilities that will drive Check Point’s future security solutions.

We foster a hands-on, research-driven culture where you’ll work with large-scale data, modern ML infrastructure, and a global product footprint that impacts over 100,000 organizations worldwide.

Key Responsibilities

As a Senior ML Research Engineer, you will be responsible for the end-to-end lifecycle of large language models: from data definition and curation, through training and evaluation, to providing robust models that can be consumed by product and platform teams.

  • Own training and fine-tuning of LLMs / seq2seq models: Design and execute training pipelines for transformer-based models (encoder-decoder, decoder-only, retrieval-augmented, etc.), and fine-tune open-source LLMs on Check Point–specific data (security content, logs, incidents, customer interactions).
  • Apply advanced LLM training techniques such as instruction tuning, preference / contrastive learning, LoRA / PEFT, continual pre-training, and domain adaptation where appropriate.
  • Work deeply with data: define data strategies with product, research and domain experts; build and maintain data pipelines for collecting, cleaning, de-duplicating and labeling large-scale text, code and semi-structured data; and design synthetic data generation and augmentation pipelines.
  • Build robust evaluation and experimentation frameworks: define offline metrics for LLM quality (task-specific accuracy, calibration, hallucination rate, safety, latency and cost); implement automated evaluation suites (benchmarks, regression tests, red-teaming scenarios); and track model performance over time.
  • Scale training and inference: use distributed training frameworks (e.g. DeepSpeed, FSDP, tensor/pipeline parallelism) to efficiently train models on multi-GPU / multi-node clusters, and optimize inference performance and cost with techniques such as quantization, distillation and caching.
  • Collaborate closely with security researchers and data engineers to turn domain knowledge and threat intelligence into high-value training and evaluation data, and to expose your models through well-defined interfaces to downstream product and platform teams.
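To give a flavor of the LoRA / PEFT techniques named in the responsibilities above, here is a minimal, illustrative NumPy mock-up (this is not the `peft` library API; the dimensions and rank `r = 8` are arbitrary assumptions) of how a frozen pretrained weight gains a trainable low-rank update:

```python
import numpy as np

# Minimal LoRA-style sketch: the frozen weight W is augmented with a
# scaled low-rank update (alpha / r) * B @ A, so only A and B
# (r * (d_in + d_out) parameters) would be trained.
rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 8, 16

W = rng.normal(size=(d_out, d_in))          # frozen pretrained weight
A = rng.normal(scale=0.01, size=(r, d_in))  # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-init

def lora_forward(x):
    # Base path plus scaled low-rank update; because B is zero-initialized,
    # the adapted layer matches the pretrained layer exactly at step 0.
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.normal(size=(4, d_in))
assert np.allclose(lora_forward(x), x @ W.T)  # identity at initialization
```

The zero-initialization of `B` is the standard LoRA design choice: fine-tuning starts from the pretrained behavior and only gradually departs from it, while the trainable parameter count stays a small fraction of the full weight matrix.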

Qualifications

What You Bring

  • 5+ years of hands-on work in machine learning / deep learning, including 3+ years focused on NLP / language models.
  • Proven track record of training and fine-tuning transformer-based models (BERT-style, encoder-decoder, or LLMs), not just consuming hosted APIs.
  • Strong programming skills in Python and at least one major deep learning framework (PyTorch preferred; TensorFlow also considered).
  • Solid understanding of transformer architectures, attention mechanisms, tokenization, positional encodings, and modern training techniques.
  • Experience building data pipelines and tools for large-scale text / log / code processing (e.g. Spark, Beam, Dask, or equivalent frameworks).
  • Practical experience with ML infrastructure, such as experiment tracking (Weights & Biases, MLflow or similar), job orchestration (Airflow, Argo, Kubeflow, SageMaker, etc.), and distributed training on multi-GPU systems.
  • Strong software engineering practices: version control, code review, testing, CI/CD, and documentation.
  • Ability to own research and engineering projects end-to-end: from idea, through prototype and controlled experiments, to models ready for integration by product and platform teams.
  • Good communication skills and the ability to work closely with non-ML stakeholders (security experts, product managers, engineers).
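As context for the transformer fundamentals listed above, scaled dot-product attention fits in a few lines of NumPy. This is an illustrative sketch only (shapes are arbitrary assumptions; real transformer stacks add multi-head projections, masking, and positional encodings):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)
    # Numerically stable softmax over the key axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 5, 16))  # (batch, seq_len, d_k)
K = rng.normal(size=(2, 5, 16))
V = rng.normal(size=(2, 5, 16))
out = scaled_dot_product_attention(Q, K, V)
assert out.shape == (2, 5, 16)
```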

Nice to have

  • Experience with RLHF / preference optimization, safety alignment, or other human-in-the-loop feedback approaches to training LLMs.
  • Experience with retrieval-augmented generation (RAG), dense retrieval, vector databases, and embedding training.
  • Background in security / cyber domains such as threat detection, malware analysis, logs, or SOC tools.
  • Experience with multilingual models (e.g., Hebrew + English) and cross-lingual training.
  • Experience in a product environment where models must meet reliability, scale, and cost constraints.

Why Join Us

  • Work at the intersection of cutting-edge AI and real-world cyber security, with immediate impact on global customers.
  • Own large-scale ML training and evaluation in a production setting, with a focus on research and model quality rather than agent development.
  • Collaborate with experienced ML engineers, researchers, and security experts in a fast-moving, supportive environment.
  • Access to modern GPU infrastructure and large, unique datasets from one of the world’s leading cyber security vendors.
Check Point Software Technologies