Senior ML Research Engineer

Overview
Skills
  • LLMs
  • transformer-based models
  • tensor parallelism
  • synthetic data generation
  • seq2seq models
  • retrieval-augmented
  • quantization
  • preference learning
  • pipeline parallelism
  • PEFT
  • multi-node clusters
  • multi-GPU
  • LoRA
  • caching
  • instruction tuning
  • FSDP
  • encoder-decoder
  • domain adaptation
  • distillation
  • DeepSpeed
  • decoder-only
  • data pipelines
  • data augmentation
  • contrastive learning
  • continual pre-training

Your Team

Join Check Point’s AI research group, a cross-functional team of ML engineers, researchers, and security experts building the next generation of AI-powered security capabilities. Our mission is to leverage large language models to understand code, configuration, and human language at scale, and to turn that understanding into the AI capabilities that will drive Check Point’s future security solutions.

We foster a hands-on, research-driven culture where you’ll work with large-scale data and modern ML infrastructure on products with a global footprint, impacting over 100,000 organizations worldwide.

Your Impact & Responsibilities

As a Senior ML Research Engineer, you will be responsible for the end-to-end lifecycle of large language models: from data definition and curation, through training and evaluation, to delivering robust models that product and platform teams can consume.

  • Own training and fine-tuning of LLMs / seq2seq models: design and execute training pipelines for transformer-based models (encoder-decoder, decoder-only, retrieval-augmented, etc.), and fine-tune open-source LLMs on Check Point–specific data (security content, logs, incidents, customer interactions).
  • Apply advanced LLM training techniques such as instruction tuning, preference / contrastive learning, LoRA / PEFT, continual pre-training, and domain adaptation where appropriate.
  • Work deeply with data: define data strategies with product, research and domain experts; build and maintain data pipelines for collecting, cleaning, de-duplicating and labeling large-scale text, code and semi-structured data; and design synthetic data generation and augmentation pipelines.
  • Build robust evaluation and experimentation frameworks: define offline metrics for LLM quality (task-specific accuracy, calibration, hallucination rate, safety, latency and cost); implement automated evaluation suites (benchmarks, regression tests, red-teaming scenarios); and track model performance over time.
  • Scale training and inference: use distributed training frameworks (e.g. DeepSpeed, FSDP, tensor/pipeline parallelism) to efficiently train models on multi-GPU / multi-node clusters, and optimize inference performance and cost with techniques such as quantization, distillation and caching.
  • Collaborate closely with security researchers and data engineers to turn domain knowledge and threat intelligence into high-value training and evaluation data, and to expose your models through well-defined interfaces to downstream product and platform teams.
Check Point Software Technologies