Research Engineer

Skills
  • C++
  • C
  • Rust
  • Deep learning ꞏ 5y
  • TensorFlow
  • PyTorch
  • Vision models
  • Triton
  • Quantization
  • Performance profiling
  • Optimization
  • Model serving
  • Memory optimization
  • Language models
  • Inference pipelines
  • Distributed training
  • Distillation
  • Diffusion models
  • CUDA
  • Compilation

We are looking for a Software Engineer specializing in Deep Learning to join Final’s research department.

While vision and language models have become increasingly commoditized, Final’s proprietary deep learning models are unique, fast-evolving, and deployed in live trading across the world’s most efficient and sophisticated financial markets. Operating in this environment presents distinct scaling challenges and continuous opportunities for optimization. Success in this role requires first-principles thinking and a deep understanding of the engineering trade-offs behind high-performance DL systems.

This is a pivotal role within Final’s research organization. You will work closely with researchers and engineers across the company, training deep learning models on massive compute clusters and adapting them for production serving under strict and non-trivial constraints.

Requirements:

  • B.Sc. with honors in CS/EE/Math/Physics or a related field, from a top-tier university
  • 5+ years of hands-on experience building and deploying large-scale deep learning systems in production
  • Advanced proficiency in PyTorch/TensorFlow

Preferred Qualifications:

  • M.Sc. or Ph.D. in a relevant quantitative field
  • Proficiency in C/C++/Rust
  • Deep, working knowledge of PyTorch internals

Strong experience in several of the following areas:

  • Performance profiling and optimization of deep learning workloads
  • Implementing custom CUDA/Triton kernels
  • Orchestrating and optimizing large-scale distributed training (hundreds to thousands of GPUs)
  • Optimizing model serving and inference pipelines (quantization, distillation, compilation, memory optimization, etc.)
  • Training and scaling state-of-the-art vision, language, or diffusion models
