DevJobs

ML Real Time Optimization Engineer

Skills
  • Python ꞏ 3y
  • PyTorch
  • TensorFlow
  • model compression
  • model distillation
  • pruning
  • quantization
  • TensorFlow Lite
Job Description:

Q's Machine Learning team is expanding, and we're looking for a Real-Time Optimization Machine Learning Engineer. In this role, you'll be responsible for optimizing our models to run efficiently on embedded devices. By reducing computational requirements and improving inference speed, you'll directly influence how users interact with their devices and play a key role in advancing our technology.

This position involves close collaboration and co-design with both software and hardware teams, making you the critical link between ML, software, and hardware divisions. To excel in this role, you'll need a deep understanding of ML models and the hardware they run on.

Responsibilities:

  • Develop new performance optimization techniques, or apply existing ones, for on-device AI
  • Apply knowledge and research to advance the state-of-the-art in on-device machine learning frameworks
  • Collaborate with ML researchers and software and hardware developers to deliver unique, proprietary, high-performance solutions

Qualifications:

  • 3+ years of hands-on experience coding in Python
  • Proficiency with model distillation, quantization, pruning, and related model compression techniques
  • Extensive knowledge of PyTorch, TensorFlow, and TensorFlow Lite
  • Proven track record in developing or optimizing DL/ML algorithms for embedded devices

*This role is fully onsite*
Q.ai