DevJobs

Deep Learning Software Engineer

Overview
Skills
  • Deep learning
  • Azure
  • GCP
  • AWS
  • Docker
  • Kubernetes
  • Containerization-related technologies
  • CUDA
  • Cloud Computing Platforms
  • GPU Programming
  • GStreamer
  • ROS
  • Stream Processing Frameworks
  • Triton
  • Unix-based OS internals
  • Software Engineering
We’re looking for a deep learning software engineer who is versatile, curious, independent, and keen to embark on real challenges. As part of the Deci AI Inference Team, you will play a crucial role in optimizing and deploying deep learning models for real-time inference in various applications. You will work closely with our research scientists, software engineers, and product team to ensure efficient and accurate model deployment - from microprocessors to multi-accelerator cloud instances. You will utilize cutting-edge hardware and state-of-the-art models, all while practicing software development best practices.

To succeed in this role, you must possess a thorough understanding of both software engineering principles and deep learning theory. Your contributions will encompass the development of Deci’s core products - enabling graph compilation, runtime optimization, model deployment, and more - all aimed at squeezing the most out of Deci’s customers’ hardware.

Requirements:

  • Familiarity with highly concurrent systems and their SW stacks - GPUs, DL accelerators, CUDA, Triton, etc.
  • Extensive experience deploying deep neural models to production settings - on cloud or edge devices
  • Track record of profile-based performance analysis, methodical discovery of bottlenecks, and maximizing overall hardware utilization
  • Knowledge of common SOTA deep learning architectures, their pre- and post-processing transformations, and the relevance of these transformations to different deep learning tasks
  • A deep understanding of the transformer architecture, the latest attention mechanisms (FlashAttention, PagedAttention, …), and the LLM optimization and serving space is a large bonus
  • Familiarity with cloud computing platforms (AWS, GCP, Azure) and knowledge of containerization-related technologies (Docker, Kubernetes, containers)

Preferred qualifications:

  • Familiarity with highly concurrent systems, GPU programming, CUDA, and the CUDA Toolkit
  • Familiarity with stream processing frameworks (GStreamer, ROS, …) and the efficient data-management techniques they leverage
  • Deep understanding of Unix-based OS internals - from process and thread management to the virtual memory system and file systems

Responsibilities:

  • Develop the core logic behind Deci’s DL development platform. Contribute to inference libraries and internal model optimization tools used by Deci’s customers, researchers, and algorithm teams
  • Develop strategies and infrastructure aimed at improving the reliability of Deci’s deep learning systems
  • Develop and deliver production-grade, high-throughput, real-time inference-enabling frameworks
  • Adopt and integrate cutting-edge research - in the fields of deep learning model optimization and deployment - into Deci products and tools
Deci AI