Senior Software Engineer (ML), Data Plane

Skills
  • C
  • C++
  • PyTorch
  • DynamoDB
  • Linux
  • CUDA
  • JAX
  • RDMA
  • SGLang
  • TensorRT
  • TorchXLA
  • vLLM
Description

The MLIL DataPlane team is looking for a Senior Software Development Engineer to own the design and implementation of our inference data plane. We build the software that makes large models run efficiently on custom hardware, spanning model execution, memory management, data movement, and serving integration.

Our work covers the full inference path: integrating serving engines with custom hardware, developing high-performance compute kernels, enabling efficient data movement, and driving models from early validation through production. We operate at frontier scale with large distributed models.

This is a ground-up effort with rapidly evolving hardware and software. We need a senior IC who can write and optimize low-level code for custom hardware, validate model architectures end-to-end, build test and profiling infrastructure, and drive performance across the stack.

Key job responsibilities

  • Develop and optimize compute kernels for a custom ML accelerator architecture, targeting production-level performance for large language model inference.
  • Implement and validate LLM architectures (decoder-only, mixture-of-experts) end-to-end - from PyTorch model definition through distributed execution on custom hardware.
  • Integrate custom accelerator backends into open-source ML serving frameworks (vLLM, PyTorch), including scheduler extensions, memory management, and model parallelism.
  • Build and maintain test infrastructure for model correctness validation across CPU, GPU, simulator, and hardware targets.
  • Profile and optimize inference workloads: identify bottlenecks, instrument critical paths, and drive latency and throughput improvements from simulation through hardware bring-up.
  • Own features end-to-end: from design through implementation, testing, and integration into the broader software stack.
  • Contribute to CI/CD pipelines that gate model and kernel changes on correctness and performance regressions.
  • Mentor engineers, drive design reviews, and raise the engineering bar across the team.

Basic Qualifications

  • Bachelor's degree in computer science or equivalent
  • 7+ years of experience across the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Knowledge of Machine Learning and LLM fundamentals, including transformer architecture, training/inference lifecycles, and optimization techniques
  • Knowledge of computer architecture, operating systems, and parallel computing
  • Strong proficiency in C/C++
  • Strong Linux systems knowledge
  • Experience developing compute kernels for GPUs, DSPs, or custom accelerators
  • Proven track record of owning and delivering complex software features end-to-end

Preferred Qualifications

  • Knowledge of ML frameworks including JAX, PyTorch, vLLM, SGLang, Dynamo, TorchXLA, and TensorRT
  • Experience developing and deploying LLMs in production on GPUs, Neuron, TPUs, or other AI acceleration hardware, or experience writing CUDA kernels or other low-level ML kernels
  • Familiarity with speculative decoding, KV cache optimization, or other LLM serving optimizations
  • Experience with distributed systems - collective communication, RDMA, or high-speed interconnect programming
  • Experience with hardware simulation environments and model validation workflows
  • Demonstrated early adoption of AI-assisted development tools, such as using LLMs or code-generation agents as part of a daily workflow

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.


Company - Annapurna Labs Ltd.

Job ID: A10420847
Amazon Web Services (AWS)