
Senior C++ Engineer – AI Inference & Runtime Performance

Overview
Skills
  • C
  • C++
  • low-level systems
  • vector processing
  • SIMD
  • real-time systems
  • performance profiling
  • memory hierarchy
  • embedded systems
  • DSP
  • debugging
  • caches
  • LLMs
  • AI accelerator programming
  • Neural network inference optimization
  • OpenCL
  • CUDA
  • Compiler development
  • SDK development
  • signal processing
  • CNNs
  • toolchain development
  • Vision Transformers

Senior C++ Engineer – AI Inference & Runtime Performance | Tel Aviv

Most AI roles ask you to call PyTorch APIs.

This one asks you to understand what's happening beneath them.

At GSI Technology (NASDAQ: GSIT), we're building the Gemini® APU — a compute-in-memory processor purpose-built to accelerate LLMs, vision models, and advanced signal processing with fundamentally different silicon.

There's no existing playbook for what we're doing. If that excites you, keep reading.

🔍 Why this role is different

The gap between modern AI models and novel hardware doesn't close itself.

You'll be the engineer who closes it — by working where C++, computer architecture, and real AI workloads intersect at a level most engineers never reach.

You won't be fine-tuning models.

You'll be deciding how they execute on hardware that didn't exist two years ago.

⚙️ What you'll build

  • Implement and optimize AI workloads in modern C++ — from Python reference models to high-performance runtime implementations
  • Adapt LLMs, CNNs, and Vision Transformers to a unique compute-in-memory execution model
  • Identify and eliminate memory access, latency, and throughput bottlenecks
  • Design software libraries and runtime infrastructure for AI execution on novel silicon
  • Work directly with Hardware Architects — and shape future chip capabilities through software-driven insights

✅ What we need

  • B.Sc. in Computer Science, Electrical Engineering, or equivalent
  • 6+ years of software development experience
  • Strong C/C++ — you're comfortable thinking in cache lines and memory access patterns
  • Solid grasp of: memory hierarchy, caches, SIMD/vector processing, performance profiling
  • Experience close to hardware: embedded systems, DSP, low-level or real-time systems
  • Sharp debugger. Fast learner. High ownership.

⭐ Strong bonus if you bring

  • Neural network inference optimization experience
  • CUDA, OpenCL, or AI accelerator programming
  • Compiler, SDK, or toolchain development
  • Familiarity with LLMs, Vision Transformers, or CNNs
  • DSP / signal processing background


📍 Tel Aviv, Ramat Hahayal | Full-Time

💰 Competitive compensation + (NASDAQ: GSIT)

Curious about the technical problem before you're ready to apply?

Reach out — we're happy to start with a conversation.

GSI Technology