AI Accelerator Software Engineer

AI Accelerator Software Engineer – Silicon Software & Low-Level AI

Overview

Job Type: Hybrid

Experience: 6 years

Job Position: Embedded

Updated: Jun 22, 2026

Location: Tel Aviv District

Salary: N/A

Skills

C ꞏ 6y
C++ ꞏ 6y
Assembly
Accelerator ꞏ 6y
Firmware ꞏ 6y
Embedded ꞏ 6y
Systems ꞏ 6y
Hardware-aware algorithm optimization
Memory hierarchies
Parallel execution
DMA
Systems-oriented reasoning
Caches
Bit-level reasoning
Bandwidth optimization
Performance-debug
Profiling
NPU programming
Low-level programming
HIP
Tracing
GPU programming
FPGA programming
Firmware development
DSP programming
DSP algorithms
Driver development
Deep learning infrastructure
Custom accelerator programming
CUDA
Compute kernel development
AI inference optimization

Most GPU engineers work within the limits of what NVIDIA decided.

Here, you decide the limits.

GSI Technology (NASDAQ: GSIT) is developing Gemini2 — an Associative Processing Unit built for ultra-low latency, high-parallelism AI execution. We're not building on top of someone else's stack. We're building the stack — and we need engineers who've been waiting for exactly this kind of problem.

🔬 The gap you'll close

Between modern AI models and novel compute-in-memory hardware lies a space that PyTorch can't see and CUDA can't reach — memory access patterns, DMA flows, instruction scheduling, and execution strategies that simply don't have a reference implementation yet.

That's your domain.

⚙️ What you'll build

Highly optimized compute kernels for Transformer inference, LLM/VLM execution, FFTs, OpenCV pipelines, and Edge AI workloads

Memory access patterns, DMA utilization, and instruction scheduling — tuned for silicon that didn't exist two years ago

Performance analysis pipelines using profilers, traces, and hardware analyzers, along with fixes for the bottlenecks they expose

Benchmarking infrastructure, internal tooling, and testing frameworks

Work directly with Architecture, Compiler, and AI teams — your kernel-level decisions shape how the next version of the chip gets designed

✅ What we need

B.Sc./M.Sc. in CS, EE, or equivalent

6+ years in low-level C/C++: embedded, firmware, accelerator, systems, or performance-critical software

Deep understanding of:

Memory hierarchies, caches, DMA, and bandwidth optimization

Parallel execution and performance-critical code

Hardware-aware algorithm optimization

Bit-level and systems-oriented reasoning

⭐ Strong bonus if you bring

GPU / NPU / DSP / FPGA or custom accelerator programming

Assembly or low-level programming experience

Compute kernel, firmware, or driver development

AI inference optimization or deep learning infrastructure

Profiling, tracing, and performance-debug experience

🎯 You're likely a strong fit if you've ever...

Written CUDA or HIP kernels — and wanted to go deeper than the driver allows

Spent days hunting a 3% latency regression in embedded firmware and felt satisfied when you found it

Looked at a DMA controller spec and felt curious, not scared

Worked on DSP algorithms and wondered what it'd feel like to do it for AI workloads

Had opinions about both sides of a hardware/software interface

📍 Tel Aviv, Ramat Hahayal | Full-Time | Hybrid

💰 Competitive compensation + (NASDAQ: GSIT)

Not sure if your background is the right fit? Reach out. We'd rather have the conversation.

GSI Technology
