Senior ML Embedded Engineer

No longer accepting applications

Overview

Job Type: Hybrid

Experience: 5 years

Job Position: Embedded

Updated: Oct 16, 2025

Location: Tel Aviv District

Salary: N/A

Skills

C ꞏ 5y
C++ ꞏ 5y
Python
Computer Vision
AI pipelines
Memory-constrained programming
Performance optimization
Processor architecture
Vision-language models
Transformer architectures
RT-Embedded
Quantization
PyTorch Profiler
Pruning
OpenCL
NVIDIA Nsight
Kernel fusion
Hardware-specific SDKs
DSP
CUDA

Senior ML Embedded Engineer

Location: Ramat Hahayal, Tel Aviv

Employment Type: Full-time

Company: GSI Technology – A publicly traded, international high-tech company (NASDAQ: GSIT) developing the Gemini® Associative Processing Unit (APU) for compute-in-memory acceleration.

GSI is pioneering the Gemini APU, a processor designed to accelerate compute-intensive tasks such as large language models, machine learning, advanced image processing, and radar imaging.

If you're passionate about architecting high-performance software systems, implementing advanced algorithms, and drilling into low-level technical details, this is the role for you.

We’re seeking a dynamic and fast-learning engineer with a passion for diving deep into large language model implementations, and a keen focus on performance optimization and efficient execution.

Position Overview

We are seeking a highly skilled and motivated Senior ML Embedded Software Engineer to lead the development and optimization of AI models, including Large Language Models (LLMs) and Vision Language Models (VLMs), on GSI’s proprietary APU. This role bridges high-level machine learning understanding with low-level system and performance engineering, primarily in Python, C, and C++. You will be responsible for architecting, implementing, and optimizing AI pipelines under hardware constraints, with a strong emphasis on computer vision and transformer architectures.

Key Responsibilities

Develop and optimize software libraries for CNN, LLM, and VLM implementations on embedded hardware.
Design end-to-end system flows integrating AI models, especially in computer vision domains.
Lead performance tuning efforts under constraints such as memory, compute, and latency.
Work closely with hardware teams to co-design software optimized for GSI’s APU.
Debug and optimize AI inference pipelines, including Python-based pre/post-processing where applicable.
Team up across disciplines to turn wild ideas into reliable, high-performance code.
Architect and develop a high-performance AI compiler framework for deploying quantized neural networks on the GSI Gemini edge platform, enabling advanced edge AI workloads and optimizing for low-latency inference, efficient hardware utilization, and seamless integration with hardware acceleration pipelines.

Required Qualifications

B.Sc. in Computer Science or Electrical Engineering from a leading university.
5+ years of experience in embedded software development using C++ and C.
Solid experience in one or more of the following: Computer Vision, RT-Embedded, DSP.
Proven experience in developing and optimizing AI pipelines under performance, memory, and latency constraints.
Proven track record in performance/memory-constrained programming.
Strong communication skills, analytical mindset, and attention to detail.
Independent, solution-oriented, and highly motivated to make things happen.
Proven track record developing and optimizing software algorithms with deep consideration for hardware architecture, memory bandwidth, and system constraints.
Strong understanding of processor architecture fundamentals: caches, pipeline stages, execution units, and memory hierarchies.
Ability to interpret detailed hardware specifications and translate them into robust, efficient software solutions.

Preferred Qualifications

Practical experience with transformer architectures and/or vision-language models (VLMs).
Deep knowledge of computer vision pipelines and multimodal systems.
Experience designing complex software systems from concept to deployment.
Familiarity with hardware-aware optimization techniques such as quantization, pruning, and kernel fusion.
Experience with performance profiling tools (e.g., PyTorch Profiler, NVIDIA Nsight).
Low-level optimization experience with CUDA, OpenCL, or hardware-specific SDKs.

Privacy Statement

All applications will be handled with strict confidentiality. Your information will not be shared without your consent.

GSI Technology
