
Data Engineer

Skills
  • SQL ꞏ 5y
  • Python ꞏ 5y
  • Scala
  • Java
  • Kafka ꞏ 3y
  • RDBMS ꞏ 5y
  • Design Patterns
  • Kubernetes
  • Apache Spark ꞏ 5y
  • Trino ꞏ 5y
  • Presto ꞏ 5y
  • Non-relational databases ꞏ 5y
  • Lakehouse ꞏ 5y
  • ETL ꞏ 5y
  • Data warehouse ꞏ 5y
  • Cloud Functions ꞏ 3y
  • Vertex ꞏ 3y
  • Task queues ꞏ 3y
  • Pub/Sub ꞏ 3y
  • Streaming technologies ꞏ 3y
  • Stream processing technologies ꞏ 3y
  • Asynchronous programming ꞏ 3y
  • Spark Streaming ꞏ 3y
  • Sagemaker ꞏ 3y
  • Kubeflow ꞏ 3y
  • Iceberg
  • Software engineering concepts
  • Complex data sets
  • Data modeling
  • Paimon
  • Distributed systems
  • Google Cloud Dataproc
  • Amazon EMR
  • Unstructured data

About Us:

At AUI, we are excited to introduce Apollo, our breakthrough language model. Built on a neuro-symbolic architecture, Apollo makes reliable conversational agents possible: it enables the native tool use and controllability that transformer-based agents lack, and it unlocks fine-tuning for agents, allowing continuous evolution from human feedback and ever-improving performance for conversational agents of any kind. We are seeking an experienced Data Engineer to help build the data infrastructure behind these products.


Who are you?

You are a seasoned Data Engineer with a deep understanding of data modeling, massively parallel processing (both real-time and batch), and bringing machine learning capabilities into large-scale production systems. You have experience at a cutting-edge startup and are passionate about building the data infrastructure that fuels the world’s first intelligent agent. You are a team player with excellent collaboration and communication skills and a “can-do” approach.


What will you be doing?

  • Build, maintain, and scale data pipelines for both batch and real-time data processing across multiple sources and ecosystems.
  • Design and implement robust APIs and integrate diverse data systems to support data collection and aggregation.
  • Develop and manage advanced data architectures, including lakehouses, streamhouses, and data warehouses.
  • Collaborate with data scientists and other stakeholders to implement effective data solutions and integrate large language models (LLMs) into our systems.
  • Work with cross-functional teams to define business needs and translate them into technical implementations that leverage your deep understanding of data architectures and software engineering best practices.
  • Develop and lead initiatives to manage, monitor, and debug data systems, enhancing their reliability, efficiency, and overall quality.


What should you have?

  • 5+ years of experience in designing and managing sophisticated lakehouse and data warehouse architectures, ensuring scalable, efficient, and reliable data storage solutions.
  • 5+ years of experience building and maintaining ETLs using Apache Spark.
  • 3+ years of experience working with streaming technologies (e.g., Apache Kafka, Pub/Sub) and implementing real-time data pipelines using stream processing technologies (e.g., Spark Streaming, Cloud Functions).
  • 5+ years of experience with SQL and distributed query engines such as Presto and Trino, with a strong focus on analyzing and optimizing query plans to develop efficient and complex queries.
  • 3+ years of experience developing APIs using Python, with proficiency in asynchronous programming and task queues.
  • Proven expertise in deploying and managing Spark applications on enterprise-grade platforms such as Amazon EMR, Kubernetes (K8S), and Google Cloud Dataproc.
  • Solid understanding of distributed systems and experience with open file formats such as Paimon and Iceberg.
  • 3+ years of experience developing infrastructures that bring machine learning capabilities to production, using solutions such as Kubeflow, Sagemaker, and Vertex.
  • 5+ years of experience writing production-grade Python code and working with both relational and non-relational databases.
  • Solid understanding of software engineering concepts, design patterns, and best practices, with the ability to architect solutions and integrate different system components.
  • Proven experience working with unstructured data, complex data sets, and data modeling.
  • Advantage – Demonstrated experience orchestrating containerized applications in AWS and GCP using EKS and GKE.
  • Advantage – Proficiency in Scala and Java.