
Data Engineer - AI Infra Group

Dream is a pioneering AI cybersecurity company delivering revolutionary defense through artificial intelligence. Our proprietary AI platform creates a unified security system safeguarding assets against existing and emerging generative cyber threats. Dream's advanced AI automates discovery, calculates risks, performs real-time threat detection, and plans an automated response. With a core focus on the "unknowns," our AI transforms data into clear threat narratives and actionable defense strategies.

Dream's AI cybersecurity platform represents a paradigm shift in cyber defense, employing a novel, multi-layered approach across all organizational networks in real-time. At the core of our solution is Dream's proprietary Cyber Language Model (CLM), a groundbreaking innovation that provides real-time, contextualized intelligence for comprehensive, actionable insights into any cyber-related query or threat scenario.

We are looking for a Data Engineer who thrives at the intersection of scalable infrastructure and intelligent systems: someone who can turn fragmented data into adaptive, self-serve pipelines that power both human analysts and autonomous agents.

Responsibilities:

  • Design and maintain agentic data pipelines that adapt dynamically to new sources, schemas, and AI-driven tasks
  • Build self-serve data systems that allow teams to explore, transform, and analyze data with minimal engineering effort
  • Develop modular, event-based pipelines across AWS environments, combining cloud flexibility with custom open frameworks
  • Automate ingestion, enrichment, and fusion of cybersecurity data, including logs, configs, and CTI (cyber threat intelligence) streams
  • Collaborate closely with AI engineers and researchers to operationalize LLM and agent pipelines within the CLM ecosystem
  • Implement observability, lineage, and data validation to ensure reliability and traceability
  • Scale systems to handle complex, high-volume data while maintaining adaptability and performance
  • Own the data layer end-to-end including architecture, documentation, and governance

Skills:

  • 3+ years of experience as a Data Engineer or Backend Data Developer
  • Strong experience with Python, SQL, and modern data frameworks such as Airflow, Spark, or dbt
  • Practical understanding of LLM pipelines or agent orchestration frameworks (LangGraph, LlamaIndex, or similar)
  • Familiarity with various database systems such as Postgres, MongoDB, Elasticsearch, and vector databases
  • Experience building scalable data systems in AWS (EC2, S3, EKS)
  • Solid grasp of data modeling, schema evolution, and pipeline observability
  • Experience working closely with AI and ML teams
  • Strong problem-solving mindset, curiosity, and the ability to move fast while keeping systems clean and reliable