
Data Platform / Automation Engineer

Overview
Skills
  • Python
  • Java
  • Scala
  • AWS
  • Kubernetes
  • Apache Spark
  • EC2
  • EKS
  • EMR
  • Iceberg
  • S3

DualBird is building a next-generation data acceleration platform that brings hardware-level performance to cloud infrastructure with software-level simplicity.


We’re looking for a Data Platform / Automation Engineer to build, operate, and automate realistic, end-to-end data pipelines that validate our framework under true customer-like conditions.


This role suits engineers with hands-on experience building and operating large-scale Spark pipelines, whether they come from data engineering, automation, or data-focused DevOps backgrounds.


The role

You will design and automate full-scale data pipelines deployed across multiple execution environments on AWS, including EMR, Kubernetes (EKS), and additional Spark deployment models. These pipelines are used to validate correctness, performance, and stability of DualBird’s acceleration framework using Iceberg-based data lakes and customer-like architectures.


You’ll work closely with high-level software engineers, low-level system engineers, and architects to improve the overall system, using real workload behavior as direct feedback into product design and architecture.


What you’ll do

  • Build and operate end-to-end data pipelines that mirror real customer architectures.
  • Automate Spark-based pipelines into repeatable system validation and benchmarking workflows.
  • Run and analyze large-scale Spark workloads, focusing on performance, scalability, and failure modes.
  • Debug issues across Spark, cloud infrastructure, and the acceleration layer to identify root causes.
  • Collaborate closely with high-level software, low-level system, and architecture teams to drive framework and system improvements.
  • Develop automation tools and frameworks (primarily in Python) for orchestrating and validating data workloads.
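To give a concrete flavor of the last point: a minimal, stdlib-only Python sketch of the kind of correctness check such validation tooling might perform, comparing the output of an accelerated pipeline run against a baseline run. All function and variable names here are illustrative, not part of any DualBird tooling, and a real harness would read results from S3/Iceberg tables rather than in-memory lists.

```python
import hashlib
from collections import Counter

def dataset_fingerprint(rows):
    """Order-insensitive fingerprint of a result set.

    Each row is hashed individually and the hashes are collected as a
    multiset, so two runs that produce the same rows in a different
    order compare equal.
    """
    return Counter(
        hashlib.sha256(repr(tuple(row)).encode()).hexdigest() for row in rows
    )

def validate_run(baseline_rows, candidate_rows):
    """Compare an accelerated run's output against a baseline run.

    Returns a list of human-readable discrepancies; an empty list means
    the runs agree on row count and row content.
    """
    issues = []
    if len(baseline_rows) != len(candidate_rows):
        issues.append(
            f"row count mismatch: {len(baseline_rows)} vs {len(candidate_rows)}"
        )
    if dataset_fingerprint(baseline_rows) != dataset_fingerprint(candidate_rows):
        issues.append("row content mismatch (fingerprints differ)")
    return issues

# Example: identical result sets in a different order validate cleanly.
baseline = [(1, "a"), (2, "b"), (3, "c")]
candidate = [(3, "c"), (1, "a"), (2, "b")]
print(validate_run(baseline, candidate))  # → []
```

In practice a check like this would be one step in a larger benchmarking workflow, run automatically after each pipeline execution across EMR and EKS environments.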


What you bring

  • 4–7+ years of experience as a Data Engineer, Data Platform Engineer, or Automation Engineer working with data pipelines.
  • Strong hands-on experience with Apache Spark and distributed data processing systems.
  • Experience building automated data workflows or system validation frameworks.
  • Solid programming skills in Python (Scala or Java is a plus).
  • Familiarity with AWS-based data environments (e.g., S3, EMR, EKS, EC2).
  • Ability to reason about system-level behavior, performance, and reliability.
  • Strong communication skills and comfort working across multiple engineering disciplines.


Dual Bird Technologies