Senior Data Engineer

Overview
Skills
  • Python ꞏ 3y
  • SQL
  • Shell
  • Linux
  • AWS
  • Docker
  • Kubernetes
  • Airflow
  • ETL
  • Glue
  • S3
  • Athena
  • EMR
  • Apache Spark

Bigabid is an innovative technology company led by data scientists and engineers devoted to mobile app growth. Our proprietary ad platforms, powered by machine learning, are the result of that devotion.

We deliver valuable results and insights for a fast-growing clientele of major app developers using elite programmatic user acquisition and retargeting technologies. Our ever-evolving, state-of-the-art machine learning technology analyzes tens of TB of raw data per day to produce millions of ad recommendations in real time. 

As a Senior Data Engineer, you will be a key contributor to building and scaling Bigabid’s petabyte-scale data platform. You will design and maintain high-performance data pipelines that process tens of terabytes of data daily using Apache Spark and other modern big data technologies.

We are looking for engineers with proven experience working with large-scale datasets and distributed systems, who are passionate about data quality, performance, and system reliability.

Responsibilities:

  • Design and implement scalable, fault-tolerant ETL pipelines using Apache Spark for high-throughput, real-time and batch data processing.
  • Develop and manage CI/CD pipelines, testing strategies, and data quality frameworks to ensure robust data workflows.
  • Collaborate with data scientists, analysts, and product teams to build data models, maintain data lineage, and surface insights that drive business value.
  • Evaluate and integrate new technologies to enhance performance, scalability, and cost-efficiency in our data ecosystem.
  • Own and evolve critical components of our data infrastructure, with a deep understanding of both technical architecture and business context.

Requirements:

  • 6+ years of hands-on experience as a Data Engineer or Backend Engineer, with a strong focus on data-intensive systems.
  • Mandatory: Proven, production-grade experience with Apache Spark at scale (tens of terabytes daily or more).
  • 3+ years of experience with Python.
  • Experience with cloud-native architectures, especially AWS (e.g., S3, EMR, Athena, Glue).
  • Expertise in designing and maintaining ETL pipelines using orchestration tools like Airflow (or similar).
  • Strong analytical skills and proficiency in SQL.
  • Experience working in distributed environments and optimizing performance for large-scale data workloads.
  • Ability to lead initiatives and work independently in a fast-paced environment.

Nice to Have:

  • Tech lead, project management, or team leadership background.
  • Familiarity with Linux-based systems and shell scripting.
  • Knowledge of containerization technologies like Docker and/or Kubernetes.

Excerpt:

Join a new team responsible for managing huge amounts of data at petabyte scale, building and operating enterprise-grade systems, and maintaining flows that process ~50 TB of data every day and growing.