DevJobs

Senior Big Data Engineer

Skills
  • Python ꞏ 5y
  • SQL
  • Scala
  • Spark
  • Redis
  • AWS
  • GCP
  • Kubernetes
  • Grafana
  • PySpark
  • Aerospike
  • Bigtable
  • Dask
  • Kibana
  • OCI

Start.io is a mobile marketing and audience platform. Start.io empowers the mobile app ecosystem and simplifies mobile marketing, audience building, and mobile monetization. Start.io's direct integration with over 500,000 monthly active mobile apps provides access to unprecedented levels of global first-party data, which can be leveraged to understand and predict behaviors, identify new opportunities, and fuel growth.

We’re looking for a highly technical, independent, and visionary Big Data Engineer to take ownership of our next-generation distributed training pipelines and infrastructure. This is a hands-on, high-impact role at the core of our algorithmic decision-making systems, shaping how models are trained and deployed at scale across billions of data points in real-time AdTech environments.

You’ll be responsible for designing and building scalable ML systems from the ground up, from data ingestion through model training to evaluation. You’ll work closely with Algo researchers, data engineers, and production teams to drive innovation and performance improvements throughout the lifecycle.


What will you do?

  • Design and implement large-scale, distributed ML training pipelines
  • Build scalable infrastructure for data preprocessing, feature engineering, and model evaluation
  • Lead the technical design and development of new ML systems: from architecture to production
  • Collaborate cross-functionally with DS, infra, Product, BA, and Engineering teams to define and deliver impactful solutions
  • Own the full lifecycle of ML infra: tooling, versioning, monitoring, automation, results measurement, and rapid response to critical issues
  • Continuously research and adopt best-in-class practices in MLOps, performance tuning, and distributed systems


Requirements:

  • B.Sc. or M.Sc. in Computer Science, Software Engineering, or an equivalent field
  • 5+ years of hands-on experience in backend or ML engineering
  • Strong Python skills and experience working with distributed systems and parallel data processing frameworks such as Spark (using PySpark or Scala), Dask, or similar technologies. Familiarity with Scala is a strong advantage, especially in performance-critical environments.
  • Proven track record in designing and scaling ML infrastructure
  • Deep understanding of ML workflows and lifecycle management
  • Experience in cloud environments (AWS, GCP, OCI) and containerized deployment (Kubernetes)
  • Understanding of databases and SQL for data retrieval
  • Strong communication skills and ability to drive initiatives independently
  • A passion for clean code, elegant architecture, and measurable impact
  • Experience with monitoring and alerting tools (e.g. Grafana, Kibana)
  • Experience working with in-memory and NoSQL databases (e.g. Aerospike, Redis, Bigtable) to support ultra-fast data access in production-grade ML services


Why join us?

  • Be the technical lead on a mission-critical domain, shaping how billions of decisions are made daily
  • Join a top-tier Algo team in one of the most data-intensive industries
  • Work on systems with massive scale and real-time complexity
  • Enjoy a flat, fast-moving culture with freedom to lead and execute