About Harmonya:
Retailers and CPG brands rely on product data to make critical decisions, but outdated systems limit what’s possible. Harmonya changes that.
Our AI-powered solutions transform fragmented, inconsistent product data into a dynamic, structured, and enriched source of truth. By analyzing trillions of alternative data points, we help leading CPGs and retailers, including Coca-Cola, Nestle, and PepsiCo, gain deeper insights, improve product discovery, and make smarter, faster decisions.
Founded in 2021, Harmonya is backed by investors including Bright Pixel, Team8, Susa Ventures, and J Ventures.
________________________________________________
We're seeking talented data engineers to join our rapidly growing team, which includes senior software and data engineers. Together, we drive our data platform from acquisition and processing to enrichment, delivering valuable business insights. Join us in designing and maintaining robust data pipelines, making an impact in our collaborative and innovative workplace.
Responsibilities
- Design, implement, and optimize scalable data pipelines for efficient processing and analysis.
- Build and maintain robust data acquisition systems to collect, process, and store data from diverse sources.
- Collaborate with DevOps, Data Science, and Product teams to understand needs and deliver tailored data solutions.
- Monitor data pipelines and production environments proactively to detect and resolve issues promptly.
- Apply best practices for data security, integrity, and performance across all systems.
Requirements
- 4+ years of experience in data or backend engineering, with strong proficiency in Python for data tasks.
- Proven track record in designing, developing, and deploying complex data applications.
- Hands-on experience with orchestration and processing tools such as Apache Airflow and Apache Spark.
- Bachelor’s degree in Computer Science, Information Technology, or a related field — or equivalent practical experience.
- Ability to perform under pressure and make strategic prioritization decisions in fast-paced environments.
- Excellent communication skills and strong teamwork, with the ability to work cross-functionally.
Advantages
- Familiarity with data science tools and libraries (e.g., pandas, scikit-learn).
- Experience with public cloud platforms (preferably GCP) and cloud-native data services.
- Experience working with Docker and Kubernetes.
- Hands-on experience with CI tools such as GitHub Actions.