Where does this role fit in our vision?
Every role at our company is designed with a clear purpose: to integrate individual efforts into our shared success, each functioning as part of a collective brain. At Lusha, data is at the heart of everything we do. Our data engineers build and maintain the core products that power our business. The data teams own and manage the pipelines that generate Lusha's assets, with a strong focus on optimizing key company KPIs.
We leverage cutting-edge technologies to build and scale our data infrastructure. In this role, you will work closely with Data Scientists and Data Analysts every day, developing innovative data solutions that drive business impact.
What will you be responsible for?
- Solve Complex Business Problems with Scalable Data Solutions
  - Develop and implement robust, high-scale data pipelines to power Lusha’s core assets.
  - Apply cutting-edge technologies to tackle complex data challenges and enhance business operations.
- Collaborate with Business Stakeholders to Drive Impact
  - Work closely with Product, Data Science, and Analytics teams to define priorities and develop solutions that directly enhance Lusha’s core products and user experience.
- Build and Maintain a Scalable Data Infrastructure
  - Design and implement scalable, high-performance data infrastructure to support machine learning, analytics, and real-time data processing.
  - Continuously monitor and optimize data pipelines to ensure reliability, accuracy, and efficiency.
Requirements
Here’s what we need from you
- 3+ years of experience in designing and implementing server-side data solutions at scale.
- 4+ years of programming experience, preferably in Python and SQL, with a strong understanding of data structures and algorithms.
- Proven experience implementing algorithmic solutions, data mining, and analytical methodologies to optimize data processing and generate insights.
- Proficiency with workflow orchestration frameworks such as Airflow, and with container orchestration platforms such as Kubernetes and Docker Swarm, for seamless workflow automation and management.
- Experience with Data Lakes and Spark is a strong advantage, particularly for processing large-scale datasets.
- Familiarity with the AWS ecosystem (S3, Glue, EMR, Redshift) is nice to have.
- Knowledge of tools such as Kafka, Databricks, and Jenkins is a plus.
- Hands-on experience working with various database engines, including:
  - Relational databases (PostgreSQL, MySQL)
  - Document stores (MongoDB)
  - Key-value stores (Redis)
  - Search and columnar analytics engines (Elasticsearch, ClickHouse)
- AI-savvy: comfortable working with AI tools and staying ahead of emerging trends.