LSports is a leading global provider of sports data, dedicated to revolutionizing the industry through innovative solutions. We excel in sports data collection and analysis, advanced data management, and cutting-edge services like AI-based sports tips and high-quality sports visualization. As the sports data industry continues to grow, LSports remains at the forefront, delivering real-time solutions.
If you share our love of sports and tech and have the passion and drive to advance the sports-tech and data industries - join the team!
Responsibilities:
- Build the foundations of LSports’ modern data architecture, supporting real-time, high-scale (Big Data) sports data pipelines and ML/AI use cases, including Generative AI.
- Map the company’s data needs and lead the selection and implementation of key technologies across the stack: data lakes (e.g., Iceberg), databases, ETL/ELT tools, orchestrators, data quality and observability frameworks, and statistical/ML tools.
- Design and build a cloud-native, cost-efficient, and scalable data infrastructure from scratch, capable of supporting rapid growth, high concurrency, and low-latency SLAs (e.g., 1-second delivery).
- Lead design reviews and provide architectural guidance for all data solutions, including data engineering, analytics, and ML/data science workflows.
- Set high standards for data quality, integrity, and observability. Design and implement processes and tools to monitor and proactively address issues like missing events, data delays, or integrity failures.
- Collaborate cross-functionally with other architects, R&D, product, and innovation teams to ensure alignment between infrastructure, product goals, and real-world constraints.
- Mentor engineers and promote best practices around data modeling, storage, streaming, and observability.
- Stay up-to-date with industry trends, evaluate emerging data technologies, and lead POCs to assess new tools and frameworks — especially in the domains of Big Data architecture, ML infrastructure, and Generative AI platforms.
Requirements:
- At least 10 years of experience in a data engineering role, including 2+ years as a data architect with ownership over company-wide architecture decisions.
- Proven experience designing and implementing large-scale, Big Data infrastructure from scratch in a cloud-native environment (GCP preferred).
- Excellent proficiency in data modeling, including conceptual, logical, and physical modeling for both analytical and real-time use cases.
- Strong hands-on experience with:
  - Data lake and/or warehouse technologies (e.g., Delta Lake, BigQuery, ClickHouse); hands-on Apache Iceberg experience is required
  - ETL/ELT frameworks and orchestrators (e.g., Airflow, dbt, Dagster)
  - Real-time streaming technologies (e.g., Kafka, Pub/Sub)
  - Data observability and quality monitoring solutions
- Excellent proficiency in SQL and in either Python or JavaScript.
- Experience designing efficient data extraction and ingestion processes from multiple sources and handling large-scale, high-volume datasets.
- Demonstrated ability to build and maintain infrastructure optimized for performance, uptime, and cost, with awareness of AI/ML infrastructure requirements.
- Experience working with ML pipelines and AI-enabled data workflows, including support for Generative AI initiatives (e.g., content generation, vector search, model training pipelines) — or strong motivation to learn and lead in this space.
- Excellent communication skills in English, with the ability to clearly document and explain architectural decisions to technical and non-technical audiences.
- Fast learner with strong multitasking abilities; capable of managing several cross-functional initiatives simultaneously.
Advantage:
- Experience leading POCs and tool selection processes.
- Familiarity with Databricks, LLM pipelines, or vector databases.