DevJobs

Senior Data Engineer

Skills
  • Python
  • Bash
  • SQL
  • Spark
  • NoSQL
  • Neo4j
  • MongoDB
  • Linux
  • GitHub
  • Jenkins
  • AWS
  • GCP
  • Azure
  • Docker
  • Airflow
  • Terraform
  • Ansible
  • NiFi
  • Vector DBs
  • Redshift
  • Lambda
  • EKS
  • ECS
We are seeking an experienced Data Engineer to join our Platform and DevOps Engineering group. In this role, you will be pivotal in developing and maintaining Dream’s data lake, the backbone for data analytics, AI model training, and other key components of our proprietary technology. This position is crucial, as it involves handling vast amounts of sensitive data and ensuring its availability, accuracy, and security.

We are looking for an experienced Data Engineer who is passionate about building scalable data infrastructures and has a keen interest in leveraging data to drive significant business impact. The job involves creating robust data solutions that support both batch and real-time data processing in cloud and on-prem environments.

Responsibilities:

  • Architect, build, and maintain a secure and scalable data lake that integrates smoothly with our data pipelines.
  • Design and implement processes for data modeling, mining, and production.
  • Work closely with data scientists and cyber researchers to ensure seamless data availability for AI model training.
  • Develop and optimize data ingestion, storage, and retrieval processes to meet the needs of our high-throughput AI platforms.
  • Build tools for data validation, cleansing, and automation to enhance data integrity and efficiency.
  • Troubleshoot and resolve issues in our development, production, and testing environments related to data access and quality.
  • Demonstrate excellent communication and interpersonal skills, working effectively as part of a dynamic team.
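As a concrete illustration of the validation and cleansing work described above, here is a minimal sketch in Python. The field names and rules are illustrative assumptions, not part of Dream's actual pipeline:

```python
# Hypothetical record-validation step; required field names are assumptions.

def validate_record(record, required_fields=("id", "timestamp", "payload")):
    """Return a list of problems found in a raw record; empty means valid."""
    problems = []
    for field in required_fields:
        if field not in record or record[field] in (None, ""):
            problems.append(f"missing or empty field: {field}")
    return problems

def cleanse(records):
    """Drop invalid records and strip whitespace from string values."""
    clean = []
    for rec in records:
        if validate_record(rec):
            continue  # a real pipeline would route these to a dead-letter store
        clean.append({k: v.strip() if isinstance(v, str) else v
                      for k, v in rec.items()})
    return clean
```

In practice a step like this would run inside an orchestrated workflow (e.g. an Airflow task) rather than as a standalone function.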

Skills:

  • Proven ability to work collaboratively in a team environment.
  • 4-5 years of experience in data engineering, particularly with data lakes and large-scale data platforms.
  • Proficiency with AWS cloud services, especially those related to data storage and processing.
  • Expertise in big data technologies such as Spark.
  • Strong knowledge of SQL/NoSQL databases, data warehouse solutions, and data management tools.
  • Solid background in data pipeline and workflow management tools like Airflow and NiFi.
  • Skilled in Linux environments, with scripting experience in Python or Bash.
  • Familiarity with CI/CD pipelines and version control systems such as GitHub.

Advantages:

  • Experience building data lake solutions and performing data scraping.
  • Experience in setting up real-time data feeds and stream-processing systems.
  • Exposure to machine learning and understanding of data science workflows.
  • Experience with vector databases.
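To illustrate the stream-processing experience mentioned above, here is a toy sketch of windowed aggregation over a real-time event feed. It is pure Python for clarity; a production system would use a framework such as Spark Structured Streaming, and the event shape is an assumption:

```python
from collections import defaultdict

def windowed_counts(events, window_seconds=60):
    """Aggregate (epoch_seconds, key) events into fixed-size time windows.

    A stand-in for the kind of tumbling-window aggregation a stream
    processor performs; returns {(window_start, key): count}.
    """
    counts = defaultdict(int)
    for ts, key in events:
        window_start = ts - (ts % window_seconds)  # align to window boundary
        counts[(window_start, key)] += 1
    return dict(counts)
```

A real stream processor adds the parts this sketch omits: out-of-order handling via watermarks, state checkpointing, and exactly-once sinks.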

Tech stack:

  • AWS, Google Cloud, Azure, Spark, Airflow, NiFi, Vector DBs, Jenkins, Docker, EKS, Lambda, ECS, Redshift, Terraform, Ansible, MongoDB, Neo4j, Python, Bash, GitHub, and more.
Dream Security