Senior Data Infrastructure Engineer

Skills
  • Python
  • Java
  • Go
  • Rust
  • Scala
  • Kafka
  • Flink
  • InfluxDB
  • Elasticsearch
  • Cassandra
  • DynamoDB
  • AWS
  • Azure
  • GCP
  • Docker
  • Kubernetes
  • Helm
  • Istio
  • Linkerd
  • Terraform
  • Grafana
  • Airflow
  • Hadoop
  • Amazon S3
  • Kinesis
  • Azure Data Lake
  • Pulsar
  • ClickHouse
  • Redpanda
  • Druid
  • TimescaleDB
  • GCS
  • Jaeger
  • Apache Atlas
  • Apache Hudi
  • OpenSearch
  • Apache Iceberg
  • OpenTelemetry
  • ArgoCD
  • ORC
  • Arrow
  • Parquet
  • Avro
  • Prometheus
  • Prefect
  • Pulumi
  • CloudFormation
  • Dagster
  • Ranger
  • DataHub
  • Delta Lake
  • ScyllaDB
  • Solr
  • Flux

About the company:

Cybereason is a cybersecurity company whose analytics platform protects organizations from evolving threats. That platform is built on data infrastructure that processes billions of security events daily, and this role is central to building and scaling it.


About the Role:

Cybereason is seeking a Senior Data Infrastructure Engineer to architect and scale the data backbone that powers our cutting-edge cybersecurity analytics. In this role, you’ll build distributed systems that process billions of security events daily and deliver real-time and historical threat intelligence across our platform. You’ll work at the intersection of big data, cloud-native engineering, and cybersecurity, ensuring our infrastructure can support advanced analytics and machine learning at scale.

Key Responsibilities:

  • Design and develop petabyte-scale data infrastructure and real-time streaming systems capable of processing billions of events daily
  • Build and optimize high-throughput, low-latency data pipelines for security telemetry (a minimal pipeline sketch follows this list)
  • Architect distributed systems using cloud-native technologies and microservices patterns
  • Design and maintain data lakes, time-series databases, and analytical stores optimized for security use cases
  • Implement robust data governance, quality, and monitoring frameworks across all data flows
  • Continuously optimize for performance, scalability, and cost-efficiency in large-scale data workloads
  • Collaborate with data science and security teams to enable advanced analytics and ML capabilities
  • Ensure data infrastructure complies with strict security, availability, and compliance requirements
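
Purely as an illustration of the kind of pipeline stage described above, here is a minimal sketch using the confluent-kafka Python client: it consumes raw telemetry, applies a toy enrichment, and republishes. The broker address, topic names, and enrichment rule are assumptions invented for this sketch, not details from the posting.

    import json
    from confluent_kafka import Consumer, Producer

    # Hypothetical topic names; a real deployment would define its own.
    SRC_TOPIC = "raw-telemetry"
    DST_TOPIC = "enriched-telemetry"

    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": "telemetry-enricher",
        "auto.offset.reset": "earliest",
        "enable.auto.commit": False,  # commit only after a successful produce
    })
    producer = Producer({"bootstrap.servers": "localhost:9092"})

    consumer.subscribe([SRC_TOPIC])
    try:
        while True:
            msg = consumer.poll(1.0)
            if msg is None or msg.error():
                continue
            event = json.loads(msg.value())
            # Toy enrichment rule, made up for this sketch.
            event["severity"] = "high" if event.get("score", 0) > 80 else "low"
            producer.produce(DST_TOPIC, key=msg.key(), value=json.dumps(event).encode())
            producer.poll(0)       # serve delivery callbacks
            consumer.commit(msg)   # commit the offset only after handing off
    finally:
        consumer.close()
        producer.flush()

Note that committing offsets after the produce gives at-least-once delivery (duplicates are possible on retry); true exactly-once between Kafka topics requires the client's transactional API.
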
Required Qualifications:

  • Bachelor’s degree in Computer Science, Engineering, or a related field
  • 7+ years of experience building and maintaining large-scale data infrastructure
  • Proven experience operating petabyte-scale systems processing billions of records per day
  • Expert-level proficiency with stream processing: Apache Flink, Kafka, Pulsar, Redpanda, Kinesis
  • Deep experience with analytical and time-series databases: ClickHouse, Druid, InfluxDB, TimescaleDB
  • Familiarity with distributed storage: Hadoop (HDFS), Amazon S3, GCS, Azure Data Lake
  • Strong programming skills in Rust, Go, Scala, Java, or Python for high-performance systems
  • Cloud expertise: AWS (EMR, Redshift, Kinesis), GCP (Dataflow, BigQuery, Pub/Sub), or Azure equivalents (a short Kinesis sketch follows this list)
  • Solid experience with Kubernetes, Docker, and Helm; familiarity with a service mesh such as Istio or Linkerd
  • Strong grasp of data lake/lakehouse architectures and modern data stack tools
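
As a hedged illustration of the managed streaming services named above, a small boto3 sketch that puts a telemetry record onto an assumed Kinesis stream; the stream name, region, and record shape are invented for this sketch, and AWS credentials must come from the environment.

    import json
    import boto3

    # "security-telemetry" is a hypothetical stream name for this sketch.
    kinesis = boto3.client("kinesis", region_name="us-east-1")

    def put_event(event: dict) -> None:
        # Partitioning by host spreads load across shards while keeping
        # each host's events ordered within its shard.
        kinesis.put_record(
            StreamName="security-telemetry",
            Data=json.dumps(event).encode("utf-8"),
            PartitionKey=event.get("host", "unknown"),
        )

    put_event({"host": "web-01", "action": "login_failed", "score": 87})
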
Preferred Qualifications:

  • Experience with Apache Iceberg, Delta Lake, or Apache Hudi
  • Familiarity with Airflow, Prefect, or Dagster for orchestration
  • Knowledge of search platforms: Elasticsearch, OpenSearch, or Solr
  • Experience with NoSQL stores: Cassandra, ScyllaDB, or DynamoDB
  • Familiarity with columnar formats: Parquet, ORC, Avro, Arrow (a small Parquet sketch follows this list)
  • Experience with observability stacks: Prometheus, Grafana, Jaeger, OpenTelemetry
  • Familiarity with Terraform, Pulumi, or CloudFormation for IaC
  • GitOps tools such as ArgoCD or Flux for automated deployments
  • Exposure to data mesh, data governance, and metadata tooling (Apache Atlas, Ranger, DataHub)
  • Background in cybersecurity, SIEM, or security analytics platforms
  • Familiarity with ML infrastructure and MLOps best practices
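
To ground the columnar-formats bullet, a small PyArrow sketch that writes and reads back a Parquet file; the column names and file path are made up for illustration.

    import pyarrow as pa
    import pyarrow.parquet as pq

    # Toy telemetry batch; columns are hypothetical.
    table = pa.table({
        "ts":    [1700000000, 1700000001, 1700000002],
        "host":  ["web-01", "web-02", "web-01"],
        "score": [12, 87, 45],
    })

    # Columnar layout plus compression is what makes scans over a few
    # columns of a very large dataset cheap.
    pq.write_table(table, "telemetry.parquet", compression="zstd")

    back = pq.read_table("telemetry.parquet", columns=["host", "score"])
    print(back.to_pydict())

Reading back only the columns a query needs is the core advantage of columnar formats over row-oriented ones.
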
Technical Skills and Knowledge:

  • Stream Processing: Real-time analytics, windowing, state management, exactly-once semantics (a toy windowing sketch follows this list)
  • Distributed Systems: Partitioning, consistency, high availability, failover, load balancing
  • Data Lakes & Lakehouses: Multi-zone design, schema evolution, metadata management
  • Cloud-Native Patterns: Microservices, event-driven design, auto-scaling, regional failover
  • Performance Tuning: Query optimization, resource allocation, caching, compression
  • Governance: Lineage tracking, anomaly detection, quality controls, regulatory compliance
  • Security: Encryption, zero-trust principles, access control, audit logs
  • Observability: Metrics, logs, distributed tracing, alerting
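
To make the windowing and state-management line concrete, a toy tumbling-window counter in plain Python. This is a conceptual sketch, not a Flink or Kafka Streams API example; real engines layer watermarks, checkpointed state, and exactly-once sinks on top of this idea.

    from collections import defaultdict

    WINDOW_SECONDS = 60

    def window_start(ts: int) -> int:
        # Align each event timestamp to the start of its tumbling window.
        return ts - (ts % WINDOW_SECONDS)

    # State: (window_start, key) -> count. A real engine checkpoints this
    # state and emits a window only once its watermark has passed.
    counts = defaultdict(int)

    events = [  # (timestamp, host) pairs, invented for this sketch
        (1700000005, "web-01"), (1700000042, "web-01"),
        (1700000061, "web-02"), (1700000075, "web-01"),
    ]
    for ts, host in events:
        counts[(window_start(ts), host)] += 1

    for (start, host), n in sorted(counts.items()):
        print(f"window {start}..{start + WINDOW_SECONDS}: {host} -> {n} events")
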
Key Competencies:

  • Proven track record of building and scaling high-volume, high-throughput data systems
  • Strong analytical and problem-solving skills in complex distributed environments
  • Excellent communication and collaboration across cross-functional teams
  • Self-driven, with the ability to manage multiple high-impact infrastructure initiatives
  • Passion for data architecture and staying ahead of emerging technology
  • Experience mentoring engineers and shaping technical direction

What We Offer:

  • Work on cutting-edge cybersecurity technology
  • Collaborative and innovative environment
  • Continuous learning opportunities
  • Competitive salary and benefits

#LI-Remote
