DevJobs

Site Reliability Engineer

Overview
Skills
  • Python
  • Go
  • Bash
  • ML
  • Linux
  • CI/CD
  • AWS
  • Kubernetes
  • Terraform
  • Grafana
  • Datadog
  • Prometheus
  • EC2
  • EKS
  • Predictive Analytics
  • RDS
  • S3

We are looking for a Site Reliability Engineer (SRE) to join our Engineering team: someone with a passion for observability, monitoring, automation, and high-availability systems, and a desire to solve complex technological challenges with a proactive approach to continuous improvement.

We use an interesting and mixed technology stack: Kubernetes, Terraform, CI/CD pipelines, Datadog, Prometheus, and cloud-native architectures.

In this position, you will use your expertise in building and scaling SRE operations, and will design, implement, and operate a world-class reliability strategy.



About Us

Check Point is a key player in the network security field, striving to provide the leading SASE platform in the market. Our innovative approach, merging cloud and on-device protection, redefines how businesses connect in the era of cloud and remote work.

Major Responsibilities

  • Develop and maintain our monitoring, alerting, and logging systems, ensuring high visibility into production environments.
  • Implement automation to improve system reliability, scalability, and efficiency.
  • Troubleshoot and resolve production incidents, leading root cause analyses and implementing permanent fixes.
  • Collaborate with software engineers and DevOps teams to enhance application performance and resilience.
  • Continuously improve operational processes, focusing on reducing toil and improving reliability.

Desired Background

  • 3+ years of experience as an SRE, DevOps Engineer, or in a similar role.
  • Hands-on experience with monitoring and observability tools like Datadog, Prometheus, and Grafana.
  • Strong understanding of Linux systems, networking, and cloud-native architectures.
  • Experience with Kubernetes, Terraform, and CI/CD pipelines.
  • A problem solver, capable of finding creative solutions and getting things done.
  • Fluent with incident management, RCA processes, and operational best practices.

It would be great if you also have:

  • Experience in high-scale distributed systems.
  • Background in security and compliance for cloud infrastructure.
  • Familiarity with AWS (EKS, EC2, RDS, S3, networking configurations).
  • Proficiency in Python, Go, or Bash for automation and scripting.
  • Understanding of cost optimization and resource management in cloud environments.
  • Familiarity with machine learning or predictive analytics for proactive reliability management.

Check Point Software Technologies