
Senior Site Reliability Engineer

Skills
  • Go
  • Python
  • Kafka
  • Redis
  • MongoDB
  • MySQL
  • Linux
  • Microservices
  • GitHub Actions
  • Jenkins
  • AWS
  • Kubernetes
  • Networking
  • Splunk
  • RabbitMQ
  • New Relic
  • Lambda
  • ElastiCache
  • S3
  • EKS

Our Campus Engineering team is hiring a Senior Site Reliability Engineer for our Tel Aviv office.


The Campus and On-Site portion of the Grubhub application simplifies the dining experience for students on campuses across the US. You'll contribute to architecting resilient, self-healing solutions. Your role spans the full development lifecycle: managing AWS infrastructure, closing observability gaps, designing scaling approaches, shaping incident management processes, and building and maintaining CI/CD pipelines. Collaboration with other SRE teams is critical for guidance, knowledge sharing, and building camaraderie.


The Day to Day:

As an SRE in the "Runtime Engineering" org, you'll co-own the design of critical production services and ensure they remain highly reliable. You'll drive improvements in reliability and observability using SLOs and telemetry data. Your responsibilities include building and enhancing internal tools and automation software to maintain production services effectively and safely. You'll also lead reliability-focused practices such as Failure Analysis, Load and Capacity Planning, Service Reviews, Architecture Designs, and Incident Postmortems.

As a senior engineer, you will also mentor junior engineers.


What You Bring To the Table:

  • 5+ years of experience in Site Reliability Engineering.
  • Deep knowledge of CI/CD tools such as Jenkins, GitHub Actions, or similar.
  • Software engineering experience in Python, Go, or a similar language.
  • Microservice architecture and application design experience.
  • Distributed monitoring experience: SLOs, metrics, tracing, etc.
  • Working knowledge of Kubernetes-based software solutions and their ecosystem.
  • Working knowledge of cloud technologies (AWS, compute/containers, storage, Linux, networking, etc.).
  • Technical writing, documentation, and communication skills.
  • Experience with highly trafficked web-based services.


About Our Tech:

The On-Site tech stack includes Python and Go for tooling/automation and service code. We use New Relic and Splunk for monitoring, and our cloud technologies include AWS services such as EKS, S3, ElastiCache, and Lambda. Data technologies include MongoDB, MySQL (RDS), Redis (ElastiCache), RabbitMQ, and Kafka.


This is a full-time position with Grubhub in a hybrid role, with 2 days a week in the office (office location: 121 Menachem Begin Street).


Perks:

  • 20 days of paid time off per year
  • Private health insurance at company's expense
  • Food stipend when in office
  • Hybrid work: 2 days a week in the office

Grubhub