DevJobs

SRE (Site Reliability Engineer)

Overview
Skills
  • Bash ꞏ 4y
  • Python ꞏ 4y
  • Git ꞏ 4y
  • Jenkins ꞏ 4y
  • AWS ꞏ 4y
  • Azure
  • GCP
  • Kubernetes ꞏ 4y
  • Grafana ꞏ 4y
  • Splunk ꞏ 4y
  • Ansible
  • Terraform
  • Linux operating systems ꞏ 4y
  • Observability tools ꞏ 4y
  • Prometheus ꞏ 4y
  • Datadog ꞏ 4y
  • Containers ꞏ 4y
  • Virtualization ꞏ 4y
  • GitHub Actions
Description

eToro is seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our dynamic team. As an SRE, you will ensure that our infrastructure and applications are reliable, scalable, and performant. You will collaborate closely with cross-functional teams to design, build, and maintain resilient systems that meet the needs of our customers and business stakeholders.

Responsibilities:

  • Collaborate with R&D engineers on the coordination, communication, and execution of production-related operations.
  • Design, implement, and maintain scalable and reliable infrastructure solutions to support our applications and services.
  • Develop and deploy monitoring, alerting, and logging systems to proactively identify and mitigate operational issues.
  • Build an SRE dashboard with KPIs to measure eToro’s application reliability.
  • Conduct capacity planning and performance tuning to optimize system performance and resource utilization for improved user experience.
  • Automate repetitive tasks and processes to streamline operations and improve efficiency.
  • Participate in incident response and resolution, including root cause analysis and post-mortem reviews.
  • Continuously evaluate and adopt new technologies and methodologies to enhance our infrastructure and operations.
  • Create and maintain documentation, runbooks, and knowledge base articles covering system configurations, procedures, and best practices.

Requirements:

  • 4+ years of experience as a DevOps/SRE/Integration engineer, with a passion for technology and strong motivation to build highly reliable solutions.
  • In-depth knowledge of observability tools (Prometheus, Splunk, Datadog, Grafana).
  • Experience with Git, Jenkins, GitHub Actions (preferred), virtualization, containers, and Kubernetes.
  • Cloud providers: AWS / Azure (preferred) / GCP.
  • Excellent understanding of Linux operating systems and scripting languages (Python, Bash).
  • Strong communication skills, both verbal and written, with the ability to adapt the messaging to different perspectives (technical, business) and levels of detail.
  • Ability to grasp new technologies quickly and to prioritize and manage multiple responsibilities at once.
  • Excellent problem-solving skills and the ability to work effectively in a fast-paced, dynamic environment.
  • Experience with Ansible and Terraform is an advantage.