DevJobs

Site Reliability Engineer

Skills
  • C#
  • Python
  • Linux
  • Windows
  • AWS
  • Azure
  • Docker
  • Ansible
  • Terraform
  • AKS
  • EKS
ControlUp optimizes the digital experience, captured through real-time observation, to deliver best-in-class employee productivity.

We are looking for a Site Reliability Engineer to join our team. The candidate will help drive the next generation of ControlUp innovation using cutting-edge technology. This role is an excellent opportunity to learn and develop new technologies and methodologies.

Responsibilities:

  • Develop automation scripts and tools to streamline operational processes
  • Implement and maintain infrastructure practices
  • Design and implement monitoring solutions to identify and address potential issues proactively
  • Participate in on-call rotations to respond to incidents and troubleshoot system outages
  • Conduct performance analysis and optimize system performance
  • Identify bottlenecks and implement solutions to improve overall system efficiency
  • Collaborate with teams to plan and forecast resource requirements
  • Ensure the scalability of systems to accommodate growing user demands
  • Promote and implement SRE best practices within the organization
  • Drive initiatives to improve system reliability and reduce operational overhead
  • Conduct post-incident reviews to analyse root causes and prevent future incidents
  • Document and share findings with relevant teams to improve overall system reliability
  • Stay informed about industry trends, emerging technologies, and best practices
  • Actively participate in knowledge-sharing activities within the SRE community


Requirements:

  • Proven experience in a Site Reliability Engineering or related role
  • Familiarity with Windows and Linux
  • Programming knowledge in languages such as C# or Python
  • Excellent troubleshooting and problem-solving skills
  • Strong communication and collaboration skills and a “can-do” attitude


Advantages:

  • Experience with containerization and orchestration tools (EKS, AKS, Docker for local machines)
  • Familiarity with cloud platforms such as AWS or Azure
  • Familiarity with infrastructure automation tools (Terraform, Ansible)