Glassbox is seeking a DevOps Engineer to join our global DevOps team.
We are Glassbox, and our mission is to reveal the insights that empower organizations to deliver exceptional digital customer experiences.
We are growing and have been recognized by G2 as one of 2024's Top 50 Software Companies in the world.
Our customers are the best of the best and include six of the ten largest global banks, the world's largest hotel chain, and the largest healthcare and telecommunications companies in the U.S.
Now is the perfect time to come to Glassbox and help us accelerate our global leadership position!
If you are a dynamic, successful, experienced metrics-driven leader, Glassbox might be a great fit.
Will you join us on this journey?
What will you do?
You will work with cutting-edge technologies and tackle a wide range of challenges in building a scalable, globally distributed infrastructure.
In this role, you will:
- Work with a diverse set of technologies, simplifying complex solutions.
- Collaborate closely with multiple teams across the organization.
- Be part of a team responsible for designing, optimizing, and maintaining a complex, high-scale production environment that handles massive traffic loads.
- Architect, deploy, and maintain robust and scalable cloud infrastructure on AWS and Azure.
- Develop and optimize CI/CD pipelines to support automated deployment, testing, and scaling across multiple environments.
- Implement and manage monitoring, logging, and alerting solutions across cloud platforms to ensure application health and performance.
- Provide advanced troubleshooting and resolution for infrastructure issues in production, development, and testing environments.
This is a unique opportunity to work with state-of-the-art technologies while contributing to the scalability, performance, and reliability of a mission-critical system.
What will you need?
- 4+ years of experience in a DevOps or related engineering role, with a strong background in AWS and cloud-native environments. Proven ability to design, manage, and maintain high-scale production systems, ensuring reliability, performance, and scalability.
- Deep expertise in cloud technologies and application security, with a strong focus on best practices for securing resilient, scalable, and cost-efficient architectures.
- Extensive experience with containerization and orchestration technologies, including Docker, Kubernetes, and Helm.
- Proficiency in automation tools like Terraform and hands-on experience with CI/CD pipelines using tools like Jenkins.
- Strong knowledge of monitoring and logging tools (e.g., Prometheus, Grafana, Loki) to ensure system health, optimize performance, and proactively detect issues.
- Proficiency in scripting (e.g., Node.js, Bash) to automate workflows and enhance operational efficiency.
- Excellent communication, collaboration, and documentation skills, with a proactive approach to problem-solving.
- A passion for learning new technologies and tackling complex challenges in a fast-paced environment.
An Advantage
- Proficiency in database technologies, including Cassandra, Elasticsearch, ClickHouse, and PostgreSQL HA.
- Expertise in Kafka cluster administration, including cross-region replication and high-availability configurations.
- Experience with MLOps workflows and infrastructure, including model deployment, monitoring, and scaling in production environments.
- Hands-on experience managing big data infrastructure, optimizing performance and scalability for data-intensive applications.