Description
We are monday.com, a global software company transforming how businesses operate. Our versatile product suite serves diverse industries and use cases, empowering approximately 245,000 customers worldwide to reimagine work processes, enhance efficiency, and scale operations. With over 2,500 employees globally, we prioritize transparency, knowledge sharing, and the impact of your contributions over the hours you clock. We support our team with flexible work arrangements, wellness and mental health resources, and a collaborative work environment.
We are seeking a Senior DevOps Engineer to join our AI R&D group within the platform, where you will build new infrastructure features and directly impact the future of our product. You will take part in implementing generative AI capabilities within the monday.com AI platform, including defining the strategy for using AI models, creating the foundation for cross-team implementation, and productizing it as part of our platform offering.
The AI R&D group operates as a startup entity within our company. With a group of over 30 individuals spanning product development, developers, data scientists, design, analysts, marketing, and more, we possess all the necessary resources to build an exceptional organization and drive rapid growth for the company.
This is a unique opportunity to join a disruptive product group in its hyper-growth stage. The product is rapidly expanding and reaching an increasing number of users each month, and we are seeking a seasoned engineer to tackle the scale challenges associated with our high-growth and high-throughput framework.
About The Role
- Work closely with platform and product teams, taking full responsibility and ownership from conception to post-deployment.
 - Work closely with the core infrastructure team, leveraging monday.com infrastructure standards and best practices.
 - Be the owner of our AI core infrastructure across multiple AI providers and self-hosted models.
 - Be the team's go-to person and authority on everything infrastructure-related.
 - Take full ownership of our platform infrastructure, building 0-to-1 processes to handle a scale of billions of actions per month, with the support and resources of the monday ecosystem.
 - Our Stack - AWS, GCP, Azure, React.js, Redux, TypeScript, Node.js, Elasticsearch, Redis, MySQL, Docker, Cassandra
 
Requirements
- Strong technical skills, good system and infrastructure understanding
 - Familiarity with the ML/AI development cycle (experimentation, training, and inference) - an advantage
 - Experience with building infrastructure for LLM applications, including agentic products in production at scale
 - Experience building the full application release cycle (CI/CD).
 - Familiar with how modern web applications work and scale.
 - Deploying and maintaining self-hosted models with dedicated GPU infrastructure - an advantage
 - Knowledge of networking, firewall rule management, and application security.
 - Ability to see the bigger picture and carry out system architecture planning.
 - Understanding of product and a passion for building software that impacts millions of users.
 - Team player, egoless, strong communication skills, and empathetic.