Bringing together experts in deep learning, e-commerce, and digitization, Fetcherr disrupts traditional systems with its cutting-edge AI technology. At its core is the Large Market Model (LMM), an adaptable AI engine that forecasts demand and market trends with precision, empowering real-time decision-making. Specializing initially in the airline industry, Fetcherr aims to revolutionize industries with dynamic AI-driven solutions.
Fetcherr is looking for a highly skilled and experienced Senior Python Developer to spearhead the development of robust infrastructure and services that power our Large Language Model (LLM) initiatives. You will be instrumental in building the scalable, reliable, and efficient systems that enable our LLM engineers to develop, deploy, and manage cutting-edge AI applications. This role requires a deep understanding of Python, cloud technologies, and a passion for building foundational systems that support advanced AI. If you are a seasoned developer with a proven track record in backend development and infrastructure, and you're excited about enabling the future of AI at Fetcherr, we want to hear from you.
Responsibilities:
- Design, build, and maintain scalable, high-performance backend services and infrastructure for Fetcherr's LLM ecosystem.
- Develop and manage APIs and microservices that facilitate the interaction between LLMs, data pipelines, and end-user applications.
- Implement and optimize data pipelines for ingesting, processing, and serving data relevant to LLM training and inference.
- Ensure the reliability, security, and efficiency of LLM deployment environments through robust infrastructure management.
- Collaborate with LLM Engineers and Data Scientists to understand their infrastructure needs and provide tailored solutions.
- Establish and maintain best practices for Python development, including coding standards, testing (unit, integration, E2E), reproducibility, and version control for infrastructure code.
- Leverage cloud platforms (e.g., GCP) to build and manage scalable infrastructure, including compute, storage, and networking resources.
- Implement monitoring, logging, and alerting systems to ensure the health and performance of LLM services and infrastructure.
- Contribute to architectural decisions and the strategic direction of Fetcherr's AI infrastructure.
- Stay abreast of industry trends and best practices in Python development, cloud computing, and MLOps.
Requirements:
You'll be a great fit if you have...
- 4+ years of professional experience in backend software development, with a strong emphasis on Python.
- Proven experience in building and managing production systems and scalable infrastructure.
- Strong understanding of building applications that scale and operate reliably in a cloud environment.
- Expertise in designing and implementing APIs and microservices.
- Solid experience with cloud platforms, particularly GCP (preferred), including services for compute, storage, networking, and managed databases.
- Demonstrated experience with good coding practices for testing, reproducibility, and version control (e.g., Git).
- Familiarity with containerization technologies such as Docker and orchestration platforms like Kubernetes.
- Experience with CI/CD pipelines and tools for automated testing and deployment.
- Proficiency in database technologies (SQL and NoSQL).
- Excellent problem-solving, analytical, and debugging skills.
- Strong communication and collaboration skills, with the ability to articulate technical concepts effectively.
Nice to have:
- Hands-on experience with MLOps practices and tools (e.g., Kubeflow, MLflow, TensorFlow Extended).
- Experience with serving LLMs, including model optimization techniques and efficient inference.
- Knowledge of LLM frameworks and libraries (e.g., LangChain, Transformers).
- Experience with data streaming technologies (e.g., Kafka, Pub/Sub).
- Understanding of security best practices for cloud infrastructure and applications.
- Familiarity with infrastructure-as-code tools (e.g., Terraform, CloudFormation).