Overview
Come join the LLM Dev and Evaluation team as a Staff Data Scientist. We are building the Intuit Foundational LLM as part of a proprietary Generative AI operating system (GenOS) platform.
What you'll bring
What It Takes
- Strong NLP and LLM Knowledge: Experience with NLP techniques and LLM technologies over the last 1.5 years.
- Passion for Emerging AI Technologies: Demonstrated interest in cutting-edge advancements in NLP, LLMs, generative AI, machine learning, and deep learning, with a focus on staying ahead of the latest developments in transformer architectures, self-supervised learning, and model fine-tuning.
- Robust Technical Expertise in Data Science and LLMs: Strong foundational understanding of the data science principles underlying LLMs, not just model training, including tokenization, embeddings, pre-training and fine-tuning methods, data augmentation, and prompt engineering.
- Global Collaboration: Proven ability to collaborate with cross-functional teams and partners worldwide to deliver highly complex, LLM-focused projects that address unique business challenges.
- Adaptability and Independence: Maturity, quick learning abilities, and the flexibility to thrive in a fast-paced, innovation-driven environment, adapting to evolving LLM techniques and tools.
- Exceptional Communication Skills: Strong verbal and written communication skills, with the ability to lead discussions, conduct professional presentations, and explain LLM and AI concepts to non-technical stakeholders in a clear, accessible manner.
- Project and Stakeholder Management: Proven expertise in managing complex LLM projects, aligning with multiple stakeholders, and driving data-driven initiatives to completion on time and with impact.
Advantages
- We welcome people who can deliver end-to-end (E2E) AI projects, from inception to production. We primarily use Python in all stages of development.
- Fluent enough in SQL to get the data you need from a warehouse (Vertica, Hive, SparkSQL)
- Comfortable working in a Linux environment
- Experience with building end-to-end reusable pipelines from data acquisition to model output delivery
How you will lead
- You will leverage cutting-edge techniques in natural language processing (NLP) and large language models (LLMs) to work with diverse, multi-modal data types at massive scale, using proprietary Intuit data to unlock insights. Apply advanced model fine-tuning, transfer learning, prompt engineering, few-shot learning, and data augmentation methods to build both predictive and generative models, fueling innovation across Intuit products.
- You will apply deep LLM expertise and independent judgment to collaborate with cross-functional teams—data engineers, ML architects, product managers, and business analysts—to develop high-performance LLM pipelines. Design and execute research strategies for optimizing model architecture, prompt optimization, tokenizer customization, data curation, noise reduction, and hyper-parameter tuning to meet Intuit’s complex and large-scale data challenges.
- You will provide actionable, real-time guidance to stakeholders on using LLMs, embeddings, and vector databases to meet critical business needs. Advise teams on deploying LLM-driven insights for unique business cases, empowering them to make informed, data-driven decisions at scale and stay aligned with advancements in generative AI, reinforcement learning, and self-supervised learning.
- You will lead the end-to-end development of LLM workflows, encompassing hypothesis generation, model fine-tuning, data preprocessing, A/B testing, visualization, and interpretability methods. Foster a continuous feedback loop for model retraining, precision tuning, and seamless deployment, ensuring alignment with shifting data scales and complex multi-domain applications.
- You will empower business leaders with interpretability tools and a strategic understanding of LLM outputs. Provide essential entrepreneurial guidance on deploying actionable insights and scaling models to maximize ROI and drive impact. Help Intuit fully exploit cutting-edge AI capabilities, integrating emerging advancements in transformers, attention mechanisms, and multi-task learning.