SCP Events

IC Spring Seminar Series with Guest Speaker Omar Khattab

Abstract

It is now easy to build impressive demos with language models (LMs), but turning these into reliable systems currently requires brittle combinations of prompting, chaining, and finetuning LMs. I will present LM programming, a systematic way to address this by defining and improving three layers of the LM stack. I start with how to adapt LMs to search for information most effectively (ColBERT) and how to scale such search to billions of tokens (PLAID). I then discuss the right architectures and supervision strategies (e.g., ColBERT-QA, Baleen, Hindsight) for allowing LMs to retrieve and cite verifiable sources in their responses. This leads to DSPy, a programming model that introduces composable modules for building and automatically supervising controllable programs built with LMs. Even simple systems expressed in DSPy routinely outperform large standalone LMs and standard hand-crafted prompting pipelines, in some cases while using only small models. I highlight how ColBERT and DSPy have sparked applications at dozens of leading tech companies and academic labs. I then conclude by discussing how DSPy enables a new degree of research modularity around LMs, one that stands to allow open research to again lead the development of AI systems.

Bio

Omar is a fifth-year CS Ph.D. candidate at Stanford NLP, whose work spans Natural Language Processing (NLP), Information Retrieval (IR), and Machine Learning (ML) Systems. His research creates models, supervision strategies, and programming abstractions for building reliable, transparent, and scalable NLP systems. Omar is the author of the ColBERT retrieval model, which has helped shape the modern landscape of neural information retrieval. His lines of work on ColBERT and DSPy form the basis of influential open-source projects, exceeding 600,000 downloads per month, and have sparked applications at Google, Amazon, IBM, VMware, Databricks, Baidu, AliExpress, and numerous startups. Omar's Ph.D. has been supported by the Eltoukhy Family Graduate Fellowship and the Apple Scholars in AI/ML PhD Fellowship.