Research overview: Machine learning is undergoing a paradigm shift where we pretrain foundation models (general-purpose models learned from large unlabeled datasets) and then transfer these models to learn a wide range of tasks of interest. Examples of this paradigm include CLIP, SimCLR, and GPT-3.
I have built some of the first theoretical understanding of these methods, and transformed those theoretical results into improved practical algorithms. Our methods have led to state-of-the-art accuracies on ImageNet (the most popular machine learning benchmark) and in applications such as satellite remote sensing, wildlife conservation, and radiology.
Much of my work focuses on developing machine learning models that can be reliably deployed in the wild, addressing robustness to distribution shifts, uncertainty calibration, and safety and fairness.
To build a community around these topics, we are organizing an exciting new workshop at ICLR 2023 called ME-FoMo (Mathematical and Empirical Understanding of Foundation Models). Please submit!
(See all publications at this link)
Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution. Ananya Kumar, Aditi Raghunathan, Robbie Jones, Tengyu Ma, Percy Liang. International Conference on Learning Representations (ICLR Oral) 2022. 1.6% oral acceptance rate. [Slides]
Connect, Not Collapse: Explaining Contrastive Learning for Unsupervised Domain Adaptation. Kendrick Shen*, Robbie Jones*, Ananya Kumar*, Sang Michael Xie*, Jeff Z. HaoChen, Tengyu Ma, Percy Liang. International Conference on Machine Learning (ICML Long Talk) 2022. 2.1% long talk acceptance rate. [Slides]
In-N-Out: Pre-Training and Self-Training using Auxiliary Information for Out-of-Distribution Robustness. Sang Michael Xie*, Ananya Kumar*, Robbie Jones*, Fereshte Khani, Tengyu Ma, Percy Liang. International Conference on Learning Representations (ICLR) 2021.
Understanding Self-Training for Gradual Domain Adaptation. Ananya Kumar, Tengyu Ma, Percy Liang. International Conference on Machine Learning (ICML) 2020.
Verified Uncertainty Calibration. Ananya Kumar, Percy Liang, Tengyu Ma. Neural Information Processing Systems (NeurIPS Spotlight) 2019. 3.0% spotlight / oral acceptance rate.
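The Verified Uncertainty Calibration paper combines a parametric scaling step with histogram binning, so that the calibration error of the resulting model can be measured reliably. As a rough, hypothetical illustration of that two-stage recipe (the function name, the gradient-descent Platt fit, and the binning details below are my own simplifications, not the paper's implementation):

```python
import numpy as np

def scaling_binning_calibrate(scores, labels, n_bins=10):
    """Hypothetical sketch of a scaling-binning style calibrator:
    fit a 1-D logistic (Platt) scaling of the scores, then discretize
    the scaled outputs into equal-mass bins and replace each output
    with the average scaled score in its bin."""
    # --- Step 1: Platt scaling (fit a, b by gradient descent on log loss) ---
    a, b = 1.0, 0.0
    lr = 0.1
    for _ in range(2000):
        p = 1.0 / (1.0 + np.exp(-(a * scores + b)))
        grad = p - labels                      # d(logloss)/d(logit)
        a -= lr * np.mean(grad * scores)
        b -= lr * np.mean(grad)
    scaled = 1.0 / (1.0 + np.exp(-(a * scores + b)))

    # --- Step 2: equal-mass binning of the scaled scores ---
    edges = np.quantile(scaled, np.linspace(0, 1, n_bins + 1))
    edges[0], edges[-1] = 0.0, 1.0
    bin_ids = np.clip(np.searchsorted(edges, scaled, side="right") - 1,
                      0, n_bins - 1)
    bin_means = np.array([
        scaled[bin_ids == k].mean() if np.any(bin_ids == k) else 0.0
        for k in range(n_bins)
    ])

    def calibrator(x):
        s = 1.0 / (1.0 + np.exp(-(a * x + b)))
        ids = np.clip(np.searchsorted(edges, s, side="right") - 1,
                      0, n_bins - 1)
        return bin_means[ids]

    return calibrator
```

Because the calibrator only ever outputs one of `n_bins` discrete values, its calibration error can be estimated from held-out data far more reliably than for a continuous scaling function alone.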
More Detailed Research Summary
To understand foundation models, we can decompose them into three key stages (Pretrain, Transfer, and Deploy; Figure below).
I have done fundamental work on each stage of the pipeline.
- How pretraining learns representations for transfer. We show that pretraining on unlabeled data from many domains, and then transferring to labeled data from one domain, improves accuracy even on the domains for which we had no labels (ICLR 2021, ICML 2022 Long Talk, NeurIPS 2022). We found that pretraining works quite differently from conventional intuitions about domain invariance.
- Improving transfer. We showed that standard approaches to transfer (e.g., fine-tuning) can be unreliable (ICLR 2022 Oral). To understand this, we developed the first theory examining how pretrained representations evolve during transfer. Our theory led to better transfer algorithms. In follow-up work, we have used these to get state-of-the-art accuracy on WILDS Camelyon (tumor detection), WILDS FMoW (satellite remote sensing), and WILDS iWildCam (wildlife conservation), and our methods are a core component of the state of the art on ImageNet.
- Reliable deployment in the wild. My goal is to build ML systems that are reliable, and I collaborate with researchers in areas such as sustainability, ethics, radiology, and natural language processing to achieve this. We have built methods and theory to improve robustness to distribution shifts (ICML 2020, NeurIPS 2020, ICLR 2021, ICLR 2022 Oral), uncertainty calibration (NeurIPS 2019 Spotlight, ICLR 2021, UAI 2022), including in large language models, and AI safety and fairness (ICLR 2019, NeurIPS 2022).
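One concrete remedy from the transfer work above is to fit the linear head on frozen pretrained features first (linear probing), and only then fine-tune all parameters jointly, so that fine-tuning starts from a sensible head rather than a random one. The sketch below is a toy, linear-model caricature of that two-stage idea on a synthetic regression task; all variable names are illustrative, and the real method applies to deep networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a "pretrained" linear encoder W and a regression target.
n, d, k = 200, 20, 5
X = rng.normal(size=(n, d))
W = rng.normal(size=(d, k)) / np.sqrt(d)     # pretrained features
w_true = rng.normal(size=k)
y = X @ W @ w_true + 0.1 * rng.normal(size=n)

def mse(enc, head):
    return np.mean((X @ enc @ head - y) ** 2)

# Stage 1: linear probing -- fit only the head v on frozen features.
Z = X @ W
v = np.linalg.lstsq(Z, y, rcond=None)[0]
probe_loss = mse(W, v)

# Stage 2: fine-tuning -- update encoder and head jointly by gradient
# descent, starting from the probed head rather than a random one.
lr = 1e-3
Wf = W.copy()
for _ in range(500):
    r = X @ Wf @ v - y                       # residuals
    grad_W = 2.0 / n * X.T @ np.outer(r, v)  # dL/d(encoder)
    grad_v = 2.0 / n * (X @ Wf).T @ r        # dL/d(head)
    Wf -= lr * grad_W
    v -= lr * grad_v
ft_loss = mse(Wf, v)
```

Because the head is already near-optimal when stage 2 begins, the early fine-tuning gradients stay small, which is the intuition for why this schedule distorts the pretrained features less than fine-tuning from a random head.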
I have been lucky to co-advise a number of talented undergraduate and master’s students at Stanford, who have written some very insightful papers:
- Robbie Jones: ICLR 2021, ICLR 2022, ICML 2022
- Fahim Tajwar: ICML Workshop 2021, Preprint 2022
- Kendrick Shen: ICLR 2022, ICML 2022
- Michael Sun: Ongoing work on continual representation learning
- Ansh Khurrana: Ongoing work on robustness in healthcare
I have also mentored (and often proposed research directions for) a number of fantastic PhD students who have taught me a lot:
- Jeff Z. HaoChen: NeurIPS 2022 (Pretraining for robustness)
- Rishi Bommasani: NeurIPS 2022 (Fairness of foundation models)
- Nelson Liu: Preprint 2022 (Robustness of NLP models)
- Sachin Goyal (CMU): Preprint 2022 (Robust fine-tuning)
- Alex Li (CMU): Intern project on distribution shifts
Collaborators and Advisors
I spent a wonderful summer working with Suriya Gunasekar and Sebastien Bubeck in the Microsoft Research Foundations Group. During my PhD I’ve also been lucky to collaborate with Aditi Raghunathan, Sang Michael Xie, Chelsea Finn, and Zico Kolter, and learn a lot from John Duchi. Before my PhD, I spent a fun year working at DeepMind, and before that did my undergraduate thesis with Avrim Blum.