Ananya Kumar

I am a Research Scientist at OpenAI working on the Science of Deep Learning. Before that, I was a PhD student at Stanford University advised by Percy Liang and Tengyu Ma. Here is my CV.

Research overview: Machine learning is undergoing a paradigm shift where we pretrain foundation models (general-purpose models learned from large unlabeled datasets) and then transfer these models to learn a wide range of tasks of interest. Examples of this paradigm include CLIP, SimCLR, and GPT-4.

I have built some of the first theoretical understanding of these methods and turned the theoretical results into improved practical algorithms: our methods have led to state-of-the-art accuracies on ImageNet (the most popular machine learning benchmark) and in applications such as satellite remote sensing, wildlife conservation, and radiology.

A lot of my work focuses on developing machine learning models that can be reliably deployed in the wild, which requires robustness to distribution shifts, uncertainty calibration, and safety and fairness.

To build a community around these topics, we are organizing an exciting new workshop at ICLR 2023 called ME-FoMo (Mathematical and Empirical Understanding of Foundation Models). Please submit!

Selected Publications

(See all publications at this link)

Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution. Ananya Kumar, Aditi Raghunathan, Robbie Jones, Tengyu Ma, Percy Liang. International Conference on Learning Representations (ICLR Oral) 2022. 1.6% oral acceptance rate. [Slides]

Connect, Not Collapse: Explaining Contrastive Learning for Unsupervised Domain Adaptation. Kendrick Shen*, Robbie Jones*, Ananya Kumar*, Sang Michael Xie*, Jeff Z. HaoChen, Tengyu Ma, Percy Liang. International Conference on Machine Learning (ICML Long Talk) 2022. 2.1% long talk acceptance rate. [Slides]

In-N-Out: Pre-Training and Self-Training using Auxiliary Information for Out-of-Distribution Robustness. Sang Michael Xie*, Ananya Kumar*, Robbie Jones*, Fereshte Khani, Tengyu Ma, Percy Liang. International Conference on Learning Representations (ICLR) 2021.

Understanding Self-Training for Gradual Domain Adaptation. Ananya Kumar, Tengyu Ma, Percy Liang. International Conference on Machine Learning (ICML) 2020.

Verified Uncertainty Calibration. Ananya Kumar, Percy Liang, Tengyu Ma. Neural Information Processing Systems (NeurIPS Spotlight) 2019. 3.0% spotlight / oral acceptance rate.

More Detailed Research Summary

To understand foundation models, we can decompose them into three key stages (Pretrain, Transfer, and Deploy; Figure below).

[Figure: the three stages of the foundation model pipeline (Pretrain, Transfer, and Deploy)]

I have done fundamental work on each stage of the pipeline.

How to pretrain. We showed that contrastive pretraining on unlabeled data from many domains, followed by transfer to labeled data from one domain, improves accuracy even on the domains where we had no labels (ICLR 2021, ICML 2022 Long Talk, NeurIPS 2022). We found that pretraining works very differently from what conventional intuitions about domain invariance suggest. Our theory led to improved methods for pretraining.
How to transfer. We showed that standard approaches for transfer (e.g., fine-tuning) can be unreliable (ICLR 2022 Oral). To understand this, we developed the first theory examining how pretrained representations evolve during the process of transfer. Our theory led to better transfer algorithms. In follow-up works, we have used these algorithms to get state-of-the-art accuracy on WILDS Camelyon (tumor detection), WILDS FMoW (satellite remote sensing), and WILDS iWildCam (wildlife conservation), and our methods are a core component of the state of the art on ImageNet.
Reliable deployment in the wild. My goal is to build ML systems that are reliable, and I collaborate with researchers in areas such as sustainability, ethics, radiology, and natural language processing to achieve this. We’ve built methods and theory to improve robustness to distribution shifts (ICML 2020, NeurIPS 2020, ICLR 2021, ICLR 2022 Oral), uncertainty calibration (NeurIPS 2019 Spotlight, ICLR 2021, UAI 2022), including in large language models, and AI safety and fairness (ICLR 2019, NeurIPS 2022).

Students Advised

I have been lucky to co-advise a number of talented undergraduate and master’s students at Stanford, who have written some very insightful papers:

Fahim Tajwar: ICML Workshop 2021, ICLR 2023, Preprint 2023
- Next: PhD Student at CMU
Kendrick Shen: ICLR 2022, ICML 2022
- Next: ML Research Engineer at Genesis Therapeutics
Michael Sun: Intern project on continual learning
- Next: PhD Student at MIT
Robbie Jones: ICLR 2021, ICLR 2022, ICML 2022
- Next: ML Software Engineer at GridSpace
Vaish Srivastava: Ongoing projects on uncertainty quantification

I have also mentored or proposed research directions for a number of fantastic PhD students who have taught me a lot:

Sachin Goyal (CMU): CVPR 2023 (Robust fine-tuning)
Jeff Z. HaoChen: NeurIPS 2022 (Pretraining for robustness)
Rishi Bommasani: NeurIPS 2022 (Fairness of foundation models)
Nelson Liu: ACL 2023 (Robustness of NLP models)
Alex Li (CMU): Intern project on distribution shifts

Collaborators and Advisors

I spent a wonderful summer working with Suriya Gunasekar and Sebastien Bubeck in the Microsoft Research Foundations Group. During my PhD I was also lucky to collaborate with Aditi Raghunathan, Sang Michael Xie, Chelsea Finn, and Zico Kolter, and to learn a lot from John Duchi. Before my PhD, I spent a fun year working at DeepMind, and before that I did undergraduate research with Avrim Blum, Guy Blelloch, and Bob Harper.