I am an Assistant Professor of Applied AI and Kathryn and Grant Swick Faculty Scholar at UChicago Booth, where I work on behavior-bound machine learning.

Machine learning is not a sterile industrial process: just as it is hardware-bound and software-bound, it is also bound by the behavior of real-world actors such as workers, firms, and states. I create machines that are compatible with actual actors, not just idealized ones.

My work has received an ICML 2022 Outstanding Paper award, a Facebook Fellowship, and an NSERC PGS-D. Before joining UChicago, I received my PhD in Computer Science from Stanford University, where I was advised by Dan Jurafsky. I have also spent time at Princeton Language & Intelligence (post-doc) and the University of Toronto (MSc, BSc).

Notable work:

  • We propose framing dataset difficulty as a lack of V-usable information with respect to a given model family. This led to the development of Stanford Human Preferences (SHP), the first large-scale dataset of human preferences over natural text, built from Reddit data. SHP was the only academic dataset used to post-train Llama-2, one of the most downloaded LLMs ever; the other datasets came from the likes of OpenAI and Meta itself. Within a year of SHP's release, Reddit began licensing its data for AI training in deals valued at 200M+ USD.
    • Understanding Dataset Difficulty with V-Usable Information.
      Kawin Ethayarajh, Yejin Choi, and Swabha Swayamdipta.
      ICML 2022 (outstanding paper - top 10 of 1233 accepted).
      paper tweet 1 tweet 2 code dataset 1 dataset 2
  • We prove that the top post-training objectives (PPO, DPO) capture perceptual biases in how humans perceive random variables. Surprisingly, these biases—like loss aversion—make post-training more efficient and performant than it otherwise would be. By integrating these biases more intentionally, we create new post-training techniques: KTO is the industry standard for aligning LLMs on offline binary feedback; humanline variants erase the gap between offline and online alignment.
    • KTO: Model Alignment as Prospect Theoretic Optimization.
      Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, and Douwe Kiela.
      ICML 2024 (spotlight - top 3.5% of accepted).
      paper tweet code press
    • Humanline: Online Alignment as Perceptual Loss.
      Sijia Liu, Niklas Muennighoff, and Kawin Ethayarajh.
      ICLR 2026.
      paper tweet code
  • Contextual representations from LLMs are anisotropic: they occupy a narrow cone in vector space, making cosine similarity unreliable as a measure of semantic relatedness. This paper identified and named the phenomenon, which has since become a foundational concept in NLP—cited as core motivation for contrastive learning approaches like SimCSE, embedding post-processing methods like whitening, and a broad literature on improving text representations.
    • How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings.
      Kawin Ethayarajh.
      EMNLP 2019 (oral).
      paper tweet code
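For readers curious about the dataset-difficulty framing above, here is a one-line sketch of V-usable information as I would summarize it (notation follows the V-information framework the paper builds on; see the paper for the precise conditions):

```latex
% V-usable information: how much information X carries about label Y
% that a model family V (e.g., BERT-sized models) can actually exploit.
\mathcal{I}_{\mathcal{V}}(X \to Y) \;=\; H_{\mathcal{V}}(Y) \;-\; H_{\mathcal{V}}(Y \mid X),
\qquad
H_{\mathcal{V}}(Y \mid X) \;=\; \inf_{f \in \mathcal{V}} \, \mathbb{E}\!\left[\, -\log f[X](Y) \,\right]
```

Under this framing, a harder dataset is one whose inputs carry less V-usable information about the labels for the model family at hand, so difficulty is relative to the models doing the learning rather than absolute.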
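As background for the prospect-theory connection in the KTO work above, the classic Kahneman–Tversky value function captures the perceptual biases in question: outcomes are valued relative to a reference point, with diminishing sensitivity and a steeper slope for losses (loss aversion). A standard textbook form (Tversky & Kahneman, 1992; this is context, not the KTO objective itself) is:

```latex
% Value is relative to a reference point z_ref, concave in gains,
% convex in losses, and steeper for losses (lambda > 1: loss aversion).
v(z) \;=\;
\begin{cases}
\;(z - z_{\mathrm{ref}})^{\alpha} & \text{if } z \ge z_{\mathrm{ref}} \\[2pt]
\;-\lambda \,(z_{\mathrm{ref}} - z)^{\beta} & \text{if } z < z_{\mathrm{ref}}
\end{cases}
\qquad \alpha \approx \beta \approx 0.88,\;\; \lambda \approx 2.25
```

KTO builds a post-training loss around a value function with this shape, applied to a model-based reward relative to a reference point; the paper gives the exact objective.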
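The anisotropy finding above is easy to reproduce in miniature. The toy simulation below (hypothetical data, not real LLM embeddings) shows why a narrow cone breaks cosine similarity: vectors that share one dominant direction all look alike under cosine, even when everything else about them is unrelated noise.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 768                                      # typical embedding width
shared = 5.0 * rng.normal(size=d)            # dominant shared direction
aniso = shared + rng.normal(size=(100, d))   # "anisotropic" embeddings: a narrow cone
iso = rng.normal(size=(100, d))              # isotropic baseline

def mean_cos(X):
    """Mean pairwise cosine similarity over all distinct pairs of rows."""
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    sims = X @ X.T
    return sims[np.triu_indices(len(X), k=1)].mean()

print(mean_cos(aniso))  # close to 1, despite the vectors being unrelated noise
print(mean_cos(iso))    # close to 0: directions spread over the whole space
```

In the anisotropic case, high cosine similarity reflects the shared direction rather than any semantic relatedness, which is exactly why the paper's finding motivated post-processing and contrastive fixes.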