Eve Fleisig
PhD student at UC Berkeley | AI Ethics + NLP
I'm a fifth-year PhD student in computer science at UC Berkeley, advised by Dan Klein. My research lies at the intersection of natural language processing (NLP) and AI ethics: how can we design language models that we trust to benefit all users, without perpetuating societal harms? To do so, I work on training and evaluating language models to serve complex distributions of users with varied needs. This includes learning from informative disagreement among users, evaluating discrimination against users who speak differently, and designing frameworks for LLMs that serve populations with many perspectives.
Previously, I earned a BSE in computer science with a minor in linguistics at Princeton University, advised by Christiane Fellbaum.
My research is supported by an NSF Graduate Research Fellowship and a Berkeley Chancellor's Fellowship.
Recent News
- ✈️ I'm attending EMNLP and NeurIPS 2025. Come say hi!
- 💬 [Nov '25] Invited panelist for NLPerspectives at EMNLP 2025.
- 💬 [Oct '25] Invited to the 2025 RCAIS doctoral consortium.
- ✈️ [Oct '25] Visiting Dirk Hovy at the Milan NLP lab this fall.
- 💬 [Aug '25] Invited talk at Edinburgh NLP on GRACE.
- 🏆 [July '25] My 3-minute thesis won 2nd place at the LSA Summer Institute.
- 🏆 [May '25] Our AdvScore paper won Outstanding Paper at NAACL 2025.
- 💬 [May '24] Invited talk at Stanford NLP on Linguistic Bias in ChatGPT.
Selected Work
Please see Google Scholar or Semantic Scholar for an up-to-date list.
Balancing Quality and Variation: Spam Filtering Distorts Data Label Distributions
Eve Fleisig*, Matthias Orlikowski*, Philipp Cimiano, Dan Klein
GRACE: A Granular Benchmark for Evaluating Model Calibration against Human Calibration
Yoo Yeon Sung*, Eve Fleisig*, Yu Hou, Ishan Upadhyay, Jordan Boyd-Graber (ACL 2025).
Is your benchmark truly adversarial? AdvScore: Evaluating Human-Grounded Adversarialness
Yoo Yeon Sung, Maharshi Gor, Eve Fleisig, Ishani Mondal, Jordan Boyd-Graber (NAACL 2025 - Outstanding Paper Award).
Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination
Eve Fleisig*, Genevieve Smith*, Madeline Bossi*, Ishita Rustagi*, Xavier Yin*, Dan Klein (EMNLP 2024).
Mapping Social Choice Theory to RLHF
Jessica Dai, Eve Fleisig (R2FM @ ICLR 2024).
When the Majority is Wrong: Modeling Annotator Disagreement for Subjective Tasks
Eve Fleisig, Rediet Abebe, Dan Klein (EMNLP 2023).
Incorporating Worker Perspectives into MTurk Annotation Practices for NLP
Olivia Huang, Eve Fleisig, Dan Klein (EMNLP 2023 - Outstanding Paper Award).
Ghostbuster: Detecting Text Ghostwritten by Large Language Models
Vivek Verma, Eve Fleisig, Nicholas Tomlin, Dan Klein (NAACL 2024).
The Perspectivist Paradigm Shift: Assumptions and Challenges of Capturing Human Labels
Eve Fleisig, Su Lin Blodgett, Dan Klein, Zeerak Talat (NAACL 2024).
Hedges and Apologies in ChatGPT Responses to African-American English
Eve Fleisig (NWAV 2023).
First Tragedy, then Parse: History Repeats Itself in the New Era of Large Language Models
Naomi Saphra, Eve Fleisig, Kyunghyun Cho, Adam Lopez (NAACL 2024).
FairPrism: Evaluating fairness-related harms in text generation
Eve Fleisig, Aubrie Amstutz, Chad Atalla, Su Lin Blodgett, Hal Daumé III, Alexandra Olteanu, Emily Sheng, Dan Vann, Hanna Wallach (ACL 2023).
Mitigating Gender Bias in Machine Translation through Adversarial Learning
Eve Fleisig, Christiane Fellbaum
Outstanding Senior Thesis Award; Sigma Xi Book Award
Mentorship
I've mentored some wonderful undergraduate students, including Samuel Ghezae, Olivia Huang (→Citadel), Harbani Jaggi (→Applied Intuition), Kayla Lee (→YC startup founder), Kashyap Murali (→Anthropic), Vyoma Raman (→Stanford), Mahathi Ryali, Zaina Shaik (→Amazon), Vivek Verma (→OpenAI), and Xavier Yin (→CMU).
I am not currently taking on new undergraduates for research projects. However, I maintain a resource guide for students interested in NLP research, and I'm happy to chat about research or anything else!
Interested in chatting or collaborating?
Reach out to me at efleisig :at: berkeley :dot: edu
Miscellaneous
My surname is pronounced /'flʌɪsɪg/ ("fly"-sihg).
I'm a member of ACF, a volunteer-run organization that produces high-quality collegiate quizbowl tournaments.
🇦🇷¡Siempre estoy feliz de charlar con otros latinoamericanos!




