Title: Language Models as World Models
Abstract: The extent to which language modeling induces representations of the world described by text, and the broader question of what can be learned about meaning from text alone, have remained subjects of ongoing debate across NLP and the cognitive sciences. I'll discuss a few pieces of recent work aimed at understanding whether (and how) representations in transformer LMs linearly encode interpretable and controllable representations of facts and situations. I'll begin by presenting evidence from probing experiments suggesting that LM representations encode (rudimentary) information about entities' properties and dynamic state, and that these representations are causally implicated in downstream language generation. Despite this, even today's largest LMs are prone to glaring semantic errors: they hallucinate facts, contradict input text, or even contradict their own previous outputs. Building on our understanding of how LM representations influence behavior, I'll describe a "representation editing" model called REMEDI that can correct these errors by intervening directly in LM activations. I'll close with some recent experiments that complicate this story: much of LMs' "knowledge" remains inaccessible to readout or manipulation with simple probes. A great deal of work is still needed to build language generation systems with fully transparent and controllable models of the world.
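To make the idea of a probing experiment concrete, the following is a minimal illustrative sketch, not the speaker's actual setup: a linear probe trained on a transformer LM's hidden states to test whether a simple entity property is linearly decodable. The choice of model ("gpt2"), the layer index, and the toy sentences and labels are assumptions made purely for illustration.

```python
# Illustrative linear-probe sketch (assumed model, layer, and toy data).
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

# Toy probing data: sentences paired with a binary "state" label
# (1 = object described as open, 0 = closed).
texts = ["She opened the box.", "She closed the box.",
         "He opened the door.", "He closed the door."]
labels = [1, 0, 1, 0]

def last_token_state(text, layer=6):
    """Return the hidden state of the final token at a chosen layer."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.hidden_states[layer][0, -1].numpy()

# Fit a linear probe on the extracted representations.
X = [last_token_state(t) for t in texts]
probe = LogisticRegression(max_iter=1000).fit(X, labels)
print("probe train accuracy:", probe.score(X, labels))
```

In probing work of the kind the abstract describes, such a classifier would be trained and evaluated on held-out data, and causal involvement would be tested separately, for example by intervening on the activations rather than only reading them out.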
Bio: Jacob Andreas is the X Consortium Assistant Professor at MIT. His research aims to build intelligent systems that can communicate effectively using language and learn from human guidance. Jacob earned his Ph.D. from UC Berkeley, his M.Phil. from Cambridge (where he studied as a Churchill scholar) and his B.S. from Columbia. He has been named a National Academy of Sciences Kavli Fellow, and has received the NSF CAREER award, MIT's Junior Bose and Kolokotrones teaching awards, and paper awards at NAACL and ICML.