Proposal Stage

Interpreting Agent Behavior

Human-Centered Empirical Methods for Understanding Agents
COLM 2026 Workshop · October 9, 2026

Motivation

As commercial agentic systems such as Claude Code and Codex see widespread real-world deployment in early 2026, their behaviors have grown increasingly complex. These systems autonomously plan, execute, and iterate on multi-step tasks, generating vast amounts of behavioral data. Yet current evaluation relies almost exclusively on automated benchmarks: pass/fail metrics that reveal whether an agent succeeded but not how it behaved, why it failed, or how humans made sense of its actions. Bridging this gap between outcome-based metrics and behavioral understanding is essential for building agents that are not only effective but also interpretable, debuggable, and trustworthy.

IAB aims to build a community that brings together researchers across disciplines, including interpretability, evaluation, HCI, alignment, and computational social science. These researchers are actively developing analysis methods such as benchmarking, trace analysis, red-teaming, corpus analysis, error analysis, qualitative analysis, grounded theory, and think-aloud studies. IAB provides a venue to consolidate these efforts into a shared vocabulary, open datasets, and reproducible methodology.

Scope

We plan to investigate agent behavior from three complementary perspectives.

How do agents behave?

  • How do agents make decisions, propose and revise plans, and select and use tools during multi-step tasks?
  • How do agents behave when they face ambiguity, uncertainty, or incomplete information?
  • How do agents fail, recover from errors, or enter failure cascades during real-world execution?
  • What unexpected or emergent behaviors appear in single-agent and multi-agent systems?
  • How do different model backbones or coordination structures shape observable agent behavior?

How do humans respond?

  • How do users write, adapt, and refine prompts while working with agents?
  • How do humans verify agent-generated outputs and decide when more checking is needed?
  • How do people calibrate trust in agent outputs, and when does over-reliance emerge?
  • How do users form and update mental models of what agents can and cannot do?
  • How do people interpret, monitor, and make sense of ongoing agent execution?

How do they interact?

  • How do humans and agents collaborate more effectively on complex, open-ended tasks?
  • How do humans communicate intent, constraints, and goals to agents across multi-turn interactions?
  • How do agents and humans negotiate misunderstandings, breakdowns, and repair during collaboration?
  • What patterns emerge from interaction traces, tool-use logs, and human-agent conversations over time?
  • How can we systematically analyze these interactions using empirical methods from HCI and the social sciences?

Speakers

We thank the following speakers, who have confirmed or expressed interest in giving a talk.

Armando Solar-Lezama · MIT CSAIL · Program Synthesis · Confirmed
Diyi Yang · Stanford University · Human-centered NLP · Confirmed
Bowen Baker · OpenAI · Multi-Agent Systems · Confirmed
Graham Neubig · Carnegie Mellon University · Language Agents · Tentative

Schedule

Full-day workshop with keynotes, paper presentations, posters, and a panel discussion.

09:00 – 09:10  Opening Remarks
09:10 – 09:45  Keynote: Armando Solar-Lezama (30 min + 5 min Q&A)
09:45 – 10:20  Keynote: Diyi Yang (30 min + 5 min Q&A)
10:20 – 10:50  Paper Presentations (2 × 15 min)
10:50 – 11:40  Poster Session #1 + Coffee Break
11:40 – 12:10  Paper Presentations (2 × 15 min)
12:10 – 13:10  Lunch
13:10 – 13:45  Keynote: Bowen Baker (30 min + 5 min Q&A)
13:45 – 15:00  Paper Presentations (5 × 15 min)
15:00 – 15:50  Poster Session #2 + Coffee Break
15:50 – 16:35  Panel: Empirical Methods for Understanding Agent Behavior (Armando Solar-Lezama, Diyi Yang, Bowen Baker, Graham Neubig)
16:35 – 16:50  Best Paper Award + Closing Remarks

Call for Papers

We solicit two types of non-archival submissions on understanding agent behavior. We welcome empirical studies, datasets, benchmarks, methods papers, tools, and demos, and we particularly encourage negative results and methodological position papers.

Long Papers

Up to 9 pages + references. For full empirical studies, datasets, benchmarks, or comprehensive analyses.

Short Papers

Up to 4 pages + references. For position papers, tools, demos, preliminary findings, and negative results.

Review Process

Submissions follow COLM-style formatting and undergo double-anonymous review via OpenReview, with at least two reviews per submission.

Submission deadline: Jun 23, 2026
Notification: Jul 24, 2026
Workshop: Oct 9, 2026
Format: Non-archival

Organizing Committee

Jie (Sophia) Gao · Johns Hopkins University · Human-AI Collaboration
Kaiser Sun · Johns Hopkins University · LLM Interpretability
Teresa Yeo · Google DeepMind · Model Robustness
Daniel Khashabi · Johns Hopkins University · Reliable Language AI
Zhuoran Lu · Accenture · Human-AI Decision Making
Boyuan Zheng · xAI · Web Agents & Safety
Katherine Van Koevering · Johns Hopkins University · Computational Social Science
Sijie Ji · Caltech · Physical AI & CPS
Jen-tse Huang · Johns Hopkins University · LLM Evaluation

Advisory Board

We thank the faculty and senior researchers who have provided guidance and support for this workshop.

Ziang Xiao (JHU) · Soufiane Hayou (JHU) · Toby Jia-Jun Li (Notre Dame) · Hang Jiang (Northeastern) · Weiyan Shi (Northeastern) · Wei Lu (Nanyang Technological University, Singapore) · Samuel Nathanson (xAI)

Sponsors

To be announced