Jie (Sophia) Gao
@Data Science and AI Institute
@Center for Language and Speech Processing
About
I am currently a Malone Postdoc Fellow at Johns Hopkins University, where I am mentored by Mark Dredze, Ziang Xiao, and Chien-Ming Huang. I was fortunate to gain the different kinds of training I rely on today through my postdoc, Ph.D., and visiting student experiences. Previously, I was a Postdoctoral Associate at MIT's SMART program in Singapore, advised by Thomas W. Malone, where I studied collective intelligence and the theoretical perspective on human-AI teams. I received my Ph.D. from SUTD, advised by Simon Perrault, where I gained foundational HCI training. During my Ph.D., I was a visiting student at the University of Notre Dame, hosted by Toby Jia-Jun Li, where I learned to design innovative human-AI collaboration, and at the National University of Singapore, hosted by Shengdong Zhao, where I learned to run rigorous empirical user studies.
AI agents are here.
How do we understand them? How do we control them?
Fundamentally, I am fascinated by how people identify patterns and derive reusable principles from messy, ambiguous, and complex situations and phenomena. Text and code are my entry points into them. This is why I am drawn to analytical methods such as thematic analysis, grounded theory, content analysis, and taxonomy building. These methods help people turn complexity into understanding. I use human-AI collaboration to make these methods simpler, more accessible, and better supported, while preserving human judgment and reasoning.
Research
Three threads of human-AI collaboration, each pairing a different human role with AI on a different interpretive task.
- Agent Behavior Analysis
  Helping humans understand agents
  As AI systems take on increasingly autonomous roles, the bottleneck shifts from "can they do it" to "can humans tell whether they did it well." I am currently exploring how humans can stay meaningfully in the loop with autonomous AI systems.
- Code Comprehension
  Helping humans understand code
  Real-world codebases are messy, and newcomers often struggle to build an accurate mental model. I build tools that help developers comprehend unfamiliar code, review AI-generated changes critically, and collaborate with AI assistants in ways that preserve developer judgment.
- Qualitative Analysis
  Helping humans understand unstructured text
  How can AI support, rather than replace, the interpretive labor of reading unstructured text? I design and evaluate systems that help people code qualitative data, build shared meaning in collaborative analysis, and turn raw text into grounded theories, while keeping human judgment central.
News
- [2026.04] 🎉 Our CodeMap paper received an award at ICPC 2026! 🏆 ACM SIGSOFT Distinguished Paper Award
- [2026.04] 🎤 Happy to give a talk at NLP for Computational Social Science (Slides)
- [2026.04] 🎤 Happy to give a talk at Advanced HCI at JHU (Slides)
- [2026.03] 🎤 Gave a Claude Code How-To Session (Slides) for 40+ JHU researchers (Master's students, PhD students, postdocs, and faculty), covering practical usage and best practices of Claude Code and OpenClaw, with live demos.
- [2025.12] 🎉 Our paper on supporting developers in understanding new codebases has been accepted to the 34th IEEE/ACM International Conference on Program Comprehension (ICPC 2026)! Check out the preprint: Understanding Codebase like a Professional! Human-AI Collaboration for Code Comprehension | CodeMap Website
- [2025.11] ✈️ I attended EMNLP 2025 in Suzhou, China. We presented our position paper, From Noise to Nuance: Enriching Subjective Data Annotation through Qualitative Analysis, on generalizing qualitative data analysis to the subjective data annotation domain.
Travel
- [2026.04] ICLR 2026, Rio de Janeiro, Brazil (cancelled — visa issues)
- [2026.04] ICSE / ICPC 2026, Rio de Janeiro, Brazil presenting (cancelled — visa issues)
- [2025.11] EMNLP 2025, Suzhou, China presenting
- [2025.10] VL/HCC 2025, North Carolina, USA
- [2025.04] CHI 2025, Yokohama, Japan
- [2024.05] CHI 2024, Honolulu, USA presenting