Prof. Jinwoo Choi's Lab (Vision and Learning Lab): Video Explainable AI Paper Accepted as a Spotlight at NeurIPS 2025
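A paper written by Jongseo Lee, a master's student in computer engineering, and Wooil Lee, an undergraduate researcher, both of Prof. Jinwoo Choi's lab (Vision and Learning Lab), has been selected as a Spotlight at Neural Information Processing Systems (NeurIPS) 2025, the world's most prestigious conference in machine learning. This is the first time a paper from Kyung Hee University has been selected as a Spotlight at a top machine learning conference (NeurIPS, ICML, ICLR). The paper will be presented in San Diego, USA, in December 2025.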
[Paper Information]
Title: Disentangled Concepts Speak Louder Than Words: Explainable Video Action Recognition
Authors: Jongseo Lee, Wooil Lee, Gyeong-Moon Park*, Seong Tae Kim*, and Jinwoo Choi* (* corresponding authors)
Venue: Neural Information Processing Systems (NeurIPS) 2025
TL;DR
We propose DANCE, a framework that explains video action recognition models in a structured, motion-aware manner by disentangling concept types into motion dynamics, objects, and scenes.
Abstract
Effective explanations of video action recognition models should disentangle how movements unfold over time from the surrounding spatial context. However, existing methods—based on saliency—produce entangled explanations, making it unclear whether predictions rely on motion or spatial context. Language-based approaches offer structure but often fail to explain motions due to their tacit nature—intuitively understood but difficult to verbalize. To address these challenges, we propose Disentangled Action aNd Context concept-based Explainable (DANCE) video action recognition, a framework that predicts actions through disentangled concept types: motion dynamics, objects, and scenes. We define motion dynamics concepts as human pose sequences. We employ a large language model to automatically extract object and scene concepts. Built on an ante-hoc concept bottleneck design, DANCE enforces prediction through these concepts. Experiments on four datasets—KTH, Penn Action, HAA500, and UCF-101—demonstrate that DANCE significantly improves explanation clarity with competitive performance. We validate the superior interpretability of DANCE through a user study. Experimental results also show that DANCE is beneficial for model debugging, editing, and failure analysis.
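As a rough illustration of the concept bottleneck idea mentioned in the abstract, the sketch below routes a prediction through three groups of concept activations (motion dynamics, objects, scenes) and reports how much each group contributed to the predicted class. This is not the authors' implementation: the function name, the data layout, the plain linear classifier, and all toy numbers are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical concept groups, mirroring the three types named in the abstract.
GROUPS = ("motion", "object", "scene")


def predict_with_concepts(
    activations: Dict[str, List[float]],
    weights: Dict[str, List[List[float]]],
    bias: List[float],
) -> Tuple[int, Dict[str, float]]:
    """Linear classifier that sees only concept activations (the bottleneck).

    activations[g][k] : activation of concept k in group g
    weights[g][c][k]  : weight from concept k of group g to class c
    Returns the predicted class index and, for that class, the logit
    contribution of each concept group.
    """
    num_classes = len(bias)
    logits = list(bias)
    contrib = {g: [0.0] * num_classes for g in GROUPS}
    for g in GROUPS:
        for c in range(num_classes):
            # Contribution of group g to class c is a weighted sum of its concepts.
            s = sum(w * a for w, a in zip(weights[g][c], activations[g]))
            contrib[g][c] = s
            logits[c] += s
    pred = max(range(num_classes), key=lambda c: logits[c])
    return pred, {g: contrib[g][pred] for g in GROUPS}


# Toy example (made-up numbers): two classes, two motion concepts,
# one object concept, one scene concept.
toy_activations = {"motion": [0.9, 0.1], "object": [0.2], "scene": [0.5]}
toy_weights = {
    "motion": [[2.0, 0.0], [0.0, 2.0]],
    "object": [[0.5], [0.5]],
    "scene": [[0.0], [1.0]],
}
toy_bias = [0.0, 0.0]

pred, by_group = predict_with_concepts(toy_activations, toy_weights, toy_bias)
```

In this toy setup, comparing the per-group values in `by_group` indicates whether the predicted class leaned mainly on motion concepts or on spatial context, which is the kind of separation the abstract describes.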
2025.09.23