Explainable Affective Body Expression Recognition with Multi-Scale Spatiotemporal Encoding and LLM-Based Reasoning

Tao Wang, Haifeng Lu, Jiayi Duan, Tianyu Meng, Rui Mao, Shuang Liu*, Dong Ming*, IEEE Transactions on Affective Computing, 2026, Early Access.

Jan 1, 2026

Abstract

This work presents an explainable affective body expression recognition framework that integrates multi-scale spatiotemporal encoding with LLM-based reasoning. The framework uses MSCMNet to encode body movement patterns across scales, bidirectional state-space modeling to capture temporal dependencies, and an Emotion-Action Interpreter to generate human-readable explanations. A spatiotemporal semantic understanding module and cross-dataset joint training further improve generalization. Experiments show accuracy improvements of up to 7.83% and stronger explainable reasoning than general-purpose multimodal large language models such as GPT-4o and Gemini 1.5 Pro.
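For readers unfamiliar with bidirectional state-space modeling, the idea can be illustrated with a toy example. The sketch below is not MSCMNet or the authors' implementation; it assumes a scalar linear recurrence with hypothetical parameters a, b, and c, and simply runs it forward and backward over a sequence so that each time step sees both past and future context.

```python
from typing import List


def ssm_scan(x: List[float], a: float, b: float, c: float) -> List[float]:
    """Run a scalar linear state-space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t,  y_t = c * h_t,  starting from h = 0.
    The parameters a, b, c are illustrative assumptions, not values from the paper.
    """
    h = 0.0
    ys = []
    for x_t in x:
        h = a * h + b * x_t  # state update from the previous state and current input
        ys.append(c * h)     # readout of the current state
    return ys


def bidirectional_ssm(
    x: List[float], a: float = 0.5, b: float = 1.0, c: float = 1.0
) -> List[float]:
    """Combine a forward scan and a backward scan by summing their outputs."""
    forward = ssm_scan(x, a, b, c)
    # Scan the reversed sequence, then flip the result back to the original order.
    backward = ssm_scan(x[::-1], a, b, c)[::-1]
    return [f + g for f, g in zip(forward, backward)]


if __name__ == "__main__":
    # An impulse in the middle of the sequence spreads to both neighbors.
    print(bidirectional_ssm([0.0, 1.0, 0.0]))
```

With a single forward scan, the first time step would receive no information from the impulse at the second step; summing the two directions is one simple way to give every step access to the whole sequence. Real models typically use learned, higher-dimensional parameters and may combine directions differently.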

Type

Journal article

Publication

IEEE Transactions on Affective Computing