Depression recognition using a proposed speech chain model fusing speech production and perception features

Minghao Du, Shuang Liu*, Tao Wang, Wenquan Zhang, Yufeng Ke, Long Chen, Dong Ming*. Journal of Affective Disorders, 2023, 323:299-308.

Feb 15, 2023


Abstract

Audio-based depression recognition is a useful auxiliary tool for early screening, but many existing methods focus mainly on speech perception features and overlook vocal-tract changes. This work proposes a machine speech chain model for depression recognition (MSCDR), which captures text-independent depressive speech representations along the chain from speech production to speech perception. Linear predictive coding (LPC) and Mel-frequency cepstral coefficients (MFCC) are extracted to characterize speech production and perception, respectively, and deep sequential modeling is used to capture intra- and inter-segment depressive features. Experiments on two public datasets yield accuracies of 0.77 and 0.86, indicating that speech production and perception features are complementary for depression analysis.
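To make the speech production side of the abstract concrete, the following is a minimal sketch of frame-wise LPC extraction using the autocorrelation method with the Levinson-Durbin recursion. It is an illustration only, not the authors' pipeline: the function names (`autocorrelation`, `lpc`, `frame_lpc`), the Hamming window, and the frame length, hop, and model order are all assumptions chosen for the example, since the abstract does not state the extraction settings. MFCC extraction is not shown; in practice it would typically come from an audio library.

```python
import math


def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Levinson-Durbin recursion.

    Returns (a, err) where a[0] == 1 and the prediction error is
    e[n] = x[n] + sum_{j=1..order} a[j] * x[n - j];
    err is the residual energy after the final step.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    if err == 0.0:
        # Silent frame: no prediction possible, return the trivial filter.
        return a, 0.0
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for step i
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def frame_lpc(signal, frame_len, hop, order):
    """Hamming-window each frame and return its LPC coefficients a[1..order].

    frame_len, hop and order are illustrative values supplied by the caller,
    not settings taken from the paper.
    """
    window = [
        0.54 - 0.46 * math.cos(2.0 * math.pi * i / (frame_len - 1))
        for i in range(frame_len)
    ]
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        a, _ = lpc(frame, order)
        feats.append(a[1:])
    return feats
```

Each row returned by `frame_lpc` is a per-frame description of the vocal-tract filter; a sequence model could consume these rows alongside perception features such as MFCCs, which is the general idea of fusing the two feature families.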

Type

Journal article

Publication

Journal of Affective Disorders

