Detection of Tendency to Depression through Text Analysis
Abstract
A project is proposed with the objective of detecting tendencies toward depression through text analysis, using Natural Language Processing technologies and Large Language Models (LLMs). The development comprised several phases, such as the selection and preprocessing of English transcripts from the low-resource Distress Analysis Interview Corpus – Wizard of Oz (DAIC-WOZ) dataset [18, 19], as well as the training of models based on Transformer architectures, specifically Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT Approach (RoBERTa), and Decoding-enhanced BERT with Disentangled Attention (DeBERTa). The results highlight the performance of the fine-tuned BERT model, which achieved better metrics than the other architectures evaluated (the fine-tuned RoBERTa and DeBERTa models), with an average F1 score of 0.76 and a consistently high Receiver Operating Characteristic – Area Under the Curve (ROC-AUC) value > 0.82. This demonstrates its ability to balance precision and sensitivity, as well as to identify linguistic patterns associated with depressive symptoms.
Keywords
Linguistic patterns, BERT, tendencies, depression, fine-tuning
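The fine-tuning setup described in the abstract can be sketched as a sequence-classification head on a BERT encoder. This is a minimal illustration, not the authors' code: it assumes the Hugging Face `transformers` and `torch` libraries, and it uses a tiny randomly initialized `BertConfig` so the example runs offline, whereas the paper fine-tunes pretrained BERT/RoBERTa/DeBERTa checkpoints on DAIC-WOZ transcripts.

```python
# Minimal sketch (not the paper's implementation): a binary
# depression-tendency classifier built from a BERT encoder plus a
# sequence-classification head, via Hugging Face transformers.
import torch
from transformers import BertConfig, BertForSequenceClassification

# Tiny, randomly initialized config so the sketch runs without
# downloading weights; the study fine-tunes pretrained checkpoints.
config = BertConfig(
    vocab_size=30522,      # default BERT WordPiece vocabulary size
    hidden_size=64,        # shrunk for the sketch; BERT-base uses 768
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=128,
    num_labels=2,          # depressive-tendency vs. control
)
model = BertForSequenceClassification(config)

# Dummy tokenized batch standing in for DAIC-WOZ interview transcripts.
input_ids = torch.randint(0, config.vocab_size, (2, 16))
attention_mask = torch.ones_like(input_ids)
labels = torch.tensor([0, 1])

# A forward pass with labels returns both logits and a cross-entropy
# loss, which a fine-tuning loop would backpropagate.
out = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
print(out.logits.shape)  # (batch, num_labels) -> torch.Size([2, 2])
```

In an actual fine-tuning run, `BertForSequenceClassification.from_pretrained(...)` would load pretrained weights, and the F1 and ROC-AUC values reported above would be computed on held-out transcripts.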