Pre-trained Model Sentiment Analysis of Tunisian Telecommunications Operators’ Comments on Social Media
Abstract
Sentiment analysis (SA) has emerged as a crucial computational method for extracting subjective information from text, enabling organizations to transform unstructured opinions into actionable insights that drive strategic decision-making across domains ranging from business intelligence to public policy formation [46]. Pre-trained models for SA have gained significant attention for improving opinion extraction from text. In recent years, social media has become a crucial platform for customer engagement, with SA playing a key role in maintaining client loyalty. Extracting sentiments from comments and reviews is particularly challenging for under-resourced languages such as the Tunisian Dialect (TD), which is written in both Arabizi and Arabic scripts. Despite advancements in SA, processing TD remains complex. In this study, BERT and CNN-Bidirectional LSTM models are employed to perform SA on unstructured data collected from Facebook. The dataset, TUNisian TElecom Sentiment Analysis (TUNTESA), consists of 27,080 Arabizi and 17,816 Arabic comments sourced from the official Facebook pages of telecommunications operators. Each comment is labeled as positive, negative, or neutral. The results demonstrate high accuracy (Acc), with the BERT Arabic model achieving 0.99 and the BERT Arabizi model reaching 0.94, outperforming existing studies. These findings highlight the practical applications of SA for businesses that leverage social media interactions. By effectively analyzing sentiments, telecom operators can enhance customer satisfaction, manage relationships, and extract valuable feedback, ultimately maintaining a competitive edge.
Keywords
Sentiment analysis, social media, telecom operators, Tunisian dialect