Comparative Analysis of Machine Learning and Deep Learning Models for Harassment and Discrimination Detection in Text
Abstract
Harassment and discrimination affect both
workplace environments and online platforms. To
address this issue, we focus on automatically detecting
such behaviors in textual data to help create safer
digital spaces. In this article, we compare traditional
machine learning and deep learning models for detecting
harassment and discrimination. We evaluate four
approaches: TF-IDF with logistic regression, BERT-based
classification, a CNN with GloVe embeddings, and a
GRU model enhanced with attention mechanisms and
capsule networks. For all experiments, we rely on the
Everyday Sexism Project dataset, which groups the texts
into five categories: Workplace Harassment, Harassment,
Discrimination, Sexism, and Other. We evaluate the models
using accuracy, precision, recall, and F1-score.
The results show that deep learning models
outperform traditional methods in identifying complex
linguistic patterns in abusive content.
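
To make the evaluation metrics concrete, the following
minimal sketch computes accuracy and macro-averaged
precision, recall, and F1 over the five categories. The
macro averaging, the function name `evaluate`, and the toy
labels are assumptions made for illustration; they are not
taken from the experiments reported here.

```python
from collections import Counter

# The five categories named in the abstract.
LABELS = [
    "Workplace Harassment",
    "Harassment",
    "Discrimination",
    "Sexism",
    "Other",
]


def evaluate(y_true, y_pred, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging is an assumption: each class contributes equally,
    regardless of how many examples it has.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        # Undefined ratios (no predictions or no examples) are scored as 0.
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for illustration only (not drawn from the dataset).
y_true = ["Sexism", "Harassment", "Other",
          "Discrimination", "Workplace Harassment", "Sexism"]
y_pred = ["Sexism", "Harassment", "Other",
          "Sexism", "Workplace Harassment", "Other"]

scores = evaluate(y_true, y_pred)
print(scores)
```

Under macro averaging, a class that is never predicted
correctly (here, Discrimination) pulls the averaged scores
down even when overall accuracy looks acceptable, which
matters when the categories are imbalanced.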
Keywords
Machine learning, deep learning, harassment, discrimination