Multi-label classification of Arabic text using deep learning models
Date
2025-06-15
Publisher
Mohamed Boudiaf University of M'sila
Abstract
In multi-label text classification, each document is assigned several related labels, which allows finer-grained categorization than a single label. This research builds a system that categorizes Arabic texts into multiple themes using the multi-label dataset NADiA1, which contains 35,404 files across 24 categories. Three deep learning models were used: QARiB, MARBERT, and AraBERT. Data preprocessing was carried out with the pyArabic package to maintain text quality. The models were assessed by accuracy, precision, recall, micro F1-score, and Hamming loss. AraBERT performed best, with 95.76% accuracy and a micro F1-score of 0.81, followed by QARiB (95.48% accuracy and 0.80 micro F1-score) and MARBERT (94.99% accuracy and 0.77 micro F1-score). The results indicate that transformer-based deep learning techniques are highly effective for multi-label Arabic text classification, with AraBERT handling the linguistic complexities of Arabic better than the other two models.
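The evaluation metrics named in the abstract can be illustrated with a small sketch. The code below is not the thesis's evaluation code; it is a minimal pure-Python illustration of how Hamming loss and micro-averaged precision, recall, and F1 are commonly computed for multi-label predictions. The function names (`hamming_loss`, `micro_prf`) and the toy label matrices are assumptions introduced only for this example, and the four labels stand in for the 24 categories of the dataset.

```python
from typing import List, Tuple

# Each row is one document; each column is one label (1 = label assigned).
LabelMatrix = List[List[int]]


def hamming_loss(y_true: LabelMatrix, y_pred: LabelMatrix) -> float:
    """Fraction of individual label decisions that are wrong."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def micro_prf(y_true: LabelMatrix, y_pred: LabelMatrix) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1: counts are pooled over all labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data (hypothetical): 3 documents, 4 labels.
Y_TRUE = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 1]]
Y_PRED = [[1, 0, 0, 0], [0, 1, 0, 1], [1, 1, 0, 1]]

if __name__ == "__main__":
    print("Hamming loss:", hamming_loss(Y_TRUE, Y_PRED))
    print("Micro P/R/F1:", micro_prf(Y_TRUE, Y_PRED))
```

Because most label decisions in a 24-category problem are true negatives, a per-label error measure such as Hamming loss can be low even when the micro F1-score is moderate, which is one reason several metrics are typically reported together.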
Keywords
Multi-label Text Classification, Deep Learning, QARiB, MARBERT, AraBERT, Arabic NLP, Arabic text, transformers