Multi-label classification of Arabic text using deep learning models

Date

2025-06-15

Publisher

Mohamed Boudiaf University of M'sila

Abstract

In multi-label text classification, multiple related labels are assigned to each relevant document for more refined categorization. This research aims to build a system that can effectively categorize Arabic texts into multiple themes using the multi-label dataset NADiA1, which contains 35,404 files across 24 categories. Deep learning approaches, namely QARiB, MARBERT, and AraBERT, were used, with data preprocessing conducted using the PyArabic package to maintain text quality. The models were assessed by accuracy, precision, recall, and Hamming loss. Among these, the transformer-based AraBERT performed best, with 95.76% accuracy and a micro F1-score of 0.81, followed by QARiB (95.48% accuracy and 0.80 micro F1-score) and MARBERT (94.99% accuracy and 0.77 micro F1-score). The study shows that transformer-based deep learning techniques are highly effective in multi-label Arabic text classification, with AraBERT showing the best ability to handle linguistic complexities.
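
To make the reported metrics concrete, the following is a minimal sketch of how Hamming loss, micro-averaged precision, recall and F1, and two common notions of accuracy can be computed for multi-label predictions. It is not the code used in the study: the function names and the toy label matrices are illustrative assumptions, and the abstract does not state whether the reported accuracy is per-label or exact-match, so both variants are shown.

```python
from typing import List, Tuple

# Rows are documents, columns are labels; 1 means the label applies.
Matrix = List[List[int]]


def hamming_loss(y_true: Matrix, y_pred: Matrix) -> float:
    """Fraction of individual (document, label) slots predicted wrongly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        1
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
        if t != p
    )
    return wrong / total


def subset_accuracy(y_true: Matrix, y_pred: Matrix) -> float:
    """Fraction of documents whose whole label set is predicted exactly."""
    exact = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return exact / len(y_true)


def micro_precision_recall_f1(y_true: Matrix, y_pred: Matrix) -> Tuple[float, float, float]:
    """Micro-averaging: pool true/false positives over all labels first."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return precision, recall, f1


# Toy example (hypothetical data): 4 documents, 3 labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 1]]

h_loss = hamming_loss(y_true, y_pred)
per_label_accuracy = 1.0 - h_loss  # accuracy counted over label slots
exact_match = subset_accuracy(y_true, y_pred)
precision, recall, f1 = micro_precision_recall_f1(y_true, y_pred)

print(h_loss, per_label_accuracy, exact_match, precision, recall, f1)
```

In this toy case the per-label accuracy is much higher than the exact-match accuracy, which illustrates why a high accuracy figure can coexist with a noticeably lower micro F1-score when most label slots are negative.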

Keywords

Multi-label Text Classification, Deep Learning, QARiB, MARBERT, AraBERT, Arabic NLP, Arabic text, transformers
