Call Number | T-1390 (Softcopy T-1099) MAK PI-190 TR-CSUI-063 Source Code-382 |
Collection Type | Thesis |
Title | Monolingual BERT Is Better Than Multilingual BERT for Natural Language Inference in Swahili |
Author | Hajra Faki Ali |
Publisher | Depok, Fasilkom UI, 2024 |
Subject | Natural Language Inference |
Location | FASILKOM-UI |
Call Number | Collection ID | Status |
---|---|---|
T-1390 (Softcopy T-1099) MAK PI-190 TR-CSUI-063 Source Code-382 | | Available |
Name: Hajra Faki Ali
Study Program: Master of Computer Science
Title: Monolingual BERT Is Better Than Multilingual BERT for Natural Language Inference in Swahili
Supervisor: Adila Alfa Krisnadhi, S.Kom, M.Sc., Ph.D

This research proposes a monolingual model for Natural Language Inference (NLI) in Swahili to overcome the limitations of current multilingual models. The study fine-tunes the pre-trained SwahBERT model to capture Swahili's semantic relationships and contextual nuances. A critical component of the research is the creation of the SwahiliNLI dataset, written directly in Swahili to reflect the intricacies of the language rather than relying on translated English text. The fine-tuned SwahBERT model is evaluated on both SwahiliNLI and the XNLI dataset and compared with the multilingual mBERT model. The results show that SwahBERT outperforms the multilingual model, achieving an accuracy of 78.78% on SwahiliNLI and 73.51% on XNLI. The monolingual model also achieves higher precision, recall, and F1 scores, particularly in recognizing linguistic patterns and predicting sentence pairings. This research underscores the importance of manually generated datasets and monolingual models for low-resource languages. It offers insights for developing more efficient and contextually relevant NLI systems, advancing natural language processing for Swahili and potentially benefiting other languages that face similar resource constraints.

Keywords: Monolingual, Multilingual, Natural Language Inference, Swahili, SwahBERT