Faculty of Computer Science Library (Perpustakaan Fakultas Ilmu Komputer)

Call Number	T-1431 (softcopy T-1140) MAK PI-231 TR-CSUI-103 Source Code-396
Collection Type	Thesis
Title	Ekspansi Data Menggunakan Forward-Backward Translation untuk Deteksi Ujaran Kebencian Multi-Label dalam Bahasa Indonesia
Author	Fairuz Astari Devianty
Publisher	Depok: Fasilkom UI, 2023
Subject	hate speech, multi-label classification
Location	FASILKOM-UI

Location: Faculty of Computer Science Library

Call Number	Collection ID	Status
T-1431 (softcopy T-1140) MAK PI-231 TR-CSUI-103 Source Code-396		AVAILABLE

MAK PI-231 Fairuz Astari Devianty_1906457123.pdf

Source Code-396 Fairuz Astari Devianty_1906457123.zip

TR-CSUI-103 Fairuz Astari Devianty_1906457123.pdf

T-1431 (softcopy T-1140) Fairuz Astari Devianty_1906457123.pdf

No reviews for this collection: 56241

ABSTRACT

Name : Fairuz Astari Devianty
Study Program : Master of Computer Science (Magister Ilmu Komputer)
Title : Data Expansion using Forward-Backward Translation for Multi-Label Hate Speech Detection in Bahasa Indonesia
Supervisors : Bayu Distiawan Trisedya, S.Kom., M.Kom., Ph.D.; Meganingrum Arista Jiwanggi, S.Kom., M.Kom., M.C.S.

The growth of social media platforms has made communication easier, but these platforms can also be misused: the spread of hate speech via social media, for example, is increasing. Freedom of speech is everyone's right in Indonesia, but malicious content must be removed because of its negative impact. One solution is to build a model that detects hate speech automatically. Building a good hate speech detection model requires a large amount of annotated training data, and the model must also account for the target, category, and level of hate speech. However, only one multi-label hate speech dataset in Bahasa Indonesia is currently available, and the proportion of data for each label is unequal. To overcome this data scarcity, we propose a forward-backward translation method that generates data automatically. The method combines two processes: forward translation is applied to datasets in high-resource languages, and backward translation is applied to datasets in low-resource languages. Combining the two yields a dataset that is both large and of good translation quality. We use this method to add data for multi-label hate speech detection in Bahasa Indonesia, drawing the additional data from English. In this study, multi-label hate speech detection performance on the new dataset improved over that on the existing Bahasa Indonesia hate speech dataset. The new dataset reaches an F1-score of 0.76 for multi-label classification and an F1-score of 0.78 for hierarchical classification.
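The data expansion idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes one possible reading of the method: forward translation carries labeled English examples into Bahasa Indonesia, and backward translation paraphrases existing Indonesian examples through a round trip via English. The `translate` callable, the label names, the helper names, and the toy lookup table are all hypothetical. In practice `translate` would wrap a real machine translation system, and the English labels would first need to be mapped onto the Indonesian label scheme, a step this sketch skips.

```python
from typing import Callable, FrozenSet, List, Tuple

# One example is (text, labels); the label names used below are hypothetical.
Example = Tuple[str, FrozenSet[str]]
# Hypothetical translator signature: (text, source_lang, target_lang) -> text
Translator = Callable[[str, str, str], str]


def forward_translate(english_data: List[Example], translate: Translator) -> List[Example]:
    """Translate labeled high-resource (English) examples into Indonesian."""
    return [(translate(text, "en", "id"), labels) for text, labels in english_data]


def backward_translate(indonesian_data: List[Example], translate: Translator) -> List[Example]:
    """Paraphrase low-resource (Indonesian) examples via a round trip through English.

    Assumption: "backward translation" is read here as id -> en -> id.
    """
    paraphrases = []
    for text, labels in indonesian_data:
        pivot = translate(text, "id", "en")
        round_trip = translate(pivot, "en", "id")
        if round_trip != text:  # keep only paraphrases that differ from the source
            paraphrases.append((round_trip, labels))
    return paraphrases


def expand_dataset(indonesian_data: List[Example],
                   english_data: List[Example],
                   translate: Translator) -> List[Example]:
    """Original data, then forward-translated data, then round-trip paraphrases."""
    return (list(indonesian_data)
            + forward_translate(english_data, translate)
            + backward_translate(indonesian_data, translate))


def toy_translate(text: str, src: str, tgt: str) -> str:
    """Toy lookup table standing in for a real machine translation system."""
    table = {
        ("en", "id"): {"you are stupid": "kamu bodoh",
                       "you are dumb": "kamu bodoh sekali"},
        ("id", "en"): {"kamu tolol": "you are dumb"},
    }
    return table.get((src, tgt), {}).get(text, text)


indonesian = [("kamu tolol", frozenset({"HS", "HS_Individual"}))]
english = [("you are stupid", frozenset({"HS"}))]
expanded = expand_dataset(indonesian, english, toy_translate)
for text, labels in expanded:
    print(text, sorted(labels))
```

Passing the translator in as a parameter keeps the sketch independent of any particular translation service, and each generated example inherits the labels of the example it was derived from.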