General Information

Code

25.04.7134

Classification

006.35 - Natural Language Processing, Computer Science

Type

Scientific Work - Undergraduate Thesis (S1) - Reference

Subject

Natural Language Processing (NLP)

Views

50 times

Other Information

Abstract

Online hate speech poses a significant threat to social harmony in Indonesia, necessitating effective automated detection systems. This study addresses data imbalance, a common issue in hate speech datasets, by developing a Bidirectional Long Short-Term Memory (BiLSTM) model with FastText word embeddings. We systematically compare three oversampling techniques (Random Oversampling, SMOTE, and ADASYN) across varying degrees of imbalance in the Indonesian Hate Speech Superset dataset (14,306 comments), a comparison that the existing literature lacks. Evaluated using Stratified K-fold Cross-Validation with Accuracy, Precision, Recall, and F1-score, the models generally perform better with oversampling, particularly on the minority class. The optimal oversampling strategy depends on imbalance severity: SMOTE achieved the best overall balance on the original dataset, with a Recall of 78.9% and an F1-score of 75.3%, while Random Oversampling was superior in severely imbalanced scenarios, reaching F1-scores of 60.6% (30% minority) and 38.6% (10% minority). These findings offer practical insights for building more adaptive hate speech classification systems for Indonesian text with imbalanced class distributions.
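As a rough illustration of the simplest technique named in the abstract, the sketch below shows random oversampling and a minority-class Precision/Recall/F1 calculation using only the Python standard library. It is not the thesis implementation: the function names `random_oversample` and `minority_prf`, the toy comments, and the label encoding (1 = hate speech, 0 = other) are all assumptions made for this example. In a cross-validation setup, oversampling would typically be applied to the training folds only, so that duplicated samples do not leak into the evaluation fold.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until every
    class has as many samples as the largest one (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_texts, out_labels = list(texts), list(labels)
    for cls, n in counts.items():
        # Pool of original samples belonging to this class.
        pool = [t for t, y in zip(texts, labels) if y == cls]
        for _ in range(target - n):
            out_texts.append(rng.choice(pool))
            out_labels.append(cls)
    return out_texts, out_labels


def minority_prf(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data (assumed): 9 non-hate comments and 1 hate comment, i.e. 10% minority.
texts = [f"comment {i}" for i in range(10)]
labels = [0] * 9 + [1]

bal_texts, bal_labels = random_oversample(texts, labels)
print(Counter(bal_labels))  # class counts after oversampling
```

SMOTE and ADASYN differ from this sketch in that they synthesize new minority samples by interpolating between feature vectors rather than duplicating existing ones, so they operate on numeric representations such as embeddings instead of raw text.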

  • CAK4FAA4 - Final Project

Collection & Circulation

1 of 1 copies available

Author

Name AKMAL MUHAMAD FAZA
Type Individual
Editor Yuliant Sibaroni, Sri Suryani Prasetyowati
Translator

Publisher

Name Universitas Telkom, S1 Informatics
City Bandung
Year 2025

Circulation

Rental fee IDR 0,00
Daily fine IDR 0,00
Type Non-Circulating