Gambling Comments Detection on Youtube: A Comparative Study of Tree-Based Boosting, LSTM and GRU Models

Authors

  • Agung Widiyanto, Telkom University Purwokerto
  • Mayesq Prameswari, Telkom University Purwokerto
  • Muhammad Abdul Latief, Telkom University Purwokerto

DOI:

https://doi.org/10.12962/j24068535.v23i2.a1305

Abstract

The exponential growth of online gambling in Indonesia poses significant socio-economic challenges, particularly for vulnerable populations targeted by sophisticated digital marketing on social media platforms. This study addresses the need for automated systems that detect gambling-related content in YouTube comments. We scraped and manually labeled 11,673 comments from diverse YouTube videos, producing an extremely imbalanced dataset in which gambling comments make up only 10% of the data. We developed and evaluated several machine learning approaches, comparing gradient boosting methods (LightGBM, XGBoost, CatBoost) trained on TF-IDF features against deep learning models (LSTM and GRU) with Word2Vec embeddings. The experimental results show that the gradient boosting methods generalize substantially better than the deep learning models. LightGBM achieved the highest holdout F1-score with balanced precision (0.8912) and recall (0.8886), and XGBoost followed closely with comparable performance. In contrast, the deep learning models overfit severely: GRU and LSTM performed excellently on the test set, but their holdout recall dropped to 0.5022 and 0.4844, respectively. These findings suggest that the dataset was too small for the deep learning models to learn generalizable representations. For practical deployment in YouTube gambling content detection, gradient boosting methods are recommended because they perform better on limited, imbalanced datasets.
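
The abstract quotes LightGBM's holdout precision and recall but not the F1 value itself. As a minimal sketch, assuming the F1-score is the usual harmonic mean of that one precision/recall pair (the abstract does not state the averaging scheme, so the derived figure is illustrative rather than the paper's reported number), it can be reproduced as follows. The function and variable names are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Holdout precision and recall for LightGBM, as quoted in the abstract.
# Assumption: both values refer to the same class and averaging scheme.
lightgbm_f1 = f1_score(0.8912, 0.8886)
print(f"LightGBM holdout F1 (derived): {lightgbm_f1:.4f}")
```

Because precision and recall are nearly equal here, the harmonic mean sits almost exactly between them. The same calculation cannot be done for GRU and LSTM from the abstract alone, since only their holdout recall is quoted.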

Published

2025-07-08

Section

Articles

How to Cite

[1]
A. Widiyanto, M. Prameswari, and M. Abdul Latief, “Gambling Comments Detection on Youtube: A Comparative Study of Tree-Based Boosting, LSTM and GRU Models”, JUTI, vol. 23, no. 2, pp. 144–160, Jul. 2025, doi: 10.12962/j24068535.v23i2.a1305.