Gambling Comments Detection on Youtube: A Comparative Study of Tree-Based Boosting, LSTM and GRU Models
DOI: https://doi.org/10.12962/j24068535.v23i2.a1305
Abstract
The exponential growth of online gambling in Indonesia poses significant socio-economic challenges, particularly affecting vulnerable populations through sophisticated digital marketing strategies targeting social media platforms. This study addresses the critical need for automated detection systems to identify gambling-related content in YouTube comments. We scraped and manually labeled 11,673 comments from diverse YouTube videos, creating an extremely imbalanced dataset with gambling comments representing only 10% of the total data. Multiple machine learning approaches were developed and evaluated, comparing traditional gradient boosting methods (LightGBM, XGBoost, CatBoost) using TF-IDF features against deep learning models (LSTM & GRU) with Word2Vec embeddings. The experimental results demonstrate that gradient boosting methods significantly outperform deep learning approaches in generalization capability. LightGBM achieved the highest holdout F1-score with balanced precision (0.8912) and recall (0.8886), while XGBoost followed closely with comparable performance. In contrast, deep learning models exhibited severe overfitting, with GRU and LSTM showing excellent test performance but drastically reduced holdout recall (0.5022 and 0.4844, respectively). The findings indicate that the dataset size was insufficient for deep learning approaches to learn generalizable representations effectively. For practical deployment in YouTube gambling content detection, gradient boosting methods are recommended due to their superior performance with limited, imbalanced datasets.
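The abstract reports holdout precision and recall for LightGBM but does not state the F1-score itself. As a minimal sketch, assuming the standard definition of F1 as the harmonic mean of precision and recall, the implied value can be recovered from the two reported figures. The function name `f1_score` below is illustrative and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Holdout precision and recall reported for LightGBM in the abstract.
lightgbm_f1 = f1_score(0.8912, 0.8886)
print(f"{lightgbm_f1:.4f}")  # prints 0.8899
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a model with a holdout recall near 0.5, as reported for GRU and LSTM, cannot reach a high F1 however precise it is. The abstract gives no holdout precision for those models, so their F1 is not computed here.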
License
Copyright (c) 2025 Agung Widiyanto, Mayesq Prameswari, Muhammad Abdul Latief

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
All papers should be submitted electronically. All submitted manuscripts must be original work that is not under submission at another journal or under consideration for publication in another form, such as a monograph or chapter of a book. Authors of submitted papers are obligated not to submit their paper for publication elsewhere until an editorial decision is rendered on their submission. Further, authors of accepted papers are prohibited from publishing the results in other publications that appear before the paper is published in JUTI unless they receive approval for doing so from the Editor-in-Chief.
JUTI open access articles are distributed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license lets others remix, transform, and build upon the material, provided that they give appropriate credit, provide a link to the license, indicate if changes were made, and distribute their contributions under the same license as the original.