Explainable BERT Embeddings for Veracity Assessment in Criminal Investigations

Thoha Haq; Chastine Fatichah; Anny Yuniarti

doi:10.12962/j24068535.v24i1.a1327

Authors

Thoha Ikhwanul Haq Institut Teknologi Sepuluh Nopember
Chastine Fatichah Institut Teknologi Sepuluh Nopember
Anny Yuniarti Institut Teknologi Sepuluh Nopember

DOI: https://doi.org/10.12962/j24068535.v24i1.a1327

Abstract

Binary classification of statements as truth or lies is often a hindrance in criminal investigations, because statements are intentionally neither entirely true nor entirely false. This ambiguity in the veracity of claims calls for richer methods such as explainable models. Explainable methods, particularly SHapley Additive exPlanations (SHAP), can help dissect statements and narrow down information for a more thorough investigation. Data from the Miami University Deception Database, comprising various statements and their veracity, was analyzed for its linguistic features. This research uses Bidirectional Encoder Representations from Transformers (BERT) embeddings to provide contextual understanding of statements and sentiment lexicons to provide domain-specific knowledge. Results show that the 2-Gram embedding performed best, with an R² (coefficient of determination) of 0.39: it captures more context than the 1-Gram embedding while remaining more general than the 3-Gram and 4-Gram embeddings. Each variant of the BERT embedding was found to be much more effective than general-purpose word embeddings such as GloVe, Word2Vec, and FastText. SHAP values captured key points of interest in a statement by narrowing down pivotal, decision-making points. These results highlight potential indicators of either deceptive or truthful language, such as the words ‘something’ and ‘our’. These points of interest can help humans focus on key points of investigation and intervention.
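
The abstract names three ingredients without showing how they are computed: n-gram units, the R² score, and SHAP attributions. The following is a minimal sketch of each in plain Python, not the authors' pipeline. It assumes whitespace-style tokens instead of BERT embeddings, and it replaces the trained regressor with a hypothetical additive scorer (`toy_score`) whose per-word weights (`TOY_WEIGHTS`) are invented for illustration; their signs are arbitrary and do not reflect the paper's findings. Shapley values are computed exactly by enumerating coalitions, which the SHAP library approximates for real models.

```python
from itertools import combinations
from math import factorial


def ngrams(tokens, n):
    """Return all contiguous n-token windows (the 1-Gram to 4-Gram units)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def shapley_values(tokens, score):
    """Exact Shapley value of each token for score(list_of_tokens).

    Enumerates every coalition of the other tokens, so it is only
    feasible for short inputs.
    """
    n = len(tokens)
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without = [tokens[j] for j in coalition]
                with_i = [tokens[j] for j in sorted(coalition + (i,))]
                phi += weight * (score(with_i) - score(without))
        values.append(phi)
    return values


# ASSUMPTION: invented per-word weights standing in for a trained model.
TOY_WEIGHTS = {"something": -0.4, "our": 0.3}


def toy_score(words):
    """Hypothetical veracity score: sum of per-word weights (unknown = 0)."""
    return sum(TOY_WEIGHTS.get(w, 0.0) for w in words)


statement = ["we", "saw", "something", "near", "our", "car"]
attributions = dict(zip(statement, shapley_values(statement, toy_score)))
bigrams = ngrams(statement, 2)
```

Because the toy scorer is additive, each word's Shapley value equals its assumed weight (up to floating-point rounding), and the attributions sum to the score of the full statement. A real BERT-based regressor is not additive, which is where the attributions become informative.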

Published

2026-01-15

Issue

Vol. 24, No. 1, January 2026

Section

Articles

License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

JUTI open access articles are distributed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license requires users to give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.

How to Cite

[1]

T. Haq, C. Fatichah, and A. Yuniarti, “Explainable BERT Embeddings for Veracity Assessment in Criminal Investigations”, JUTI, vol. 24, no. 1, pp. 35–45, Jan. 2026, doi: 10.12962/j24068535.v24i1.a1327.
