Enhancing Face Detection Performance in 360-Degree Video Using YOLOv8 with Equirectangular Augmentation Techniques
DOI: https://doi.org/10.12962/j24068535.v23i1.a1255
Abstract
This study aims to improve face detection in 360-degree videos by combining the YOLOv8 algorithm, which is suited to real-time object detection, with image augmentation tailored to the medium. To address the challenges posed by Equirectangular Projection (ERP), it introduces an equirectangular augmentation method designed specifically for this format. The method yields a 1.346% improvement in detection accuracy in ERP settings compared with the default YOLOv8 augmentation strategies. This gain addresses the geometric distortions inherent in panoramic video formats and underlines the need for tailored augmentation when detecting faces in complex environments. The work contributes to deep learning applications for immersive video technologies, with implications for sectors such as security, virtual reality, and interactive media.
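The abstract does not describe the augmentation itself, so the following is only a minimal sketch of what an ERP-aware augmentation can look like, not the method used in the study. It assumes NumPy, an image stored as a height x width x channels array, and YOLO-style normalized boxes (x_center, y_center, width, height). The helper name `yaw_shift_erp` and its parameters are hypothetical. The idea is that an equirectangular frame is periodic in longitude, so a circular horizontal shift corresponds to rotating the camera about its vertical axis without adding new distortion.

```python
import numpy as np


def yaw_shift_erp(image, boxes, shift_frac):
    """Rotate an equirectangular (ERP) frame about the vertical axis.

    image:      H x W x C array in equirectangular projection.
    boxes:      N x 4 array of normalized boxes (x_center, y_center, w, h).
    shift_frac: fraction of the full 360-degree width to rotate by.
    Hypothetical helper for illustration; not taken from the paper.
    """
    width = image.shape[1]
    shift_px = int(round(shift_frac * width)) % width
    # Circular shift along the longitude axis; pixels leaving the right
    # edge re-enter on the left, which matches a yaw rotation in ERP.
    shifted = np.roll(image, shift_px, axis=1)

    out = []
    for xc, yc, w, h in np.asarray(boxes, dtype=float).reshape(-1, 4):
        xc = (xc + shift_px / width) % 1.0
        left, right = xc - w / 2, xc + w / 2
        if left < 0.0:
            # Box crosses the seam on the left: split it into two boxes.
            out.append((right / 2, yc, right, h))
            out.append((1.0 + left / 2, yc, -left, h))
        elif right > 1.0:
            # Box crosses the seam on the right: split it into two boxes.
            out.append(((left + 1.0) / 2, yc, 1.0 - left, h))
            out.append(((right - 1.0) / 2, yc, right - 1.0, h))
        else:
            out.append((xc, yc, w, h))
    return shifted, np.array(out, dtype=float).reshape(-1, 4)
```

Splitting a box that straddles the left and right image borders is one possible design choice; an alternative would be to discard such boxes or to resample the shift until no face crosses the seam.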
License
JUTI open access articles are distributed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license allows readers to share and adapt the material, provided they give appropriate credit, provide a link to the license, and indicate if changes were made. If they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.