CLASSIFICATION OF LUNG AND COLON CANCER TISSUES USING HYBRID CONVOLUTIONAL NEURAL NETWORKS

Chilyatun Nisa', Nanik Suciati, Anny Yuniarti

Abstract


Colon and lung cancers are two highly lethal cancers that can occur together, which complicates accurate diagnosis. Whereas most research concentrates on detecting a single cancer in a specific organ, this study proposes a machine learning approach that identifies both colon and lung cancers. The objective is to build a hybrid classification model that improves diagnostic precision. The study uses the LC25000 dataset, which comprises 25,000 color histopathological images of lung and colon tissue labeled for the presence or absence of cancer (adenocarcinoma). Image features are extracted with a pre-trained VGG-16 model, and the cancer type is then identified by one of three machine learning classifiers: Stochastic Gradient Descent (SGD), Random Forest (RF), and K-Nearest Neighbor (KNN). The models were evaluated with 10-fold cross-validation. CNN-SGD performed best on the evaluation metrics: on a scale of 0 to 100, it scored 99.8 for Area Under Curve (AUC) and 98.88 for Classification Accuracy (CA). CNN-RF followed closely and trained 58.3 seconds faster than CNN-SGD. CNN-KNN ranked last among the evaluated models on F1, recall, AUC, and CA.
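The evaluation stage described in the abstract (three classifiers compared under 10-fold cross-validation on extracted features) can be sketched roughly as follows. This is a minimal illustration, not the authors' code. It assumes scikit-learn is available, uses synthetic vectors in place of real VGG-16 features, and picks hyperparameters (the `modified_huber` loss, 100 trees, 5 neighbors, feature scaling) that the abstract does not specify. The helper name `evaluate_models` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic 64-dimensional vectors stand in for VGG-16 features.
# In a real pipeline, X would hold one feature vector per LC25000 image and
# y the tissue class of that image (five classes are assumed here).
X, y = make_classification(
    n_samples=500, n_features=64, n_informative=16,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)

# ASSUMPTION: all hyperparameters below are illustrative choices.
MODELS = {
    # modified_huber is used because it exposes predict_proba, needed for AUC.
    "SGD": make_pipeline(
        StandardScaler(),
        SGDClassifier(loss="modified_huber", random_state=0),
    ),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Metrics named in the abstract: AUC, classification accuracy, F1, recall.
SCORING = {
    "auc": "roc_auc_ovr",
    "accuracy": "accuracy",
    "f1": "f1_macro",
    "recall": "recall_macro",
}


def evaluate_models(features, labels, n_splits=10):
    """Return mean cross-validated metrics and mean fit time per model."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    summary = {}
    for name, model in MODELS.items():
        scores = cross_validate(model, features, labels, cv=cv, scoring=SCORING)
        summary[name] = {m: scores[f"test_{m}"].mean() for m in SCORING}
        summary[name]["fit_time"] = scores["fit_time"].mean()
    return summary


results = evaluate_models(X, y)
for name, metrics in results.items():
    print(name, {k: round(float(v), 4) for k, v in metrics.items()})
```

The scores printed here reflect only the synthetic data and are unrelated to the figures reported in the abstract. The `fit_time` entry gives a rough per-fold training time, which is the kind of quantity behind a training-time comparison between models.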







DOI: http://dx.doi.org/10.12962/j24068535.v22i1.a1225
