Comparison of Pneumonia Detection Using Convolutional Neural Network (CNN) Method on Balanced and Imbalanced Datasets
Abstract
Pneumonia is a serious respiratory disease that requires early detection to improve clinical outcomes and reduce mortality. Radiologists usually detect pneumonia from X-ray images, but this approach is prone to human error and diagnostic delays. There is therefore an urgent need for automated systems that can assist in the rapid and accurate diagnosis of pneumonia. This study aims to develop a Convolutional Neural Network (CNN) model to detect pneumonia in X-ray images. One of the main challenges in developing this model is dataset imbalance, where the number of X-ray images labeled "pneumonia" may be significantly smaller than the number labeled "normal." Such imbalance can bias the model toward the majority class and reduce detection accuracy. The CNN architecture was first tested on an imbalanced dataset, where a CNN with 2 convolutional layers achieved an accuracy of 93.10% and a CNN with 3 convolutional layers achieved 96.03%. The architecture was then tested on a dataset balanced using traditional augmentation methods, where the 2-layer CNN achieved an accuracy of 93.84% and the 3-layer CNN achieved 96.46%. For both depths, the CNN using the balanced dataset achieved slightly higher accuracy than the CNN using the imbalanced dataset, with gains of 0.74 and 0.43 percentage points, respectively.
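The balancing step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which traditional augmentation methods were used, so the two transforms here (a horizontal flip and a one-pixel shift) are assumptions chosen only for illustration, as are the function names and the toy data. Images are represented as plain nested lists so the example has no external dependencies.

```python
import random
from collections import Counter


def horizontal_flip(image):
    """Mirror a 2-D image (a list of rows) left to right."""
    return [row[::-1] for row in image]


def shift_right(image, pixels=1):
    """Shift each row right by `pixels`, padding with zeros on the left."""
    return [[0] * pixels + row[:len(row) - pixels] for row in image]


# Hypothetical choice of transforms; the paper's actual set is not stated.
AUGMENTATIONS = [horizontal_flip, shift_right]


def balance_by_augmentation(images, labels, seed=0):
    """Add augmented copies of under-represented classes until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_images, out_labels = list(images), list(labels)
    for label, count in counts.items():
        # Original samples of this class, used as augmentation sources.
        pool = [img for img, lab in zip(images, labels) if lab == label]
        for _ in range(target - count):
            transform = rng.choice(AUGMENTATIONS)
            out_images.append(transform(rng.choice(pool)))
            out_labels.append(label)
    return out_images, out_labels


# Toy data: three "normal" images and one "pneumonia" image (1 x 3 pixels).
toy_images = [[[1, 2, 3]], [[4, 5, 6]], [[7, 8, 9]], [[1, 0, 1]]]
toy_labels = ["normal", "normal", "normal", "pneumonia"]
balanced_images, balanced_labels = balance_by_augmentation(toy_images, toy_labels)
print(Counter(balanced_labels))
```

After balancing, both classes contain the same number of samples. The original images are kept unchanged and only the minority class receives augmented copies. In practice the augmentation would normally be applied to the training split only, so that the test images stay untouched.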