Pneumonia Detection Using Partial Transfer Learning on Depthwise Convolutions

The risk of having pneumonia is a quite high for numerous people, especially for the ones in developing nations where billions face shortage in energy and rely on toxic forms of energy. The World Health Organization estimates that over 4 million premature deaths occur annually from household air pollution-related diseases including pneumonia. Over 150 million people get infected with pneumonia on an annual basis especially children under 5 years old. In such regions, the problem can be further aggravated due to the dearth of medical resources and personnel. For example, in 57 nations of Africa, there is a shortage of 2.3 million doctors and nurses. For these populations, accurate and fast diagnosis mean everything.

The Dataset

The dataset can be found from Mendely Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images for Classification. It is organized into 3 folders (train, test, val) and contains subfolders for each image category (Pneumonia/Normal). There are 5,863 X-Ray images (JPEG) and 2 categories (Pneumonia/Normal).

Chest X-ray images (anterior-posterior) were selected from retrospective cohorts of pediatric patients of one to five years old from Guangzhou Women and Children’s Medical Center, Guangzhou. All chest X-ray imaging was performed as part of patients’ routine clinical care. For the analysis of chest x-ray images, all chest radiographs were initially screened for quality control by removing all low quality or unreadable scans.

The diagnoses for the images were then graded by two expert physicians before being cleared for training the AI system. In order to account for any grading errors, the evaluation set was also checked by a third expert.

Normal Lung Bacterial Pneumonia Lung Viral Pneumonia Lung

The normal chest X-ray (left panel) depicts clear lungs without any areas of abnormal opacification in the image. Bacterial pneumonia (middle) typically exhibits a focal lobar consolidation, whereas viral pneumonia (right) manifests with a more diffuse interstitial pattern in both lungs.

The Problem

The aim of this project is to develop a robust deep learning model from scratch on this limited amount of data. We all know that deep learning models are data hungry but if we know how things work, we can build good models even with a limited amount of data.

Exploratory Data Analysis

Ben Graham's Method

Canny Edge Detection

Background Substraction

The Model

Since we have less data, we should learn, but wisely. We will be doing partial transfer learning and rest of the model will be trained from scratch. This is the approach we will follow when it comes to building deep learning models from scratch on limited data.