A novel data-mining model for automated prediction of low birth weight

dc.contributor.supervisor Selvaraj, Rajalakshmi
dc.contributor.supervisor Galani, Malatsi
dc.contributor.author Hange, Uzapi
dc.date.accessioned 2020-12-02T08:00:03Z
dc.date.available 2020-12-02T08:00:03Z
dc.date.issued 2017-04-03
dc.identifier.citation Hange, U. (2017) A novel data-mining model for automated prediction of low birth weight, Masters Thesis, Botswana International University of Science and Technology: Palapye en_US
dc.description.abstract Birth weight is one of the major factors that determine overall future health outcomes. Predicting birth weight can enable medical practitioners to make early obstetric interventions, thus minimising complications associated with low birth weight. Data-mining models are receiving a great deal of attention for making predictions based on a vast amount of low birth data. Low birth weight research has especially focused on identifying the risk factors of low birth weight. However, the prediction of actual birth weight values based on the identified low birth weight risk factors, which can play a significant role in the identification of mothers at risk of delivering low birth weight infants, remains unsolved. Since the performance of data-mining techniques depends on the underlying problem, it is vital to analyse their relative performance for any given task. Therefore, the goal of this thesis was to develop a data-mining model that predicts the actual birth weight with a relatively high Area Under the Receiver Operating Characteristic curve (AUC). The prediction was based on low birth weight risk factors and 2006 birth data from the North Carolina State Centre for Health Statistics. In order to extract interesting patterns from the data, the knowledge discovery in databases process model was used. The steps followed were data selection, data pre-processing, model building, and model evaluation/interpretation. Decision trees were used for classifying birth weight and were tested on the actual imbalanced dataset and on the dataset balanced using the Synthetic Minority Oversampling Technique (SMOTE), both with all the features and with a reduced feature set selected by a correlation-based feature selection algorithm. The results highlighted that models built with datasets balanced using the SMOTE algorithm produce a relatively higher AUC compared to models built with imbalanced datasets.
It was also discovered that building models with reduced features through a correlation-based feature selection algorithm gives a comparatively higher AUC than models built with all features. The J48 decision tree built with reduced features outperformed REPTree and Random tree with an AUC of 90.3%, and thus it was selected as the best model. After applying the selected J48 model to new unseen data and comparing the results with those on the testing set, we concluded that using J48 for birth weight prediction could help reduce obstetric-related complications and thus improve overall obstetric healthcare. en_US
dc.description.sponsorship Botswana International University of Science and Technology en_US
dc.language.iso en en_US
dc.publisher Botswana International University of Science and Technology (BIUST) en_US
dc.subject Birth weight en_US
dc.subject Low birth weight en_US
dc.subject Data-mining en_US
dc.subject SMOTE en_US
dc.subject Imbalanced dataset en_US
dc.title A novel data-mining model for automated prediction of low birth weight en_US
dc.description.level msc en_US
dc.description.accessibility unrestricted en_US
dc.description.department cis en_US

This item appears in the following Collection(s)

  • Faculty of Sciences
    This collection is made up of electronic theses and dissertations produced by postgraduate students from the Faculty of Sciences
