Cloud Based Network Analysis Model for Predicting Disease-Diet Associations

dc.contributor.author: Toor, Rashmeet
dc.contributor.supervisor: Chana, Inderveer
dc.date.accessioned: 2023-07-25T11:04:55Z
dc.date.available: 2023-07-25T11:04:55Z
dc.date.issued: 2023-07-25
dc.description.abstract: Predictive analytics in healthcare integrates computational technologies with the healthcare domain for the retrieval, storage and analysis of medical data. With the immense progress in computational techniques and technologies, the healthcare domain has witnessed unparalleled achievements over the last decade. Comprehending the relationship between health and diet is one such area that presents numerous opportunities for predictive healthcare. Disease-diet associations pose an arduous computational problem because of their complex interdependencies. The intertwined relations among diseases, diets and their subtypes, along with the varying nature of their associations (harmful or helpful), add to the complexity. Thus, the associations need to be explored through a close integration of significant computational technologies. Predictive analysis of such associations would help healthcare professionals foresee the risk of occurrence or progression of a disease on the basis of diet and thus make informed decisions. This work aims to efficiently and effectively predict unknown disease-diet associations using integrated computational technologies. To achieve this, a review of existing work on the relation between disease and diet was first undertaken. The review shows that several studies explore these associations, but each is designed for a specific disease and diet combination. It also shows that while some disease-diet associations are long established, others have been acknowledged only in the literature. This presents an opportunity to bring such studies together and exploit them on a large scale. Further, a survey of existing services and techniques designed for understanding the relation between disease and other factors, such as drugs and symptoms, has been carried out.
It highlights the extensive use of an emerging technology, Network Analysis, for representing and analyzing complex relationships. Thus, a further investigation of Network Analysis and its role in predictive and healthcare applications has been conducted. The challenges posed by exploiting Network Analysis for healthcare, along with measures that might be adopted to overcome them, have also been discussed. Several promising applications of Network Analysis in the healthcare domain have been proposed. As an outcome of the survey, Network Analysis is deemed a significant technology for exploring disease-diet associations. Considering the complexity of the computations involved in this task, there is also a need for a platform that supports effective analysis and does not compromise performance under higher load. Another promising technology, Cloud Computing, is found to be suitable for this work given its extensive application in the healthcare domain, which has also been reviewed. As a consequence, the amalgamation of Network Analysis and Cloud Computing is established as a good fit for exploring disease-diet associations. Data on disease-diet associations is not readily available, so it becomes necessary to extract it from the literature. It is also recognized that visualizing known disease-diet associations as a graph and quantifying them offers opportunities for advanced learning. Thus, a Network Model, DID-NEM, has been proposed and developed for extracting, visualizing and modelling disease-diet associations. Firstly, a custom-made automatic technique, DIDACE, is used to extract and quantify the associations through literature mining. This eliminates the drawbacks of manual curation and assists in fast and efficient extraction. 2,74,131 records containing 1917 different diseases and 143 diet terms have been extracted using this technique.
Further, the nature of a subset of associations is predicted by performing sentiment analysis using a multilayer perceptron (MLP), with an accuracy of 86%. The associations are then transformed into a graph so that they are readily available for analysis. DID-NEM is novel and can be used by domain researchers to extract associations between entities other than disease and diet. It also contributes a novel disease-diet associations database to the research community for further study. A prediction approach, PredNEM, has been proposed for manipulating and analyzing the curated database. PredNEM first quantifies and integrates different networks, such as disease-diet, disease-disease and diet-diet, by drawing on different resources including the curated database, pattern mining and semantic similarity. Further, two learning methods, TBM and TFM, have been engineered from a combination of network algorithms/parameters and machine learning for the prediction of unknown associations. The first method starts by finding communities in the network and then ranks nodes to find the most similar ones, while the second method crafts network algorithms/parameters as features, compares different machine learning algorithms and selects the best performing one for prediction. PredNEM and its two learning methods have been validated through two case studies, on Covid-19 and Inflammatory Bowel Disease (IBD) respectively. Of the top 20 diets predicted for Covid-19, some interesting associations have been validated through the literature, including kefir, carrot and strawberry. In the second case study, the nature of 16 out of 21 associations for Crohn's disease has been correctly predicted, as judged against dietician input and medical literature, using a Naive Bayes classifier with an ROC AUC value of 82.7%. The predictions enhance the traditional know-how of domain experts and help them stay updated. They are also beneficial to researchers for further study.
A cloud platform has been introduced to provision the network-based analysis as an efficient and adaptable service to the concerned stakeholders. EC2 and Neo4j instances have been created for deploying the IBD case study over the cloud and for connecting to the graph database, respectively. Prediction results are provided through an accessible cloud service named CloudMenu. Optimum values of CPU utilization and throughput suggest good performance and better resource utilization. This work contributes significantly by developing a Network Model, DID-NEM, which involves automatic curation of disease-diet associations through DIDACE and their visualization as a network. PredNEM further uses network algorithms/parameters and machine learning for advanced analysis. This network-based analysis is deployed over the cloud, making the service CloudMenu easy to use, flexible and economical. [en_US]
dc.identifier.uri: http://hdl.handle.net/10266/6521
dc.language.iso: en [en_US]
dc.subject: Predictive analytics [en_US]
dc.subject: Cloud computing [en_US]
dc.subject: Diet disease association [en_US]
dc.title: Cloud Based Network Analysis Model for Predicting Disease-Diet Associations [en_US]
dc.type: Thesis [en_US]
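
The abstract describes predicting unknown disease-diet associations from a network of known ones by finding and ranking similar nodes. The following is a minimal illustrative sketch of that general idea only, not the thesis's TBM or TFM methods, whose details the abstract does not give. The toy data, the function names and the choice of Jaccard similarity are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy associations (disease, diet); not from the thesis database.
KNOWN = [
    ("disease_a", "diet_1"), ("disease_a", "diet_2"),
    ("disease_b", "diet_1"), ("disease_b", "diet_2"), ("disease_b", "diet_3"),
    ("disease_c", "diet_4"),
]


def build_network(pairs):
    """Map each disease to the set of diets it is known to be linked with."""
    diets_of = defaultdict(set)
    for disease, diet in pairs:
        diets_of[disease].add(diet)
    return dict(diets_of)


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidate_diets(diets_of, target):
    """Score diets not yet linked to `target` by the similarity of the
    diseases that are linked to them, then sort from best to worst."""
    known = diets_of[target]
    scores = defaultdict(float)
    for other, diets in diets_of.items():
        if other == target:
            continue
        sim = jaccard(known, diets)
        if sim == 0.0:
            continue  # dissimilar diseases contribute nothing
        for diet in diets - known:
            scores[diet] += sim
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    network = build_network(KNOWN)
    print(rank_candidate_diets(network, "disease_a"))
```

In this toy network, `disease_a` shares two diets with `disease_b`, so the one diet linked only to `disease_b` becomes the sole candidate for `disease_a`. A real pipeline would also need edge weights and the harmful/helpful nature of each association, which this sketch ignores.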

Files

Original bundle

Name: thesis (2).pdf
Size: 12.25 MB
Format: Adobe Portable Document Format
Description: Thesis

License bundle

Name: license.txt
Size: 2.03 KB
Format: Item-specific license agreed upon to submission
Description: