Please use this identifier to cite or link to this item:
http://hdl.handle.net/10266/5558
Title: | Machine Learning based Framework for Drug Prediction of Cancerous Genomic Profiles |
Authors: | Sharma, Aman |
Supervisor: | Rani, Rinkle |
Keywords: | Drug Target Interaction Prediction; Ensemble Learning; Drug Response Prediction; Dimensionality Reduction; Active Learning; Bagging; Gene Expression; Cancer Classification; Drug Synergy Prediction |
Issue Date: | 2-Aug-2019 |
Abstract: | Advances in bioinformatics have raised patients’ life expectancy and improved the treatment of various severe diseases. Cancer is a genetic disease caused by mutations and variations in the genes of the patient’s cells. The complexity of the tumor microenvironment makes cancer a difficult disease to treat. Patients with the same type of cancer show heterogeneous responses to the same targeted therapies. Clinical trials and the traditional drug discovery process are time-consuming and tedious. Hence, researchers are working hard to design optimal treatment options for such diseases. The availability of large online oncological and pharmacogenomic data sources has boosted research in this field. Recently, data mining and machine learning approaches have become powerful aids in such data-driven analysis. In this thesis, we address diverse areas of personalized cancer therapy using predictive modeling. We have worked on several areas of precision medicine: drug response prediction, drug synergy prediction, drug-target interaction prediction and cancer classification using machine learning approaches. The main objective of this research is to design prediction models for drug sensitivity prediction, drug combination therapy, drug-target interaction prediction and cancer classification using machine learning. A cancer classification framework, C-HMOSHSSA, is proposed using multi-objective meta-heuristic and machine learning approaches to predict relevant and new cancer biomarkers. A hybrid feature selection algorithm (HMOSHSSA) is proposed for gene selection using the multi-objective spotted hyena optimizer (MOSHO) and the salp swarm algorithm (SSA). Further, four different classifiers are trained on the dataset obtained after applying the proposed hybrid gene selection algorithm (HMOSHSSA). 
New sets of informative genes are identified by the proposed technique. Next, we propose an integrated framework for identifying effective and synergistic anti-cancer drug combinations. The methodology predicts drug synergy from features extracted from single-drug response values. Different machine learning models are trained on the extracted features, and Random Forest outperforms all other models. The proposed approach is applied to mutant-BRAF melanoma and further validated using melanoma cell lines from the AstraZeneca-Sanger Drug Combination Prediction DREAM Challenge dataset. In addition to the above-mentioned work, a kernelized similarity based regularized matrix factorization framework (KSRMF) is proposed for predicting anti-cancer drug responses. The framework is based on the assumption that similar drugs exhibit similar drug responses. Drug-drug chemical structure similarity and tissue-tissue similarity (gene expression) are taken as key descriptors to formulate the objective function. A kernel function is used to map non-linear relationships between drugs and tissues. The framework is validated using two publicly available tumor datasets, GDSC and CCLE. KSRMF is further compared with three state-of-the-art algorithms using the GDSC and CCLE drug screens. We have also predicted missing drug response values in the dataset using KSRMF. An ensemble framework, BE-DTI’, is proposed for drug-target interaction prediction using dimensionality reduction and active learning. Active learning helps to improve under-sampling bagging based ensembles. Dimensionality reduction techniques are used to deal with high-dimensional data. The performance of the proposed framework is compared with five existing feature-based approaches (Random Forest (RF), Support Vector Machine (SVM), Yu et al. [1], Ezzat et al. [2]). |
URI: | http://hdl.handle.net/10266/5558 |
Appears in Collections: | Doctoral Theses@CSED |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Aman-Sharma-Thesis.pdf | | 3.17 MB | Adobe PDF | |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.