Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/23545
Title: Building machine learning models to explore protein-ligand interactions for drug discovery
Authors: Catania, Daniel
Keywords: Ligand binding (Biochemistry)
Protein binding
Proteins
Ligands
Pharmaceutical chemistry
Machine learning
Issue Date: 2017
Abstract: Computational methods have become increasingly popular in drug discovery. In recent years, such methods have been used to aid physical experiments during the early stages of the drug discovery process, in order to reduce expense and time. Despite the usage of these methods, only a small increase in new molecular entities, over recent years, has been witnessed. The aim of this final year project was to gain a better understanding of protein-ligand interactions, and to build machine learning models based on this understanding. These computational models help us identify small-molecules (also known as ligands) which interact with proteins, a necessary requirement for most medicinal drugs. These small-molecules interact, or bind, with proteins at some strength (binding affinity). This binding affinity is measured experimentally using three alternate measures (IC50, Ki and Kd). Presently, the only existing database of protein-ligand interactions is not publicly available, so we set out to build a database by mining and filtering proteins, bound with ligands, from the Protein Data Bank, in order to extract their interacting features. We augmented this database with experimental binding affinity data from other sources. We then built machine learning models to predict the binding affinities based on interactions between a protein and a ligand. We explored several machine learning approaches, including Nearest Neighbours, Support Vector Machines, Random Forest and Artificial Neural Networks to find the best model which describes this binding relationship. While acknowledging that predicting binding affinity based on only feature interactions is hard, we also concluded that the model selection depended on the type of binding affinity measure. The protein-ligand interactions together with their respective binding affinity data have been made publicly available on Github and may allow for further research in this area.
Description: B.SC.(HONS)COMP.SCI.
URI: https://www.um.edu.mt/library/oar//handle/123456789/23545
Appears in Collections: Dissertations - FacICT - 2017
Dissertations - FacICTCS - 2017

Files in This Item:
File: 17BCS012.pdf (Restricted Access)
Size: 2.7 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.