Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/63167
Title: Penalized logistic regression for classification : the LASSO approach
Authors: Azzopardi, Annalise
Keywords: Logistic regression analysis
Econometric models
Least squares
Issue Date: 2020
Citation: Azzopardi, A. (2020). Penalized logistic regression for classification: the LASSO approach (Master's dissertation).
Abstract: Logistic Regression (LR) is a very popular multivariate statistical technique used to model data with a qualitative response variable. Originally, LR was built to find the parsimonious model, that is, the model that best describes the relationship which may exist between the dependent variable and a set of explanatory variables; today it is also used for classification purposes. Over the years, the datasets being collected have grown in size, and one may end up with more explanatory variables than entities, especially in the fields of genetics and biomedical science. Such datasets are known as high-dimensional. Estimation techniques such as maximum likelihood estimation (MLE) tend to perform poorly when applied to high-dimensional datasets. The MLE also tends to perform poorly when the data are characterized by either of the following scenarios: i) multicollinearity, and ii) separation. Thus, regularization techniques were introduced. Among the many regularization techniques that exist is the Least Absolute Shrinkage and Selection Operator (LASSO). The beauty of the LASSO estimator is that it can be used both for shrinkage of the parameter estimates and for variable selection. For LR, the LASSO modifies the MLE by adding the ℓ1-norm of the unknown parameters, scaled by a shrinkage parameter, to the negative log-likelihood, so it turns a maximization problem into a minimization problem. In this dissertation, we explain how to solve this minimization problem using an optimization algorithm. One of the greatest challenges in LR with the LASSO is finding the optimal shrinkage parameter, since we want to obtain the most accurate parameter estimates and the best subset of explanatory variables in the model. The methods used to find the optimal value of the shrinkage parameter are presented. The predictive ability of LR with the LASSO is analyzed by applying it to some real-life datasets. Various validation techniques are used as well.
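The penalized objective described in the abstract can be illustrated with a short sketch. The abstract does not say which optimization algorithm the dissertation uses, so proximal gradient descent is used here purely as an assumed, commonly used choice. The 1/n scaling of the negative log-likelihood, the unpenalized intercept, the names `soft_threshold`, `lasso_logistic`, `lam` and `n_iter`, and the synthetic data are all assumptions made for illustration and are not taken from the dissertation.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1: shrink each entry towards zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_logistic(X, y, lam, n_iter=2000):
    """Minimise (1/n) * negative log-likelihood + lam * ||beta||_1
    by proximal gradient descent. The intercept b0 is not penalised
    (an assumption of this sketch)."""
    n, p = X.shape
    b0, beta = 0.0, np.zeros(p)
    # Step size 1/L, where L = ||[1 X]||_2^2 / (4n) bounds the curvature
    # of the averaged negative log-likelihood.
    Xa = np.hstack([np.ones((n, 1)), X])
    step = 4.0 * n / np.linalg.norm(Xa, 2) ** 2
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(b0 + X @ beta)))  # fitted probabilities
        resid = prob - y                               # gradient w.r.t. the linear predictor
        b0 -= step * resid.mean()                      # plain gradient step for the intercept
        # Gradient step on the smooth part, then soft-thresholding for the l1 penalty.
        beta = soft_threshold(beta - step * (X.T @ resid) / n, step * lam)
    return b0, beta


# Synthetic example (hypothetical data): only the first two variables matter.
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:2] = [2.0, -2.0]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)

b0, beta = lasso_logistic(X, y, lam=0.05)
print("indices of non-zero coefficients:", np.flatnonzero(beta))
```

The soft-thresholding step is what sets coefficients exactly to zero, which is why the LASSO performs variable selection as well as shrinkage. Larger values of `lam` remove more variables, and a sufficiently large value removes all of them.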
Description: M.SC.STATISTICS
URI: https://www.um.edu.mt/library/oar/handle/123456789/63167
Appears in Collections:Dissertations - FacSci - 2020
Dissertations - FacSciSOR - 2020

Files in This Item:
File: 20MSCSTAT001.pdf (Restricted Access)
Size: 2.23 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.