<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>OAR@UM Collection:</title>
  <link rel="alternate" href="https://www.um.edu.mt/library/oar/handle/123456789/83408" />
  <subtitle />
  <id>https://www.um.edu.mt/library/oar/handle/123456789/83408</id>
  <updated>2026-04-10T13:35:21Z</updated>
  <dc:date>2026-04-10T13:35:21Z</dc:date>
  <entry>
    <title>Modelling survival data</title>
    <link rel="alternate" href="https://www.um.edu.mt/library/oar/handle/123456789/83620" />
    <author>
      <name />
    </author>
    <id>https://www.um.edu.mt/library/oar/handle/123456789/83620</id>
    <updated>2021-11-09T15:34:46Z</updated>
    <published>2021-01-01T00:00:00Z</published>
    <summary type="text">Title: Modelling survival data
Abstract: This dissertation investigates several modelling techniques used in survival analysis. One chapter provides a comprehensive theoretical review of traditional methods. These include non-parametric methods, such as the Kaplan-Meier and Nelson-Aalen estimators; semi-parametric methods, including the Cox proportional hazards model; and parametric methods based on the assumption that the survival distribution has a known functional form. The Exponential, Weibull and Gompertz distributions are the most widely used for proportional hazards survival analysis.
These traditional methods assume that the population is fairly homogeneous and that the variation in survival durations can be explained by a small number of observed explanatory variables. In the presence of heterogeneity, frailty models are more appropriate for modelling survival data, since they introduce random effects that account for the variability generated by unobserved covariates. This dissertation presents three chapters on frailty models. The unshared frailty model assumes that different individuals have distinct frailties, while shared frailty models assume that the population comprises clusters, where individuals in the same cluster share the same frailty. Moreover, semi-parametric frailty models extend the Cox proportional hazards model by introducing random effects to account for unobserved heterogeneity in the data. The Gamma and Inverse Gaussian distributions are the most popular choices for the frailty distribution because of their convenient mathematical properties. These modelling techniques are investigated from both a frequentist and a Bayesian approach. One chapter describes the Bayesian paradigm and methods for sampling from the posterior distribution, including the Metropolis-Hastings algorithm with the Gibbs sampler. One benefit of the Bayesian approach is that it allows prior information to be incorporated in a survival model. Another advantage is that MCMC sampling methods enable exact inference for any sample size without relying on asymptotic properties.
All these survival modelling techniques are applied to a data set using the facilities of R and Stata. The participants are patients who underwent an aortic valve replacement procedure at Mater Dei Hospital between 2003 and 2019. The dependent variable is the duration until death or censoring, and the eleven explanatory variables provide information about the patients’ health condition, the operative surgical procedures, and the duration of the convalescence period. Moreover, in the shared frailty models the patients were clustered by their diabetic condition, since it is known that diabetic patients are at greater risk of dying following aortic surgery.
Description: M.Sc.(Melit.)</summary>
    <dc:date>2021-01-01T00:00:00Z</dc:date>
  </entry>
  <entry>
    <title>Penalised alternatives to ordinary least squares in the Longstaff-Schwartz algorithm for pricing American options</title>
    <link rel="alternate" href="https://www.um.edu.mt/library/oar/handle/123456789/83617" />
    <author>
      <name />
    </author>
    <id>https://www.um.edu.mt/library/oar/handle/123456789/83617</id>
    <updated>2021-11-09T15:34:14Z</updated>
    <published>2021-01-01T00:00:00Z</published>
    <summary type="text">Title: Penalised alternatives to ordinary least squares in the Longstaff-Schwartz algorithm for pricing American options
Abstract: One of the most popular techniques for valuing the American put option is the Longstaff-Schwartz algorithm. In this algorithm, orthogonal polynomials are typically used to estimate the maximum expected future payoff given the current value of the American option. An optimal exercise strategy then ensues for each simulated path, and the average payoff over all paths becomes the fair price of the American option. Convergence results proven over the years show that, under certain regularity conditions and using a least squares estimation approach, this average payoff converges in probability to the true price as the sample size of the paths and the order of the orthogonal polynomial go simultaneously to infinity. A number of alternative modelling and estimation approaches have been attempted to make the Longstaff-Schwartz algorithm more accurate and computationally efficient; however, a detailed examination of penalised regression methods and of how they fare within this context is not found in the literature. In this thesis we conduct an empirical assessment of OLS, Ridge, LASSO and Elastic Net estimation to determine which of these methods offers the best compromise between accuracy and computational efficiency. We compare these methods on three staple processes in finance, namely the Geometric Brownian Motion, Heston Stochastic Volatility and Meixner processes. Furthermore, we use OLS results for large samples and a high number of basis functions as a benchmark for accuracy.
Description: M.Sc.(Melit.)</summary>
    <dc:date>2021-01-01T00:00:00Z</dc:date>
  </entry>
  <entry>
    <title>Bayesian nonparametric latent feature modelling of river water quality</title>
    <link rel="alternate" href="https://www.um.edu.mt/library/oar/handle/123456789/83612" />
    <author>
      <name />
    </author>
    <id>https://www.um.edu.mt/library/oar/handle/123456789/83612</id>
    <updated>2021-11-09T15:33:32Z</updated>
    <published>2021-01-01T00:00:00Z</published>
    <summary type="text">Title: Bayesian nonparametric latent feature modelling of river water quality
Abstract: Latent feature modelling is a class of multivariate techniques used to capture hidden structures underlying observed data sets. There are numerous instances in the statistical literature where these techniques are applied to data sets, most notably from psychology. However, there is little to no published literature on the application of latent feature models to water quality data sets from scientific fields that seek to identify and analyse the processes affecting our environment. In this study, two different latent feature modelling techniques, Structural Equation Modelling (SEM) and Bayesian Nonparametric (BNP) latent feature modelling, are fitted to a collection of data related to water quality. The former technique takes a finite-dimensional, classical frequentist approach, where the number of latent features extracted by the model is fixed and must be specified beforehand. In addition, hierarchical data is catered for by introducing a multi-level extension of SEM known as MSEM. However, these models encounter various problems when applied to larger, more complex data sets. To work with a more flexible and wider-ranging family of models, we transition to infinite dimensions through the use of BNP models, which are the main protagonists of this study. In contrast to SEM, these models allow the number of latent features to be open-ended.
The applicability of this technique to river water quality is determined by fitting a linear-Gaussian binary latent feature model using a Dirichlet Process (DP) prior. Studying the resulting posterior distribution allows us to identify and explain the possible sources affecting the water quality of rivers over time. Information on the provenance and timing of the observations comprising the data set is purposely left out during estimation, enabling us to test whether BNP models can detect, through the information contained within the data, inherent factors which are not measured directly but inferred statistically.
Description: M.Sc.(Melit.)</summary>
    <dc:date>2021-01-01T00:00:00Z</dc:date>
  </entry>
  <entry>
    <title>Spatial Bayesian hierarchical modelling of functional Magnetic Resonance Imaging (fMRI) data</title>
    <link rel="alternate" href="https://www.um.edu.mt/library/oar/handle/123456789/83606" />
    <author>
      <name />
    </author>
    <id>https://www.um.edu.mt/library/oar/handle/123456789/83606</id>
    <updated>2021-11-09T15:32:49Z</updated>
    <published>2021-01-01T00:00:00Z</published>
    <summary type="text">Title: Spatial Bayesian hierarchical modelling of functional Magnetic Resonance Imaging (fMRI) data
Abstract: Functional magnetic resonance imaging (fMRI) is a technique that measures changes in blood oxygenation in the brain as a result of a stimulus. This technique provides insight into the vast hidden structures of the brain. The aim of this study was to obtain activation amplitudes and regions of activation in the brain for the motor tasks: visual cue, left and right hand, left and right foot, and tongue. Cortical surface (cs)-fMRI data was used for the analysis, gathered from 10 adults from the Human Connectome Project (HCP). A spatial Bayesian general linear model (GLM) was implemented to obtain estimates of activation amplitudes at the single- and multiple-subject levels. Two methods were employed for the multiple-subject analysis: the joint and the two-level approach. The integrated nested Laplace approximation (INLA) technique was used for Bayesian computation. The regions of activation were identified with the use of joint posterior probability maps (PPMs), thresholded at different levels. The results obtained were illustrated as figures of inflated brains to aid visualisation. The accuracy of the results was verified by comparing these figures to the literature on the physiology of the brain. The single- and multi-subject Bayesian GLMs depicted accurate activation amplitudes. The two-level approach for the group-level analysis produced smoother activation amplitude estimates than the joint approach. Moreover, the threshold level γ = 1% illustrated the most targeted activations in the regions associated with the tasks. The accuracy of the results and the computational efficiency of the method suggest that using a Bayesian approach to account for spatial dependencies in cs-fMRI motor task studies is advantageous.
Description: B.Sc. (Hons)(Melit.)</summary>
    <dc:date>2021-01-01T00:00:00Z</dc:date>
  </entry>
</feed>

