Title: Exploration and reduction of data using principal component analysis
Authors: Buhagiar, Anton
Keywords: Big data; Variables (Mathematics); Principal components analysis
Issue Date: 2002
Publisher: Malta Medical Journal
Citation: Malta Medical Journal. 2002, Vol.14(1), p. 27-35
Abstract: In a data set with only two variables, a scatterplot of one variable against the other represents the data visually. When the number of variables in the data set is large, however, the data are more difficult to represent visually. The method of principal component analysis (PCA) can sometimes be used to represent the data faithfully in few dimensions (e.g. three or fewer), with little or no loss of information. This reduction in dimensionality is best achieved when the original variables are highly correlated, positively or negatively. In this case, it is quite conceivable that 20 or 30 original variables can be adequately represented by two or three new variables, which are suitable combinations of the original ones, and which are called principal components. Principal components are mutually uncorrelated, so that each component describes a different dimension of the data. The principal components can also be arranged in descending order of their variance. The first component has the largest variance, and is the most important, followed by the second component with the second largest variance, and so on. The first two components can then be evaluated for each case in the data set and plotted against each other in a scatterplot, with the score for the first component along the horizontal axis and the score for the second component along the vertical axis. This scatterplot is a parsimonious two-dimensional picture of the variables and cases in the original data set. We illustrate the method by applying it to simulated datasets, and to a dataset containing national track record times for males and females in various countries.
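Illustrative sketch (not taken from the article): the procedure outlined in the abstract can be tried on simulated data along the following lines. The sketch assumes NumPy, standardized variables (so the analysis works on the correlation matrix), and a hypothetical data set of 20 variables driven by two hidden factors; the function name `pca` and all simulation parameters are assumptions made for this example only, and the article's own datasets are not reproduced.

```python
import numpy as np


def pca(data):
    """Return (scores, variances, loadings), components sorted by descending variance.

    Assumption: variables are standardized first, so the eigen-decomposition
    is performed on the sample correlation matrix.
    """
    # Standardize each column to zero mean and unit sample variance.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # Covariance of standardized data equals the correlation matrix.
    corr = np.cov(z, rowvar=False)
    # eigh is appropriate for symmetric matrices; it returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]  # reorder to descending variance
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    # Component scores: each case evaluated on each principal component.
    scores = z @ eigvecs
    return scores, eigvals, eigvecs


# Hypothetical simulated data: 20 highly correlated variables built from
# two hidden factors plus a small amount of noise.
rng = np.random.default_rng(0)
n_cases, n_vars = 200, 20
latent = rng.normal(size=(n_cases, 2))
weights = rng.uniform(0.5, 1.5, size=(2, n_vars)) * rng.choice([-1, 1], size=(2, n_vars))
data = latent @ weights + 0.1 * rng.normal(size=(n_cases, n_vars))

scores, variances, loadings = pca(data)
explained = variances / variances.sum()

print("Share of variance, first two components:", explained[:2].sum())
# scores[:, 0] and scores[:, 1] are the coordinates for the two-dimensional
# scatterplot (first component horizontal, second component vertical).
```

In this sketch the covariance matrix of `scores` is diagonal up to rounding error, which is the sense in which the components are uncorrelated, and its diagonal entries are the component variances in descending order.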
Appears in Collections: MMJ, Volume 14, Issue 1
Scholarly Works - FacSciMat

Files in This Item:
File: 2002.Vol14.Issue1.A5.pdf
Description: Exploration and reduction of data using principal component analysis
Size: 138.24 kB
Format: Adobe PDF

Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.