Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/24431
Title: Discriminating between two groups using eigenvectors
Authors: Buhagiar, Anton
Keywords: Proof theory
Mathematics -- Periodicals
Issue Date: 2003
Publisher: University of Malta, Department of Mathematics
Citation: Buhagiar, A. (2003). Discriminating between two groups using eigenvectors. The Collection, 8, 22-34.
Abstract: Consider g populations or groups, g ≥ 2. The object of discriminant analysis is to allocate an individual to one of these g groups on the basis of his/her measurements on the p variables x1, x2, ..., xp. It is desirable to make as few 'mistakes' as possible in classifying these individuals to the various groups. For example, the populations might consist of different diseases, and the p variables x1, x2, ..., xp might measure the symptoms of a patient, e.g. blood pressure, body temperature, etc. One is then trying to diagnose a patient's disease on the basis of his/her symptoms. As another example, one can consider samples from three species of iris. The object is then to allocate a new iris to one of these species on the basis of its measurements, e.g. sepal length, sepal width, etc.

In the case of two groups, g = 2, and in the univariate case, p = 1, where x1 is the only variable measured, it is quite easy to see whether the two groups are well separated from each other. For this purpose, one can perform a t-test on x1 to see whether the two groups have significantly different means. Equivalently, one can define the ratio (difference between the means of the two samples) / (variation within the samples). A large value of this ratio, which is proportional to the t-statistic, indicates that the means of the samples are well separated from each other; conversely, a small value implies that within-sample variation is relatively large and that readings from the two samples tend to overlap. This in turn leads to poor discrimination between the two groups in terms of x1, and to a non-significant difference between the sample means of x1.

In the case when g ≥ 2, that is, for two or more groups, and when p = 1, one-way analysis of variance (the F-test) can be performed to examine whether the mean of x1 differs significantly over the groups. Equivalently, one can define the ratio (variation between the means of the samples) / (variation within the samples). Again, a large value of this ratio, which is closely related to the F-statistic, signifies good separation between the groups and a significant difference in x1 between the groups. In fact, in the case of two groups (g = 2), the F-test and the t-test are equivalent, with F = t^2 for a given problem.

When the number of variables is larger than one, p > 1, one can perform separate univariate tests on each of the p variables x1, x2, ..., xp. For purposes of discrimination, however, it is often preferable to define a linear combination y of the xk's, namely y = a_1 x_1 + a_2 x_2 + ... + a_p x_p = \sum_{k=1}^{p} a_k x_k, with the object of maximising the ratio defined above. Finding the linear combination which maximises this ratio is equivalent to maximising the statistical distance between the groups, which in turn guarantees greater success in discriminating between the different groups. As shown below, the problem of finding the optimum choice of the coefficients a_k can be reduced to a suitable eigenvalue problem.
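The following is a minimal sketch, not the paper's own code, of the two-group case described in the abstract: the within-group and between-group scatter matrices are formed, and the coefficients a_k that maximise the between/within ratio are obtained as the leading eigenvector of the corresponding generalized eigenvalue problem. The simulated data and the names X1, X2, x_new are illustrative assumptions.

```python
# Sketch of two-group linear discriminant analysis via an eigenvalue problem.
# Assumption: simulated iris-like measurements; not the paper's data or code.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)

# Two simulated groups, each measured on p = 2 variables.
X1 = rng.multivariate_normal([5.0, 3.4], [[0.12, 0.09], [0.09, 0.14]], size=50)
X2 = rng.multivariate_normal([5.9, 2.8], [[0.26, 0.08], [0.08, 0.10]], size=50)

m1, m2 = X1.mean(axis=0), X2.mean(axis=0)

# Within-group scatter S_W and between-group scatter S_B.
S_W = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
d = (m1 - m2).reshape(-1, 1)
S_B = d @ d.T

# Coefficients a maximising (a' S_B a) / (a' S_W a): the eigenvector of the
# generalized eigenvalue problem S_B a = lambda * S_W a with largest lambda.
eigvals, eigvecs = eigh(S_B, S_W)
a = eigvecs[:, np.argmax(eigvals)]

# For g = 2 this agrees with the closed form a proportional to S_W^{-1}(m1 - m2).
print("discriminant coefficients:", a / np.linalg.norm(a))

# Allocate a new observation to the group whose projected mean is nearer.
x_new = np.array([5.4, 3.1])
y_new, y1, y2 = x_new @ a, m1 @ a, m2 @ a
print("allocated to group", 1 if abs(y_new - y1) < abs(y_new - y2) else 2)
```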
URI: https://www.um.edu.mt/library/oar//handle/123456789/24431
Appears in Collections: Collection, No.8

Files in This Item:
File: Discriminating between two groups using eigenvectors.pdf
Size: 416.76 kB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.