Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/93757
Title: Modelling the average gross annual salary using two-way analysis of variance
Authors: Tanti, Ronald (2002)
Keywords: Statistics
Analysis of variance
Labor supply -- Malta
Issue Date: 2002
Citation: Tanti, R. (2002). Modelling the average gross annual salary using two-way analysis of variance (Bachelor's dissertation).
Abstract: Analysis of Variance (ANOVA) is used to uncover the main and interaction effects of categorical independent variables, called 'factors', on an interval dependent variable. There are two main types of ANOVA: one-way and two-way. In one-way ANOVA only one factor is considered, while in two-way ANOVA two factors are considered at the same time and may interact with each other. Over the last two hundred years several people, including Legendre, Pearson and Scheffé, contributed to the development of ANOVA. The corresponding theory of two-way ANOVA is given and the major results are proven.
The data used as the basis for modelling consist of the average gross annual salary for employees by main occupation, such as professionals, clerks and others. There are two factors, gender and occupation. The latter consists of eight different occupations (levels), classified according to the ISCO (International Standard Classification of Occupations). The data were obtained from the Labour Force Survey held by the National Statistics Office during May and December 2000 and March and June 2001, and are given in Appendix A. The scope of the dissertation is to obtain a suitable model for these data by using two-way ANOVA. In the process of modelling, the following questions are answered:
• Does the average salary of a particular occupation depend on gender?
• If it does, is occupation a significant factor?
• Is there any interaction between occupation and gender?
Gender and occupation significantly influence the average salary. In fact, male employees have a larger average gross annual salary than female employees, and occupations such as legislators, senior officials & managers and professionals yield higher salaries than clerks, service workers and shop and sales workers, and others. However, the interaction term is not significant.
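The decomposition behind these three questions can be sketched in a few lines of Python. This is only an illustration of the standard balanced two-way ANOVA with replication, not the dissertation's own computation (which used GLIM and SPSS). The function name `two_way_anova` and the tiny data set are hypothetical; the numbers are invented for the example and are not taken from the Labour Force Survey.

```python
from itertools import product
from statistics import mean


def two_way_anova(cells):
    """Sums of squares, degrees of freedom and F ratios for a balanced
    two-way layout with replication.

    `cells` maps (level_of_A, level_of_B) -> list of replicate observations.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n_a, n_b = len(a_levels), len(b_levels)
    reps = len(cells[(a_levels[0], b_levels[0])])
    if reps < 2 or any(len(cells[k]) != reps for k in product(a_levels, b_levels)):
        raise ValueError("need a balanced design with at least 2 replicates per cell")

    grand = mean(y for obs in cells.values() for y in obs)
    a_mean = {a: mean(y for b in b_levels for y in cells[(a, b)]) for a in a_levels}
    b_mean = {b: mean(y for a in a_levels for y in cells[(a, b)]) for b in b_levels}
    cell_mean = {k: mean(obs) for k, obs in cells.items()}

    ss = {
        # Main effect of factor A (here: gender).
        "A": n_b * reps * sum((a_mean[a] - grand) ** 2 for a in a_levels),
        # Main effect of factor B (here: occupation).
        "B": n_a * reps * sum((b_mean[b] - grand) ** 2 for b in b_levels),
        # Interaction: what the cell means show beyond the two main effects.
        "AB": reps * sum(
            (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
            for a, b in product(a_levels, b_levels)
        ),
        # Within-cell (error) variation.
        "error": sum((y - cell_mean[k]) ** 2 for k, obs in cells.items() for y in obs),
    }
    df = {
        "A": n_a - 1,
        "B": n_b - 1,
        "AB": (n_a - 1) * (n_b - 1),
        "error": n_a * n_b * (reps - 1),
    }
    ms_error = ss["error"] / df["error"]
    f_ratio = {k: (ss[k] / df[k]) / ms_error for k in ("A", "B", "AB")}
    return ss, df, f_ratio


# Hypothetical, made-up observations (arbitrary units), two replicates per cell.
cells = {
    ("female", "clerks"): [10.0, 12.0],
    ("male", "clerks"): [13.0, 15.0],
    ("female", "professionals"): [20.0, 22.0],
    ("male", "professionals"): [23.0, 25.0],
}
ss, df, f_ratio = two_way_anova(cells)
print(ss)
print(f_ratio)
```

Each F ratio would then be compared with the F distribution on the corresponding degrees of freedom. The toy data are built to be exactly additive, so the interaction sum of squares is zero, mirroring the situation in which only the two main effects matter.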
A suitable model, built from the parameter estimates of the significant factors, is as follows:
Salary = 2661 + 1113 G1 + 4095 O1 + 2853 O2 + 1735 O3 + 916 O4 + 404 O5 + 633 O6 + 717 O7
where
G1 = Male
G2 = Female (aliased)
O1 = Legislators, Senior Officials & Managers
O2 = Professionals
O3 = Technicians & Associate professionals
O4 = Clerks
O5 = Service workers & Shop & Market sales workers
O6 = Craft & Related trades workers
O7 = Plant & Machine operators & assemblers
O8 = Elementary occupations (aliased)
The parameter estimates given by GLIM and SPSS differ because SPSS aliases the last level of each factor, whereas GLIM aliases the first. Does the model fit the data well? Are general assumptions, such as normality of the error terms and linearity, satisfied? This is checked by inspecting:
• Residuals - Pearson, standardized and studentized residuals based on the fitted values of the model were extracted. There is no curvature in the residual plots, implying that the linear model assumption is satisfied. The residual plots indicate one outlier, because the absolute residual of this data point exceeds 3. The outlier does not have much influence on the model, as shown by the Cook's distances. The outlier is the data point LM5398, which corresponds to the annual gross salary of female employees whose occupation falls under legislators, senior officials & managers. The model indicates that this data point is deflated. In the data (Table 1), there is a large dispersion in the average gross annual salary for the first two occupations, namely legislators, senior officials & managers and professionals. The reason could be that such employees do not have a fixed salary, unlike clerks, shop workers and others. Consequently, there could be individuals who do not declare their actual gross annual salary. The residuals are also normally distributed, as supported by a Q-Q plot and the Kolmogorov-Smirnov test.
• Leverages - measure how far the x-value of a data point lies from the average (centroid) of the rest of the x-values. All the data points have the same leverage and are therefore equidistant from the centroid.
• Cook's distances - measure the influence that individual data points have on the fitted model. The outlier does not have much influence on the model.
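As a reading aid, the fitted equation in the abstract can be evaluated directly. The sketch below assumes the usual 0/1 indicator coding, with female and elementary occupations as the aliased baseline levels (consistent with the SPSS convention described above); the function name `fitted_salary` is hypothetical. The coefficients are those quoted in the abstract, and the amounts are presumably in Maltese lira (LM), the unit used for the outlier.

```python
# Coefficients as quoted in the abstract's fitted model.
INTERCEPT = 2661      # baseline cell: female, elementary occupations (both aliased)
MALE_EFFECT = 1113    # coefficient of G1 (male); G2 (female) is aliased, i.e. 0
OCCUPATION_EFFECT = {
    1: 4095,  # Legislators, Senior Officials & Managers
    2: 2853,  # Professionals
    3: 1735,  # Technicians & Associate professionals
    4: 916,   # Clerks
    5: 404,   # Service workers & Shop & Market sales workers
    6: 633,   # Craft & Related trades workers
    7: 717,   # Plant & Machine operators & assemblers
    8: 0,     # Elementary occupations (aliased)
}


def fitted_salary(gender, occupation):
    """Fitted average gross annual salary under the main-effects model.

    Assumes 0/1 indicator coding; `occupation` is the ISCO group number 1-8.
    """
    if gender not in ("male", "female"):
        raise ValueError("gender must be 'male' or 'female'")
    gender_term = MALE_EFFECT if gender == "male" else 0
    return INTERCEPT + gender_term + OCCUPATION_EFFECT[occupation]


# Fitted value for the cell that contains the reported outlier.
fitted_outlier_cell = fitted_salary("female", 1)

# Raw (unstandardized) residual of the outlying observation LM5398.
raw_residual = 5398 - fitted_outlier_cell

print(fitted_outlier_cell, raw_residual)
```

Under these assumptions the observed value lies below the fitted value for its cell, which agrees with the abstract's remark that the data point is deflated. The threshold of 3 mentioned in the abstract applies to scaled residuals, which cannot be reproduced here because the residual standard error is not given.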
Description: B.SC.(HONS)STATS.&OP.RESEARCH
URI: https://www.um.edu.mt/library/oar/handle/123456789/93757
Appears in Collections: Dissertations - FacSci - 1965-2014
Dissertations - FacSciSOR - 2000-2014


