Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/8618
Title: Challenges of indexing multi-dimensional persistent data
Authors: Zahra, Rebecca
Keywords: Database management
Meteorology -- Malta
SQL (Computer program language)
Issue Date: 2015
Abstract: Demand for high-dimensional data, of which spatial data is one example, is growing exponentially. Unlike single-dimensional data, this type of data calls for a different kind of solution if it is to be retrieved efficiently. Indexes are one of the main options for efficient data retrieval. However, selecting an appropriate index for a specific use case is a complex task, and only approximate solutions can be attained. This is mainly due to the variety of factors that affect the performance of an indexing structure and its costs, namely data characteristics, types of queries and memory parameters. Environmental weather data is the main use case throughout this research. A domain expert from the Maltese meteorological office, Mr J. Schiavone, identified the generic datasets required and the main operational and tactical queries involved in meteorology in a local context. This expertise provided the basis for selecting adequate meteorological datasets and for deciding how queries needed to be developed. The analysis involved executing many queries on raster and vector data. The performance of these queries was evaluated before and after indexing structures were introduced. In addition, further adjustments such as the use of partial or expression indexes and memory tuning were taken into account. To ensure that all queries were treated equally, the data server was restarted before every query. After considering the above factors, a top-down, holistic approach was adopted to select appropriate indexes for meteorological queries, based on the preceding evaluation and a cost-benefit analysis. Although query optimisation per statement might be used, the procedure adopted in this research was applied to the set of data queries as a whole. The analysis showed that particular indexes are more targeted towards particular query types.
Moreover, the functions chosen when formulating an SQL query are vital, as certain functions do not make use of indexes. It was noted that the raster format was more suitable with restricted range locations (i.e. from vector geometries). Furthermore, provided that queries focus on particular areas, indexing a subset rather than the whole dataset was considered beneficial.
Description: B.SC.IT(HONS)
URI: https://www.um.edu.mt/library/oar//handle/123456789/8618
Appears in Collections: Dissertations - FacICT - 2015

Files in This Item:
File: 15MSCIT004.pdf
Description: Restricted Access
Size: 4.84 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.