Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/25820
Title: Multi-dimensional indexes in DBMSs
Authors: Mercieca, Thomas
Keywords: Database management
SQL (Computer program language)
R (Computer program language)
Issue Date: 2017
Abstract: Higher-dimensional data and operations are becoming a necessity in multimedia and data mining applications, which often run where storage is limited and execution time must be as low as possible. Handling this type of data effectively requires an index structure that supports efficient retrieval. One access method that promises these advantages is the R-Tree, which DBMSs are implementing as core functionality for efficient retrieval of multidimensional data. The R-Tree has a number of characteristics that influence the quality of its structure and, in turn, of retrieval through it. Searching an R-Tree is a multipath problem, and the gap between best-case and worst-case performance is very wide, which makes building high-quality R-Trees especially important. A number of approaches have been designed to address this, and many of them apply when a node overflows. In this project, we investigate the technique PostgreSQL uses for its R-Tree implementation through the GiST interface, and focus on one specific parameter as an optimisation target for our workload: LIMIT_RATIO, which controls the distribution of entries in a node after an overflow. The difficulty in controlling this parameter is that it is hard-wired into the DBMS. In this work, we unwind the DBMS engine code so that the user can define this parameter through an SQL construct. Code additions to PostgreSQL are checked against its extensive regression test suites. We then study the parameter's impact with respect to different metrics over an extensive and well-known spatial dataset, and are consequently able to propose a set of optimal values for LIMIT_RATIO. As a result of this project’s implementation, data designers now have a more flexible and configurable set-up for multidimensional data indexes.
Description: B.SC.IT(HONS)
URI: https://www.um.edu.mt/library/oar//handle/123456789/25820
Appears in Collections: Dissertations - FacICT - 2017

Files in This Item:
File: 17BITSD026.pdf
Description: Restricted Access
Size: 2.19 MB
Format: Adobe PDF
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.