Title: Mining massive time series data : with dimensionality reduction techniques
Authors: Borg, Justin
Keywords: Data mining; Dimension reduction (Statistics)
Issue Date: 2017
Abstract: Researchers have argued that a pre-processing step is needed before massive time series data are passed to computationally intensive applications such as data mining algorithms, even if this reduces the quality and alters the nature of the original data. Over the last two decades, various time series dimensionality reduction techniques have been proposed in the literature to serve as this pre-processing step; one of these is numerosity reduction. Numerosity reduction gives excellent response times on complex data mining algorithms compared with running the same process on the raw time series. However, no study has been dedicated to comparing these dimensionality reduction techniques in terms of how effectively they produce a representation that yields accurate results when passed to various data mining algorithms. This study selected nine well-known time series datasets and applied four reduction techniques at five levels of reduction, together with five knowledge extraction techniques that are also well known in time series mining. For each combination, we applied post-processing evaluation metrics and computed an average accuracy level relative to the same data mining procedure run on the raw time series. The results show that the Piecewise Aggregate Approximation (PAA) produces results with more than 70% accuracy when applied to a partitional, a hierarchical and a classification algorithm. Furthermore, a Symbolic Aggregate Approximation (SAX) representation with a larger alphabet size produces more accurate results than one with a smaller alphabet size. The Discrete Wavelet Transform (DWT), on the other hand, produces less accurate results than both PAA and SAX. The results also indicate a difference in accuracy between the clustering and classification algorithms and the motif and discord discovery algorithms: the former produced more accurate results than the latter.
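For readers unfamiliar with the two representations named in the abstract, the following is a minimal illustrative sketch of PAA and SAX in Python. It is not taken from the dissertation: the function names `paa` and `sax`, the equal-width framing rule, and the use of standard normal quantiles as SAX breakpoints are assumptions based on the commonly described form of these techniques.

```python
from bisect import bisect_left
from statistics import NormalDist, mean, pstdev
from typing import List, Sequence


def paa(series: Sequence[float], segments: int) -> List[float]:
    """Reduce a series to `segments` values by averaging consecutive frames.

    Assumption: frames are as equal in width as integer division allows.
    """
    n = len(series)
    if not 1 <= segments <= n:
        raise ValueError("segments must be between 1 and len(series)")
    reduced = []
    for i in range(segments):
        start = i * n // segments
        end = (i + 1) * n // segments
        frame = series[start:end]
        reduced.append(sum(frame) / len(frame))
    return reduced


def sax(series: Sequence[float], segments: int, alphabet_size: int) -> str:
    """Z-normalise, apply PAA, then map each frame mean to a letter.

    Assumption: breakpoints are the quantiles that split a standard normal
    distribution into `alphabet_size` equiprobable regions.
    """
    if not 2 <= alphabet_size <= 26:
        raise ValueError("alphabet_size must be between 2 and 26")
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        normalised = [0.0] * len(series)  # constant series: no spread to scale
    else:
        normalised = [(x - mu) / sigma for x in series]
    standard_normal = NormalDist()
    breakpoints = [
        standard_normal.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)
    ]
    letters = []
    for value in paa(normalised, segments):
        # Index of the region the frame mean falls into: 0 -> 'a', 1 -> 'b', ...
        letters.append(chr(ord("a") + bisect_left(breakpoints, value)))
    return "".join(letters)
```

Calling `paa([1, 2, 3, 4, 5, 6, 7, 8], 4)` halves the length of the series by replacing each pair of points with its mean. In `sax`, a larger `alphabet_size` gives more, narrower value regions, so the symbolic word keeps more of the amplitude information in the PAA output; this is one plausible reading of why the abstract reports higher accuracy for larger alphabets.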
Description: M.SC.IT
Appears in Collections: Dissertations - FacICT - 2017
Dissertations - FacICTCIS - 2017

Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.