Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/85570
Title: Learning DFAs from sparse data
Authors: Guillaumier, Kristian
Keywords: Computational learning theory
Algorithms
Issue Date: 2020
Citation: Guillaumier, K. (2020). Learning DFAs from sparse data (Doctoral dissertation).
Abstract: Regular inference is the task of identifying a Deterministic Finite-State Automaton (DFA) from a training set of positive and negative strings of a regular language. This problem is known to become increasingly difficult as the size of the target DFA grows and as the training data becomes sparser. One of the most studied algorithms for this task is the Evidence-Driven State Merging (EDSM) algorithm due to Rodney Price, which emerged as a winning algorithm in the international Abbadingo One competition of 1997.

We focus on ‘Abbadingo-style’ problems (learning DFAs over binary alphabets with training sets of various densities) and present the results of a comprehensive analysis of how, and more importantly when, EDSM succeeds or fails in identifying either the exact target DFA or a low-error hypothesis with respect to a test set. We do this by creating thousands of problem instances and studying their characteristics, as well as the behaviour of the state merging process. To support this analysis, we have developed an extensive software framework consisting of highly optimised, parallelised implementations of state merging algorithms, together with several visual and statistical analysis tools.

Motivated by the results and insights obtained from this analysis, we propose three methods, each aimed at improving the generalisation rate of EDSM on Abbadingo-style problems when the training data is sparse. Our first method develops an ensemble of monotonic, greedy heuristics which, together, are able to outperform EDSM. It is inspired by Wolpert and Macready’s No Free Lunch theorems, which roughly state that, in general, no single heuristic dominates all others over all problem instances; this is indeed supported by the empirical evidence we gathered while experimenting on various problem configurations. Alongside the ensemble, we have also identified a method which enables us to predict, with a high degree of confidence, which of the individual heuristics in the ensemble results in a low- or zero-error hypothesis.

Our second approach, which we call the Delta Graph method, is based on the observation that when a greedy heuristic selects sequences of merges, the initial ones are especially critical: once a wrong choice is made, finding the target DFA becomes impossible and the likelihood of finding a low-error hypothesis is greatly reduced. This method constructs, and non-monotonically searches, a structure representing a highly condensed subspace of possible merges. The structure contains several short sequences of merges where, with high experimental probability, at least one consists of correct merges leading to the target. These merges establish enough constraints on a partial hypothesis that, when it is extended with a label-matching heuristic, it leads to the target DFA or a low-error hypothesis.

Typical evolutionary approaches to DFA learning attempt to evolve a target DFA either as a transition matrix or by partitioning the states of a Prefix Tree Acceptor (PTA). In our third method, we present an alternative approach which instead evolves short sequences of merges selected from a subset of high state-reduction merges. As in the Delta Graph method, these short sequences of merges establish enough constraints on a hypothesis that, when extended with a label-matching heuristic, it will, with high experimental probability, lead to the target DFA or a low-error hypothesis.
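For readers unfamiliar with the baseline, the following is a minimal, illustrative Python sketch of the evidence score at the heart of EDSM, assuming a simple dictionary-based PTA. The names build_pta and edsm_score are our own illustrative choices, not part of the dissertation's framework, and a real implementation would compute this score within a red-blue search over partially merged hypotheses rather than directly on the PTA as shown here.

    # Evidence score for a candidate merge on a prefix tree acceptor (PTA).
    ACCEPT, REJECT, UNKNOWN = True, False, None

    def build_pta(positive, negative):
        """Build a PTA: children[state][symbol] -> state, label[state] -> accept/reject."""
        children, label = {0: {}}, {0: UNKNOWN}

        def add(string, lab):
            state = 0
            for sym in string:
                if sym not in children[state]:
                    new = len(children)
                    children[state][sym] = new
                    children[new] = {}
                    label[new] = UNKNOWN
                state = children[state][sym]
            label[state] = lab

        for s in positive:
            add(s, ACCEPT)
        for s in negative:
            add(s, REJECT)
        return children, label

    def edsm_score(children, label, q1, q2):
        """Count the labelled state pairs that agree when q2's subtree is
        folded into q1's; return -1 if an accept/reject conflict occurs."""
        score, stack = 0, [(q1, q2)]
        while stack:
            a, b = stack.pop()
            la, lb = label[a], label[b]
            if la is not UNKNOWN and lb is not UNKNOWN:
                if la != lb:
                    return -1              # conflicting labels: merge invalid
                score += 1                 # evidence: two labelled states agree
            for sym, tb in children[b].items():
                if sym in children[a]:     # walk transitions shared by both subtrees
                    stack.append((children[a][sym], tb))
        return score

    # Toy example over the binary alphabet {0, 1}: score merging the root
    # with the state reached on '1'.
    children, label = build_pta(positive=["1", "11"], negative=["0", "10"])
    print(edsm_score(children, label, 0, children[0]["1"]))   # -> 2

EDSM greedily applies the candidate merge with the highest such evidence score at each step; the heuristics in the ensemble can be thought of as alternative scoring functions plugged into the same greedy loop.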
To ensure a common baseline for comparison, our methods are evaluated on target DFAs and training sets constructed according to the Abbadingo One competition procedures. Our results show that each of the methods we have developed outperforms EDSM. For example, on 64-state target DFA problems with symmetrically structurally complete training sets at the sparsest density set by the Abbadingo One competition, EDSM identifies low-error DFAs approximately 15% of the time, while our ensemble, Delta Graph, and evolutionary methods do so about 26%, 43%, and 56% of the time respectively. We also obtain considerably better generalisation rates on problem instances which are highly adversarial to EDSM.
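To make the notion of a ‘high state-reduction merge’ concrete, here is a hypothetical helper, reusing the children/label PTA representation and build_pta from the sketch above, that counts how many states a candidate merge would eliminate and ranks all state pairs accordingly. It deliberately ignores label conflicts, so in practice it would be combined with a validity check such as edsm_score; the function names are illustrative, not the dissertation's.

    def state_reduction(children, q1, q2):
        """Number of states eliminated if q2's subtree were folded into q1's:
        every (a, b) pair visited during the fold collapses one state."""
        reduced, stack = 0, [(q1, q2)]
        while stack:
            a, b = stack.pop()
            reduced += 1
            for sym, tb in children[b].items():
                if sym in children[a]:     # shared transitions force further merges
                    stack.append((children[a][sym], tb))
        return reduced

    def top_reduction_pairs(children, k=5):
        """Rank all state pairs by reduction; the highest-reduction merges
        form a pool from which short candidate merge sequences are drawn."""
        states = list(children)
        pairs = [(state_reduction(children, a, b), a, b)
                 for i, a in enumerate(states) for b in states[i + 1:]]
        return sorted(pairs, reverse=True)[:k]

    # Toy usage, reusing build_pta from the previous sketch.
    children, label = build_pta(positive=["1", "11"], negative=["0", "10"])
    print(top_reduction_pairs(children, k=3))   # -> [(3, 0, 1), (1, 3, 4), (1, 2, 4)]

Restricting attention to such a pool is one way to see how the Delta Graph and evolutionary methods keep their search spaces small compared with evolving a full PTA state partition.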
Description: PH.D. ARTIFICIAL INTELLIGENCE
URI: https://www.um.edu.mt/library/oar/handle/123456789/85570
Appears in Collections: Dissertations - FacICT - 2020
Dissertations - FacICTAI - 2020

Files in This Item:
File: 20PHDIT001.pdf
Size: 5.76 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.