Get Complete Project Material File(s) Now! »
INTRODUCTION
Computational problem-solving often relies on modelling that is based on finite automata, and that subsequently requires the need to manipulate automata symboli- cally or concretely. By the symbolic manipulation of an automaton is meant here, the application of various classical operations on the automaton, thereby transforming it with the aim at producing another automaton for further usage. Examples of such symbolic manipulation include constructing the automaton from a regular expression, determinizing a non-deterministic automaton, minimizing an automaton, etc.
The Problem
In practice, the performance of an automaton-based string recognizer may not always depend on the length of the string being tested for acceptance as it is the case in theory. Factors such as the automaton’s alphabet size, its number of states and of course its layout (its sparsity and its density) are of importance when dealing with efficiency. On top of all these factors, is the computational medium to be used for processing. In effect, although the performance of any algorithm be it very efficient in theory, may in practice be hampered by the hardware being used for processing.
The Kind of Automata
Throughout this thesis, we rely on Deterministic Finite Automata (DFA). Processing DFAs to determine string ownership is straightforward. Although string acceptance testing is also possible with non deterministic automata, we have deliberately chosen to restrict ourselves to DFAs. Therefore throughout this thesis, the various usage of the word Finite Automaton (FA) is in fact meant for Deterministic Finite Automaton.
Objective of the thesis
This work aims to exploit the principle of locality of reference on which optimal cache memory usage is based, in order to suggest alternative algorithms to the so-called conventional TD and HC algorithms. In order to achieve this, various implementation strategies that could be applied to TD and HC are suggested, resulting therefore in new algorithms. It is shown that, using a formal definition of a string recognizer to which implementation strategies are associated as variables, a reservoir of algorithms could be suggested. The suggested algorithms are then classified in a taxonomy tree based on a predefined relationship between them. The tree forms
ABSTRACT
ACKNOWLEDGEMENTS
1. INTRODUCTION
1.1 The Problem
1.2 The Kind of Automata
1.3 Objective of the thesis
1.4 Methodology
1.5 Thesis Outline
2. PRELIMINARIES
2.1 Introduction
2.2 Formal elements of string processing
2.3 Operation of superscalar processors and performance metrics
2.3.1 Basic principles of computer architecture
2.3.2 Operation of a superscalar processor
2.3.3 Operational diagram of a superscalar processor
2.3.4 Performance metrics identification
2.4 Summary of the chapter
II FA-based String Processing Algorithms
3. CORE ALGORITHMS
3.1 The table-driven algorithm
3.2 The hardcoded algorithm
3.2.0.1 An Example
3.3 The mixed-mode algorithm
3.4 Functional description of core recognizers
3.5 Summary of the Chapter
4. DYNAMIC STATE ALLOCATION
4.1 DSA Characterization
4.1.1 Properties of the DSA strategy
4.2 The TD-DSA algorithm
4.3 The HC-DSA Algorithm
4.4 The MM-DSA Algorithm
4.5 Illustrative Example
4.6 A theoretical assessment
4.7 Summary of the Chapter
5. STATES PRE-ORDERING
5.1 The SpO-based Characterization
5.2 Properties of the SpO strategies
5.3 The TD-SpO algorithm
5.4 The HC-SpO algorithm
5.5 The MM-SpO algorithm
5.6 Theoretical Assessment
5.7 Summary of the Chapter
6. ALLOCATED VIRTUAL CACHING
6.1 The AVC-based Characterization
6.2 Properties of the AVC strategies
6.3 The Table-driven-AVC algorithm
6.4 The hardcoded-AVC algorithm
6.5 The mixed-mode-AVC algorithm
6.6 Illustrative example
6.7 Theoretical assessment
6.8 Summary of the Chapter
7. TAXONOMY OF FA-BASED STRING PROCESSORS
7.1 Related Work
7.2 New Characterization of FA-based String Processors
7.2.1 Derivation of the TD algorithms
7.2.1.1 The TD-SpO-AVC algorithm
7.2.1.2 The bounded TD-DSA-SpO algorithm
7.2.1.3 The bounded TD-DSA-SpO-AVC algorithm
7.2.1.4 The bounded TD-DSA-AVC algorithm
7.2.1.5 The unbounded TD-DSA-SpO algorithm
7.2.1.6 The unbounded TD-DSA-SpO-AVC algorithm
7.2.1.7 The unbounded TD-DSA-AVC algorithm
7.2.2 Derivation of the hardcoded algorithms
7.2.2.1 The HC-SpO-AVC algorithm
7.2.2.2 The bounded HC-DSA-SpO algorithm
7.2.2.3 The bounded HC-DSA-SpO-AVC algorithm
7.2.2.4 The bounded HC-DSA-AVC algorithm
7.2.2.5 The unbounded HC-DSA-SpO algorithm
7.2.2.6 The unbounded HC-DSA-AVC algorithm
7.2.2.7 The unbounded HC-DSA-SpO-AVC algorithm
7.2.3 Derivation of the mixed-mode algorithms
7.3 The taxonomy
7.4 Summary of the Chapter
8. TOOLKIT DESIGN FOR FA-BASED STRING RECOGNIZERS
8.1 Motivation and Related Work
8.2 The Architectural Design
8.2.1 The Package PkgRecognizer
8.2.2 The Package PkgTableDriver
8.2.2.1 The first level of the TD class-diagram
8.2.2.2 The second level of the TD class-diagram
8.2.2.3 The last level of the TD class-diagram
8.2.3 The Package PkgHardCoder
8.2.4 The Package PkgMixedModer
8.2.5 A Detailed Toolkit’s Architecture
8.3 Summary of the Chapter
III Performance of DFA-based String Processors
9. INTRODUCTION AND MOTIVATIONS
9.1 The Software and Hardware Context
9.2 Performance Measurement
9.3 Summary of the Chapter
10.EXPERIMENTAL RESULTS
10.1 Introduction
10.2 The Table-driven Experiments
10.3 The Hardcoded Experiments
10.4 The Mixed-mode Experiments
10.5 Summary of the Chapter
IV Epilogue: Conclusion and Future Work
11.SUMMARY AND CONCLUSION
11.1 Summary
11.2 Conclusion
12.FUTURE WORK
12.1 Projects of limited scale
12.2 Medium-scale projects
12.3 Advanced research projects
12.4 End note
BIBLIOGRAPHY
GET THE COMPLETE PROJECT
TOWARDS CACHE OPTIMIZATION IN FINITE AUTOMATA IMPLEMENTATIONS