# What is DEES and what does it do?

DEES is a _Multiplicity Automata_ (MA) inference algorithm.
This C++ program is about 7,500 lines (10,000 with comments).
The theory behind this algorithm can be found in the following paper:
- [Learning rational stochastic languages (COLT 2006)](http://yann.esposito.free.fr/pub/colt2006.pdf)

A Multiplicity Automaton can be seen as a generalisation of a Hidden Markov Model (HMM). See this [paper for more details](http://yann.esposito.free.fr/pub/Links_PA_HMM.pdf).

In short, DEES is an algorithm that learns both the parameters and the structure of an HMM.
To do so, it does not rely on a heuristic but on properties that are proven to converge.
In fact, DEES can generate HMMs but also more general models.
These models are the Multiplicity Automata (roughly, imagine an HMM in which some parameters are allowed to be negative).

It takes a sample of many sequences (or words) generated by a target probability distribution and returns a model (a multiplicity automaton) that generates a probability distribution as close as possible to the target distribution.
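To make the notion of a multiplicity automaton more concrete, here is a minimal C++ sketch of how such a model assigns a value to a word: the value is the product of an initial weight vector, one transition matrix per letter, and a terminal weight vector. The type `MultiplicityAutomaton` and the functions `evaluate` and `make_example` are hypothetical names chosen for this illustration; they are not taken from the DEES source code, whose internal representation may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration only; these are not the DEES classes.
using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;  // square matrix, one row per source state

struct MultiplicityAutomaton {
    Vector initial;                     // iota: one initial weight per state
    Vector terminal;                    // tau: one terminal weight per state
    std::map<char, Matrix> transition;  // one n x n matrix per letter
};

// Returns r(w) = iota^T * M_{w_1} * ... * M_{w_k} * tau.
// A letter without a matrix is treated as a zero matrix, so r(w) is 0.
double evaluate(const MultiplicityAutomaton& ma, const std::string& word) {
    const std::size_t n = ma.initial.size();
    assert(ma.terminal.size() == n);
    Vector current = ma.initial;  // row vector, updated letter by letter
    for (char letter : word) {
        auto it = ma.transition.find(letter);
        if (it == ma.transition.end()) return 0.0;
        const Matrix& m = it->second;
        assert(m.size() == n);
        Vector next(n, 0.0);
        for (std::size_t i = 0; i < n; ++i) {
            assert(m[i].size() == n);
            for (std::size_t j = 0; j < n; ++j) {
                next[j] += current[i] * m[i][j];
            }
        }
        current = next;
    }
    double value = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        value += current[i] * ma.terminal[i];
    }
    return value;
}

// Toy two-state automaton over the alphabet {a} with a negative initial
// weight. It assigns r(a^k) = 0.5^k - 0.75 * 0.25^k to the word a^k.
MultiplicityAutomaton make_example() {
    MultiplicityAutomaton ma;
    ma.initial = {2.0, -1.0};
    ma.terminal = {0.5, 0.75};
    ma.transition['a'] = {{0.5, 0.0}, {0.0, 0.25}};
    return ma;
}
```

In this toy model, `evaluate(make_example(), "a")` gives 0.3125. Every word receives a positive value and the values sum to 1, so the automaton defines a probability distribution even though one of its parameters is negative.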

We can restrict the learned model to be:

- a Multiplicity Automaton (MA)
- a Probabilistic Automaton (PA), another name for Hidden Markov Models (HMM); in this case the identified class is the set of Probabilistic Residual Automata (PRA)
- a Probabilistic Deterministic Automaton (PDA)
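The difference between these classes can be sketched as a structural check on the weights of an automaton. The code below assumes a common definition: a PA has non-negative weights, initial weights summing to 1, and, for each state, a terminal weight plus outgoing transition weights summing to 1; a PDA is a PA with a single initial state and at most one target state per state and letter. The names `MultiplicityAutomaton`, `is_probabilistic`, `is_deterministic` and `model_class` are hypothetical and not taken from DEES. The PRA case is left out because it requires a check on residual languages, and the sketch inspects the given representation only, not the language it generates.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration only; these are not the DEES classes.
using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;  // square matrix, one row per source state

struct MultiplicityAutomaton {
    Vector initial;                     // one initial weight per state
    Vector terminal;                    // one terminal weight per state
    std::map<char, Matrix> transition;  // one n x n matrix per letter
};

const double kTolerance = 1e-9;  // tolerance for floating-point sums

// PA conditions (assumed definition): weights >= 0, initial weights sum
// to 1, and each state's terminal + outgoing weights sum to 1.
bool is_probabilistic(const MultiplicityAutomaton& ma) {
    const std::size_t n = ma.initial.size();
    assert(ma.terminal.size() == n);
    double initial_sum = 0.0;
    for (double w : ma.initial) {
        if (w < 0.0) return false;
        initial_sum += w;
    }
    if (std::fabs(initial_sum - 1.0) > kTolerance) return false;
    for (std::size_t q = 0; q < n; ++q) {
        if (ma.terminal[q] < 0.0) return false;
        double outgoing = ma.terminal[q];
        for (const auto& entry : ma.transition) {
            assert(entry.second.size() == n);
            for (double w : entry.second[q]) {
                if (w < 0.0) return false;
                outgoing += w;
            }
        }
        if (std::fabs(outgoing - 1.0) > kTolerance) return false;
    }
    return true;
}

// Determinism: exactly one initial state, and for each state and letter
// at most one target state with a non-zero weight.
bool is_deterministic(const MultiplicityAutomaton& ma) {
    std::size_t initial_states = 0;
    for (double w : ma.initial) {
        if (w != 0.0) ++initial_states;
    }
    if (initial_states != 1) return false;
    for (const auto& entry : ma.transition) {
        for (const Vector& row : entry.second) {
            std::size_t targets = 0;
            for (double w : row) {
                if (w != 0.0) ++targets;
            }
            if (targets > 1) return false;
        }
    }
    return true;
}

// Returns the most specific class among "PDA", "PA" and "MA".
std::string model_class(const MultiplicityAutomaton& ma) {
    if (!is_probabilistic(ma)) return "MA";
    return is_deterministic(ma) ? "PDA" : "PA";
}
```

An automaton with any negative weight is reported as a plain MA, which matches the idea that an MA is an HMM-like model whose parameters are not constrained to be probabilities.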
## Features

The main features of DEES are:
- Multiplicity Automata (MA) inference from a sample of sequences
- Probabilistic Automata (PA) inference
- Probabilistic Deterministic Automata (PDA) inference

This repository also contains many other features:
- Viterbi algorithm
- Baum-Welch algorithm
- Random generation of MA, PA, PRA and PDA
- GraphViz export of models
- Sample generation from an MA
- Model class detection (MA, PA, PRA, PDA)
- Computation, if it exists, of the sum of the values of all the words of the distribution generated by an MA
- Conversion between the Alergia, MDI and DEES file formats, for both samples and automata
- Generation of the trimmed MA of an MA in linear time
- If GraphViz is installed, display of the models and export to PDF
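As a note on the "sum of the values of all the words" feature, here is one standard way to see when that sum exists. This is a sketch under assumed notation, not necessarily how DEES computes it: $\iota$ is the initial weight vector, $\tau$ the terminal weight vector, $M_a$ the transition matrix of letter $a$, and $M = \sum_a M_a$.

```latex
\[
  \sum_{w \in \Sigma^*} r(w)
  \;=\; \sum_{k \ge 0} \iota^{\top} M^{k} \tau
  \;=\; \iota^{\top} (I - M)^{-1} \tau
  \qquad \text{when the spectral radius of } M \text{ is } < 1 .
\]
```

For example, for a one-state automaton with $\iota = 1$, $\tau = 1/2$ and $M_a = 1/2$, the sum is $1 \cdot (1 - 1/2)^{-1} \cdot 1/2 = 1$, as expected for a probability distribution.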
## More information

http://yann.esposito.free.fr/dees.php?css=blue.css&lang=en

If you want all the gory details, you can check my [Ph.D. thesis](http://yann.esposito.free.fr/pub/these.pdf), written in French.

Contact me if you have any questions.
I'll be happy to talk to you.