- ` by simpler name `M` for example.
I obtained something like:
<%= blogimage('formal_DCR_tree.png', 'The source tree') %>
and
<%= blogimage('formal_Menu_tree.png', 'The destination tree') %>
Then I made the following observation:
Considering Tree Edit Distance, each unit transformation of a tree corresponds to a simple search and replace on my xml source[^nb].
We consider three atomic transformations on trees:
- *substitution*: renaming a node
- *insertion*: adding a node
- *deletion*: removing a node
[^nb]: I wrote a program which automatically generates, from data, the matrix of weights of each edit operation.
One particularity of atomic transformations on trees is that if you remove a node, all of its children become children of its parent.
An example:

    r - x - a
      \   \
       \    b
        y - c

If you delete the `x` node, you obtain:

        a
      /
    r - b
      \
        y - c

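The deletion rule can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `Node` class and the `delete_label` and `show` helpers are hypothetical names, not part of any tool mentioned in this post.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def delete_label(node, label):
    """Return the list of nodes that replaces `node` once every node
    carrying `label` has been deleted from its subtree."""
    new_children = []
    for child in node.children:
        new_children.extend(delete_label(child, label))
    if node.label == label:
        # The node vanishes: its children take its place under its parent.
        return new_children
    return [Node(node.label, new_children)]


def show(node):
    """Compact one-line rendering of a tree, e.g. r(a b y(c))."""
    if not node.children:
        return node.label
    return f"{node.label}({' '.join(show(c) for c in node.children)})"


tree = Node("r", [Node("x", [Node("a"), Node("b")]), Node("y", [Node("c")])])
result = delete_label(tree, "x")[0]
print(show(result))  # r(a b y(c))
```

Note that `a` and `b` land exactly where `x` was, before `y`, which is the same ordering as in the drawing.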
And look at what this implies when we write it in xml:

    <r>
      <x>
        <a>value for a</a>
        <b>value for b</b>
      </x>
      <y>
        <c>value for c</c>
      </y>
    </r>

Deleting all `x` nodes is then equivalent to passing the xml through the following search and replace script:
s/<\/?x>//g
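To check that claim on the small example, here is a minimal Python sketch. The xml string is an assumed encoding of the example tree (node labels used as tag names), and `re.sub` stands in for the sed directive. It only handles bare `<x>` and `</x>` tags, without attributes or self-closing forms.

```python
import re
import xml.etree.ElementTree as ET

# Assumed xml encoding of the example tree: labels become tag names.
source = (
    "<r><x><a>value for a</a><b>value for b</b></x>"
    "<y><c>value for c</c></y></r>"
)

# Python equivalent of the sed directive s/<\/?x>//g
flattened = re.sub(r"</?x>", "", source)

# The result is still well-formed xml, and a and b now hang directly under r.
root = ET.fromstring(flattened)
print([child.tag for child in root])  # ['a', 'b', 'y']
```

The text replacement never looks at the tree structure, yet the parsed result is exactly the tree obtained by deleting `x`.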
Therefore, if there exists a one-state deterministic transducer which transforms my trees,
I can transform the xml from one format to the other with just a simple list of search and replace directives.
# Solution
Transforming this tree:

    R - C - tag1
      \   \
       \    tag2
        E -- R - C - tag1
          \   \    \
           \   \     tag2
            \    E ...
             R - C - tag1
               \   \
                \    tag2
                 E ...

into this tree:

                    tag1
                  /
    M - V - M - V - tag2      tag1
                  \         /
                    M --- V - tag2
                      \     \
                       \      M
                        \     tag1
                         \  /
                           V - tag2
                             \
                               M

can be done using the following one-state deterministic tree transducer:
> C -> ε
> E -> M
> R -> V
Which can be translated into the following simple search and replace directives:
s/C//g
s/E/M/g
s/R/V/g
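As a sketch of how such a one-state transducer can be applied to tags, here is a small Python version. The rule table mirrors the three rules above (`None` stands for ε). The single-letter tag names in `source` are placeholders chosen for the illustration, not the real tag names of the xml, and `RULES`, `TAG` and `transduce` are hypothetical names.

```python
import re

# One-state transducer as a label map; None stands for ε (delete the node).
RULES = {"C": None, "E": "M", "R": "V"}

# Matches bare opening or closing tags such as <E> or </E> (no attributes).
TAG = re.compile(r"<(/?)(\w+)>")


def transduce(xml, rules):
    """Rewrite every tag according to `rules` in a single pass."""
    def rewrite(match):
        slash, label = match.group(1), match.group(2)
        if label not in rules:
            return match.group(0)   # labels without a rule are copied as-is
        target = rules[label]
        if target is None:
            return ""               # ε: drop the tag, its children move up
        return f"<{slash}{target}>"
    return TAG.sub(rewrite, xml)


source = "<E><R><C><tag1>a</tag1><tag2>b</tag2></C><E></E></R></E>"
print(transduce(source, RULES))
# <M><V><tag1>a</tag1><tag2>b</tag2><M></M></V></M>
```

Doing the rewrite in one pass avoids an ordering problem that a list of successive substitutions can have: if one rule produced a label that a later rule also rewrites, the sequential version would apply both. With the three rules here that cannot happen, since `M` and `V` are never on the left-hand side.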
Once adapted to xml, the directives become:
s%?contenu>%%g
s%%- %g
s%
%- %g
s%?rubrique>%%g
s%%%g
That is all.
# Conclusion
It may seem a bit paradoxical, but sometimes the most efficient approach to a pragmatic problem is to use the theoretical methodology.