- Type Parameters:
V- graph vertex typeE- graph edge type
The algorithm is originally described in Zhang, Kaizhong & Shasha, Dennis. (1989). Simple Fast Algorithms for the Editing Distance Between Trees and Related Problems. SIAM J. Comput.. 18. 1245-1262. 10.1137/0218082.
The time complexity of the algorithm is $O(|T_1|\cdot|T_2|\cdot min(depth(T_1),leaves(T_1)) \cdot min(depth(T_2),leaves(T_2)))$. Space complexity is $O(|T_1|\cdot |T_2|)$, where $|T_1|$ and $|T_2|$ denote number of vertices in trees $T_1$ and $T_2$ correspondingly, $leaves()$ function returns number of leaf vertices in a tree.
The tree edit distance problem is defined in a following way. Consider $2$ trees $T_1$ and $T_2$ with root vertices $r_1$ and $r_2$ correspondingly. For those trees there are 3 elementary modification actions:
- Remove a vertex $v$ from $T_1$.
- Insert a vertex $v$ into $T_2$.
- Change vertex $v_1$ in $T_1$ to vertex $v_2$ in $T_2$.
The algorithm is based on a dynamic programming principle and assigns a label to each vertex in the trees which is equal to its index in post-oder traversal. It also uses a notion of a keyroot which is defined as a vertex in a tree which has a left sibling. Additionally a special $l()$ function is introduced with returns for every vertex the index of its leftmost child wrt the post-order traversal in the tree.
Solving the tree edit problem distance is divided into computing edit distance for every pair of subtrees rooted at vertices $v_1$ and $v_2$ where $v_1$ is a keyroot in the first tree and $v_2$ is a keyroot in the second tree.
- Author:
- Semen Chudakov
-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionstatic classRepresents elementary action which changes the structure of a tree.static enumType of an edit operation. -
Constructor Summary
ConstructorsConstructorDescriptionConstructs an instance of the algorithm for the giventree1,root1,tree2androot2.ZhangShashaTreeEditDistance(Graph<V, E> tree1, V root1, Graph<V, E> tree2, V root2, ToDoubleFunction<V> insertCost, ToDoubleFunction<V> removeCost, ToDoubleBiFunction<V, V> changeCost) Constructs an instance of the algorithm for the giventree1,root1,tree2,root2,insertCost,removeCostandchangeCost. -
Method Summary
Modifier and TypeMethodDescriptiondoubleComputes edit distance fortree1andtree2.Computes a list of edit operations which transformtree1intotree2.
-
Constructor Details
-
ZhangShashaTreeEditDistance
Constructs an instance of the algorithm for the giventree1,root1,tree2androot2. This constructor sets following default values for the distance functions. TheinsertCostandremoveCostalways return $1.0$, thechangeCostreturn $0.0$ if vertices are equal and1.0otherwise.- Parameters:
tree1- a treeroot1- root vertex oftree1tree2- a treeroot2- root vertex oftree2
-
ZhangShashaTreeEditDistance
public ZhangShashaTreeEditDistance(Graph<V, E> tree1, V root1, Graph<V, E> tree2, V root2, ToDoubleFunction<V> insertCost, ToDoubleFunction<V> removeCost, ToDoubleBiFunction<V, V> changeCost) Constructs an instance of the algorithm for the giventree1,root1,tree2,root2,insertCost,removeCostandchangeCost.- Parameters:
tree1- a treeroot1- root vertex oftree1tree2- a treeroot2- root vertex oftree2insertCost- cost function for inserting a node intotree1removeCost- cost function for removing a node fromtree2changeCost- cost function of changing a node intree1to a node intree2
-
-
Method Details
-
getDistance
public double getDistance()Computes edit distance fortree1andtree2.- Returns:
- edit distance between
tree1andtree2
-
getEditOperationLists
Computes a list of edit operations which transformtree1intotree2.- Returns:
- list of edit operations
-