## Class ZhangShashaTreeEditDistance<V,​E>

• java.lang.Object
• org.jgrapht.alg.similarity.ZhangShashaTreeEditDistance<V,​E>
• Type Parameters:
V - graph vertex type
E - graph edge type

public class ZhangShashaTreeEditDistance<V,​E>
extends java.lang.Object
Dynamic programming algorithm for computing edit distance between trees.

The algorithm is originally described in Zhang, Kaizhong & Shasha, Dennis. (1989). Simple Fast Algorithms for the Editing Distance Between Trees and Related Problems. SIAM J. Comput.. 18. 1245-1262. 10.1137/0218082.

The time complexity of the algorithm if $O(|T_1|\cdot|T_2|\cdot min(depth(T_1),leaves(T_1)) \cdot min(depth(T_2),leaves(T_2)))$. Space complexity is $O(|T_1|\cdot |T_2|)$, where $|T_1|$ and $|T_2|$ denote number of vertices in trees $T_1$ and $T_2$ correspondingly, $leaves()$ function returns number of leaf vertices in a tree.

The tree edit distance problem is defined in a following way. Consider $2$ trees $T_1$ and $T_2$ with root vertices $r_1$ and $r_2$ correspondingly. For those trees there are 3 elementary modification actions:

• Remove a vertex $v$ from $T_1$.
• Insert a vertex $v$ into $T_2$.
• Change vertex $v_1$ in $T_1$ to vertex $v_2$ in $T_2$.
The algorithm assigns a cost to each of those operations which also depends on the vertices. The problem is then to compute a sequence of edit operations which transforms $T_1$ into $T_2$ and has a minimum cost over all such sequences. Here the cost of a sequence of edit operations is defined as sum of costs of individual operations.

The algorithm is based on a dynamic programming principle and assigns a label to each vertex in the trees which is equal to its index in post-oder traversal. It also uses a notion of a keyroot which is defined as a vertex in a tree which has a left sibling. Additionally a special $l()$ function is introduced with returns for every vertex the index of its leftmost child wrt the post-order traversal in the tree.

Solving the tree edit problem distance is divided into computing edit distance for every pair of subtrees rooted at vertices $v_1$ and $v_2$ where $v_1$ is a keyroot in the first tree and $v_2$ is a keyroot in the second tree.

Author:
Semen Chudakov
• ### Nested Class Summary

Nested Classes
Modifier and Type Class Description
static class  ZhangShashaTreeEditDistance.EditOperation<V>
Represents elementary action which changes the structure of a tree.
static class  ZhangShashaTreeEditDistance.OperationType
Type of an edit operation.
• ### Constructor Summary

Constructors
Constructor Description
ZhangShashaTreeEditDistance​(Graph<V,​E> tree1, V root1, Graph<V,​E> tree2, V root2)
Constructs an instance of the algorithm for the given tree1, root1, tree2 and root2.
ZhangShashaTreeEditDistance​(Graph<V,​E> tree1, V root1, Graph<V,​E> tree2, V root2, java.util.function.ToDoubleFunction<V> insertCost, java.util.function.ToDoubleFunction<V> removeCost, java.util.function.ToDoubleBiFunction<V,​V> changeCost)
Constructs an instance of the algorithm for the given tree1, root1, tree2, root2, insertCost, removeCost and changeCost.
• ### Method Summary

All Methods
Modifier and Type Method Description
double getDistance()
Computes edit distance for tree1 and tree2.
java.util.List<ZhangShashaTreeEditDistance.EditOperation<V>> getEditOperationLists()
Computes a list of edit operations which transform tree1 into tree2.
• ### Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
• ### Constructor Detail

• #### ZhangShashaTreeEditDistance

public ZhangShashaTreeEditDistance​(Graph<V,​E> tree1,
V root1,
Graph<V,​E> tree2,
V root2)
Constructs an instance of the algorithm for the given tree1, root1, tree2 and root2. This constructor sets following default values for the distance functions. The insertCost and removeCost always return $1.0$, the changeCost return $0.0$ if vertices are equal and 1.0 otherwise.
Parameters:
tree1 - a tree
root1 - root vertex of tree1
tree2 - a tree
root2 - root vertex of tree2
• #### ZhangShashaTreeEditDistance

public ZhangShashaTreeEditDistance​(Graph<V,​E> tree1,
V root1,
Graph<V,​E> tree2,
V root2,
java.util.function.ToDoubleFunction<V> insertCost,
java.util.function.ToDoubleFunction<V> removeCost,
java.util.function.ToDoubleBiFunction<V,​V> changeCost)
Constructs an instance of the algorithm for the given tree1, root1, tree2, root2, insertCost, removeCost and changeCost.
Parameters:
tree1 - a tree
root1 - root vertex of tree1
tree2 - a tree
root2 - root vertex of tree2
insertCost - cost function for inserting a node into tree1
removeCost - cost function for removing a node from tree2
changeCost - cost function of changing a node in tree1 to a node in tree2
• ### Method Detail

• #### getDistance

public double getDistance()
Computes edit distance for tree1 and tree2.
Returns:
edit distance between tree1 and tree2
• #### getEditOperationLists

public java.util.List<ZhangShashaTreeEditDistance.EditOperation<V>> getEditOperationLists()
Computes a list of edit operations which transform tree1 into tree2.
Returns:
list of edit operations