- java.lang.Object
-
- org.jgrapht.alg.similarity.ZhangShashaTreeEditDistance<V,E>
-
- Type Parameters:
V
- graph vertex typeE
- graph edge type
public class ZhangShashaTreeEditDistance<V,E> extends Object
Dynamic programming algorithm for computing edit distance between trees.The algorithm is originally described in Zhang, Kaizhong & Shasha, Dennis. (1989). Simple Fast Algorithms for the Editing Distance Between Trees and Related Problems. SIAM J. Comput.. 18. 1245-1262. 10.1137/0218082.
The time complexity of the algorithm is $O(|T_1|\cdot|T_2|\cdot min(depth(T_1),leaves(T_1)) \cdot min(depth(T_2),leaves(T_2)))$. Space complexity is $O(|T_1|\cdot |T_2|)$, where $|T_1|$ and $|T_2|$ denote number of vertices in trees $T_1$ and $T_2$ correspondingly, $leaves()$ function returns number of leaf vertices in a tree.
The tree edit distance problem is defined in a following way. Consider $2$ trees $T_1$ and $T_2$ with root vertices $r_1$ and $r_2$ correspondingly. For those trees there are 3 elementary modification actions:
- Remove a vertex $v$ from $T_1$.
- Insert a vertex $v$ into $T_2$.
- Change vertex $v_1$ in $T_1$ to vertex $v_2$ in $T_2$.
The algorithm is based on a dynamic programming principle and assigns a label to each vertex in the trees which is equal to its index in post-oder traversal. It also uses a notion of a keyroot which is defined as a vertex in a tree which has a left sibling. Additionally a special $l()$ function is introduced with returns for every vertex the index of its leftmost child wrt the post-order traversal in the tree.
Solving the tree edit problem distance is divided into computing edit distance for every pair of subtrees rooted at vertices $v_1$ and $v_2$ where $v_1$ is a keyroot in the first tree and $v_2$ is a keyroot in the second tree.
- Author:
- Semen Chudakov
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static class
ZhangShashaTreeEditDistance.EditOperation<V>
Represents elementary action which changes the structure of a tree.static class
ZhangShashaTreeEditDistance.OperationType
Type of an edit operation.
-
Constructor Summary
Constructors Constructor Description ZhangShashaTreeEditDistance(Graph<V,E> tree1, V root1, Graph<V,E> tree2, V root2)
Constructs an instance of the algorithm for the giventree1
,root1
,tree2
androot2
.ZhangShashaTreeEditDistance(Graph<V,E> tree1, V root1, Graph<V,E> tree2, V root2, ToDoubleFunction<V> insertCost, ToDoubleFunction<V> removeCost, ToDoubleBiFunction<V,V> changeCost)
Constructs an instance of the algorithm for the giventree1
,root1
,tree2
,root2
,insertCost
,removeCost
andchangeCost
.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description double
getDistance()
Computes edit distance fortree1
andtree2
.List<ZhangShashaTreeEditDistance.EditOperation<V>>
getEditOperationLists()
Computes a list of edit operations which transformtree1
intotree2
.
-
-
-
Constructor Detail
-
ZhangShashaTreeEditDistance
public ZhangShashaTreeEditDistance(Graph<V,E> tree1, V root1, Graph<V,E> tree2, V root2)
Constructs an instance of the algorithm for the giventree1
,root1
,tree2
androot2
. This constructor sets following default values for the distance functions. TheinsertCost
andremoveCost
always return $1.0$, thechangeCost
return $0.0$ if vertices are equal and1.0
otherwise.- Parameters:
tree1
- a treeroot1
- root vertex oftree1
tree2
- a treeroot2
- root vertex oftree2
-
ZhangShashaTreeEditDistance
public ZhangShashaTreeEditDistance(Graph<V,E> tree1, V root1, Graph<V,E> tree2, V root2, ToDoubleFunction<V> insertCost, ToDoubleFunction<V> removeCost, ToDoubleBiFunction<V,V> changeCost)
Constructs an instance of the algorithm for the giventree1
,root1
,tree2
,root2
,insertCost
,removeCost
andchangeCost
.- Parameters:
tree1
- a treeroot1
- root vertex oftree1
tree2
- a treeroot2
- root vertex oftree2
insertCost
- cost function for inserting a node intotree1
removeCost
- cost function for removing a node fromtree2
changeCost
- cost function of changing a node intree1
to a node intree2
-
-
Method Detail
-
getDistance
public double getDistance()
Computes edit distance fortree1
andtree2
.- Returns:
- edit distance between
tree1
andtree2
-
getEditOperationLists
public List<ZhangShashaTreeEditDistance.EditOperation<V>> getEditOperationLists()
Computes a list of edit operations which transformtree1
intotree2
.- Returns:
- list of edit operations
-
-