Package org.jgrapht.sux4j

Immutable graphs stored using Sux4J's quasi-succinct data structures.

This package contains implementations of immutable graphs based on the Elias–Fano quasi-succinct representation of monotone sequences. The positions of the nonzero entries of the adjacency matrix of a graph are represented as a monotone sequence of natural numbers.

The memory footprint of these implementation is close to the information-theoretical lower bound in the undirected case, and close to twice the information-theoretical lower bound in the directed case, because the transposed graph must be stored separately, but in the latter case you have the choice to not support incoming edges and obtain, again, footprint close to the information-theoretical lower bound. The actual space used can be easily measured as all implementations are serializable, and their in-memory footprint is very close to the on-disk footprint. Usually the size is a few times smaller than that of a SparseIntDirectedGraph/SparseIntUndirectedGraph.

We provide two classes mimicking SparseIntDirectedGraph and SparseIntUndirectedGraph, in the sense that both vertices and edges are integers (and they are numbered contiguously). Thus, by definition these classes cannot represent graphs with more than Integer.MAX_VALUE edges.

The sometimes slow behavior of the previous classes is due to a clash between JGraphT's design and the need of representing an edge with an Integer, which cannot be extended: there is no information that can be carried by the object representing the edge. This limitation forces the two classes above to compute two expensive functions that are one the inverse of the other.

As an alternative, we provide classes SuccinctDirectedGraph and SuccinctUndirectedGraph using the same amount of space, but having edges represented by pairs of integers stored in an IntIntPair (for directed graphs) or an IntIntSortedPair (for undirected graphs). Storing the edges explicitly avoids the cumbersome back-and-forth computations of the previous classes. All accessors are extremely fast. There is no limitation on the number of edges.

Both classes provide methods getEdgeFromIndex() and getIndexFromEdge() that map bijectively the edge set into a contiguous set of longs. In this way the user can choose when and how to use the feature (e.g., to store compactly data associated to edges).

Finally, note that the best performance and compression can be obtained by representing the graph using WebGraph's EFGraph format and then accessing the graph using the suitable adapter; in particular, one can represent graphs with more than Integer.MAX_VALUE vertices. However, the adapters do not provide methods mapping bijectively edges into a contiguous set of integers.

Building and serializing with limited memory

All implementations provide a copy constructor taking a Graph and a constructor accepting a list of edges; the latter just builds a sparse graph and delegates to the copy constructor. Both methods can be inconvenient if the graph to be represented is large, as the list of edges might have too large a footprint.

There is however a simple strategy that makes it possible to build succinct representations using a relatively small amount of additional memory with respect to the representation itself:

  1. convert your graph to a WebGraph format such as BVGraph or EFGraph;
  2. if your graph is directed, use Transform to store the transpose of your graph in the same way;
  3. use a suitable adapter to get a Graph representing your graph, taking care of loading the WebGraph representations using ImmutableGraph.loadMapped(java.lang.CharSequence) ImmutableGraph.loadMapped()};
  4. use the copy constructor to obtain a quasi-succinct representation.