Class SimpleGEXFImporter<V,​E>

  • Type Parameters:
    V - the graph vertex type
    E - the graph edge type
    All Implemented Interfaces:
    GraphImporter<V,​E>

    public class SimpleGEXFImporter<V,​E>
    extends BaseEventDrivenImporter<V,​E>
    implements GraphImporter<V,​E>
    Imports a graph from a GEXF data source.

    This is a simple implementation which supports only a limited set of features of the GEXF specification and is oriented towards parsing speed.

    The importer uses the graph suppliers (Graph.getVertexSupplier() and Graph.getEdgeSupplier()) to create new vertices and edges. Any additional vertex, edge or graph attributes in the input file are reported lazily and completely out of order. Users can register consumers for vertex, edge and graph attributes after construction of the importer. Finally, default attribute values and any nested elements are completely ignored.

    For a description of the format see https://gephi.org/gexf/format/index.html or the GEXF Primer.

    Below is a small example of a graph in GEXF format.

     
     <?xml version="1.0" encoding="UTF-8"?>
     <gexf xmlns="http://www.gexf.net/1.2draft"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.gexf.net/1.2draft http://www.gexf.net/1.2draft/gexf.xsd"
           version="1.2">
       <graph defaultedgetype="undirected">
         <nodes>
           <node id="n0" label="node 0"/>
           <node id="n1" label="node 1"/>
           <node id="n2" label="node 2"/>
           <node id="n3" label="node 3"/>
           <node id="n4" label="node 4"/>
           <node id="n5" label="node 5"/>
         </nodes>
         <edges>
           <edge id="e0" source="n0" target="n2" weight="1.0"/>
           <edge id="e1" source="n0" target="n1" weight="1.0"/>
           <edge id="e2" source="n1" target="n3" weight="2.0"/>
           <edge id="e3" source="n3" target="n2"/>
           <edge id="e4" source="n2" target="n4"/>
           <edge id="e5" source="n3" target="n5"/>
           <edge id="e6" source="n5" target="n4" weight="1.1"/>
         </edges>
       </graph>
     </gexf>
     
     

    The importer reads the input into a graph which is provided by the user. In case the graph is weighted and the corresponding edge attribute "weight" is defined, the importer also reads edge weights. Otherwise edge weights are ignored. To test whether the graph is weighted, method Graph.getType() can be used.
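
    The weight rule above can be illustrated without JGraphT. The following stand-alone sketch uses the JDK's StAX parser and is not the importer's implementation. The class GexfWeightSketch, the method readWeights and the boolean flag weighted (standing in for a check such as graph.getType().isWeighted()) are hypothetical names introduced only for illustration.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class GexfWeightSketch {

    /**
     * Streams a GEXF document and returns a map from edge id to weight.
     * A weight is recorded only if the target is weighted AND the edge
     * carries a "weight" attribute; otherwise the value is null.
     */
    static Map<String, Double> readWeights(Reader input, boolean weighted) {
        Map<String, Double> weights = new LinkedHashMap<>();
        try {
            XMLStreamReader xml = XMLInputFactory.newFactory().createXMLStreamReader(input);
            while (xml.hasNext()) {
                if (xml.next() == XMLStreamConstants.START_ELEMENT
                        && "edge".equals(xml.getLocalName())) {
                    String id = xml.getAttributeValue(null, "id");
                    String raw = xml.getAttributeValue(null, "weight");
                    Double weight = (weighted && raw != null) ? Double.valueOf(raw) : null;
                    weights.put(id, weight);
                }
            }
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed GEXF input", e);
        }
        return weights;
    }

    public static void main(String[] args) {
        // Two edges taken from the example above: one with a weight, one without.
        String gexf = "<gexf xmlns=\"http://www.gexf.net/1.2draft\" version=\"1.2\">"
            + "<graph><edges>"
            + "<edge id=\"e0\" source=\"n0\" target=\"n2\" weight=\"1.0\"/>"
            + "<edge id=\"e3\" source=\"n3\" target=\"n2\"/>"
            + "</edges></graph></gexf>";
        System.out.println(readWeights(new StringReader(gexf), true));
        System.out.println(readWeights(new StringReader(gexf), false));
    }
}
```

    In this sketch an edge without a "weight" attribute maps to null. After a real import, the weight such an edge ends up with is up to the target graph.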

    The provided graph object, where the imported graph will be stored, must be able to support the features of the graph that is read. For example, if the GEXF file contains self-loops then the provided graph must also support self-loops. The same holds for multiple edges. Moreover, the parser completely ignores the global attribute "defaultedgetype" and the edge attribute "type", which denotes whether an edge is directed or not. Whether edges are directed or not depends on the underlying implementation of the user-provided graph object.
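
    One way to decide which features the target graph needs is to inspect the edge endpoints before importing. The sketch below is a stand-alone helper and not part of JGraphT. The class GraphFeatureCheck and its methods are hypothetical, and it assumes the (source, target) id pairs have already been extracted from the file.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class GraphFeatureCheck {

    /** Returns true if any edge has the same source and target id. */
    static boolean hasSelfLoops(List<String[]> edges) {
        for (String[] e : edges) {
            if (e[0].equals(e[1])) {
                return true;
            }
        }
        return false;
    }

    /**
     * Returns true if two edges join the same pair of vertices.
     * For an undirected target graph, (a, b) and (b, a) count as the same pair.
     */
    static boolean hasMultipleEdges(List<String[]> edges, boolean directed) {
        Set<List<String>> seen = new HashSet<>();
        for (String[] e : edges) {
            String a = e[0];
            String b = e[1];
            if (!directed && a.compareTo(b) > 0) {
                // Normalize the endpoint order so direction is ignored.
                String tmp = a;
                a = b;
                b = tmp;
            }
            if (!seen.add(Arrays.asList(a, b))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Endpoints of the first three edges in the example file.
        List<String[]> edges = Arrays.asList(
            new String[] {"n0", "n2"},
            new String[] {"n0", "n1"},
            new String[] {"n1", "n3"});
        System.out.println("self-loops: " + hasSelfLoops(edges));
        System.out.println("multiple edges: " + hasMultipleEdges(edges, false));
    }
}
```

    The directed flag matters because the importer ignores "defaultedgetype": the same pair of edges n0-n2 and n2-n0 is a multiple edge in an undirected graph but not in a directed one.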

    By default, the importer validates the input using the 1.2draft GEXF Schema. The user can disable the validation by calling setSchemaValidation(boolean), although this is not recommended. Older schemas are not supported.

    The graph vertices and edges are built using the corresponding graph suppliers. The ids of the vertices in the input file are reported as a vertex attribute named DEFAULT_VERTEX_ID_KEY.

    The default behavior of the importer is to use the graph vertex supplier in order to create vertices. The user can also bypass vertex creation by providing a custom vertex factory method using setVertexFactory(Function). The factory method is responsible for creating a new graph vertex given the vertex identifier read from the file.
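
    As a minimal sketch of what such a factory might look like, the following code defines a Function<String,Integer> that turns identifiers of the form used in the example file ("n0", "n1", ...) into Integer vertices. The class VertexFactorySketch and the constant NUMERIC_ID_FACTORY are hypothetical names. The call to setVertexFactory appears only in a comment so that the block compiles without JGraphT.

```java
import java.util.function.Function;

class VertexFactorySketch {

    /**
     * Hypothetical factory: maps an id such as "n3" to the Integer vertex 3.
     * Ids that do not match the expected pattern are rejected.
     */
    static final Function<String, Integer> NUMERIC_ID_FACTORY = id -> {
        if (!id.matches("n\\d+")) {
            throw new IllegalArgumentException("Unexpected vertex id: " + id);
        }
        return Integer.valueOf(id.substring(1));
    };

    public static void main(String[] args) {
        // With JGraphT on the classpath and V = Integer, the factory would be
        // registered on an importer instance before calling importGraph:
        //   importer.setVertexFactory(NUMERIC_ID_FACTORY);
        System.out.println(NUMERIC_ID_FACTORY.apply("n3")); // prints 3
    }
}
```

    Deriving the vertex from the identifier keeps the file's naming visible in the graph, which the default supplier-based creation does not do.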

    Author:
    Dimitrios Michail
    • Field Detail

      • DEFAULT_VERTEX_ID_KEY

        public static final String DEFAULT_VERTEX_ID_KEY
        Default key used for vertex ID.
        See Also:
        Constant Field Values
    • Constructor Detail

      • SimpleGEXFImporter

        public SimpleGEXFImporter()
        Constructs a new importer.
    • Method Detail

      • isSchemaValidation

        public boolean isSchemaValidation()
        Whether the importer validates the input.
        Returns:
        true if the importer validates the input
      • setSchemaValidation

        public void setSchemaValidation​(boolean schemaValidation)
        Set whether the importer should validate the input.
        Parameters:
        schemaValidation - value for schema validation
      • getVertexFactory

        public Function<String,​V> getVertexFactory()
        Get the user custom vertex factory. This is null by default and the graph supplier is used instead.
        Returns:
        the user custom vertex factory
      • setVertexFactory

        public void setVertexFactory​(Function<String,​V> vertexFactory)
        Set the user custom vertex factory. The default is null, in which case the graph vertex supplier is used. If supplied, the vertex factory is called every time a new vertex is encountered in the file. It is called with the vertex identifier from the file as its parameter and should return the actual graph vertex to add to the graph.
        Parameters:
        vertexFactory - a vertex factory
      • importGraph

        public void importGraph​(Graph<V,​E> graph,
                                Reader input)
        Import a graph.

        The provided graph must be able to support the features of the graph that is read. For example, if the GEXF file contains self-loops then the provided graph must also support self-loops. The same holds for multiple edges.

        Specified by:
        importGraph in interface GraphImporter<V,​E>
        Parameters:
        graph - the output graph
        input - the input reader
        Throws:
        ImportException - in case an error occurs, such as I/O or parse error