Class GpkgRoadGraphPreprocessor

java.lang.Object
org.jgrapht.osm.GpkgRoadGraphPreprocessor

public final class GpkgRoadGraphPreprocessor extends Object
Reads a Geofabrik-style OpenStreetMap GPKG snapshot of a region's road network and writes the largest strongly-connected component as a pair of gzipped CSV files that OsmCsvGraphLoader loads back into a Graph<Integer, ...>.

Output schema (both files are headerless, gzip-compressed UTF-8 CSV):

  • <prefix>.csv.gz: src,dst,weight_m (one directed edge per line; the weight is the Haversine great-circle distance in metres). The file is headerless so that CSVImporter from jgrapht-io can consume it directly.
  • <prefix>.nodes.csv.gz: node_id,lat,lon (one vertex per line; coordinates in decimal degrees). Loaded via OsmCoordinatesReader.
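For readers who want to inspect the edges file without jgrapht-io, the following is a minimal sketch of a reader for the src,dst,weight_m layout described above. EdgeCsvSketch, Edge, parseLine and readEdges are hypothetical names used only for illustration; they are not part of GpkgRoadGraphPreprocessor or OsmCsvGraphLoader, and the sketch assumes integer vertex ids as implied by Graph<Integer, ...>.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

/** Hypothetical reader for the edges file; not part of the documented API. */
final class EdgeCsvSketch {

    /** One directed edge in the src,dst,weight_m layout. */
    static final class Edge {
        public final int src;
        public final int dst;
        public final double weightM;

        Edge(int src, int dst, double weightM) {
            this.src = src;
            this.dst = dst;
            this.weightM = weightM;
        }
    }

    /** Parses one headerless CSV line such as "3,7,12.5". */
    static Edge parseLine(String line) {
        String[] f = line.split(",");
        if (f.length != 3) {
            throw new IllegalArgumentException("expected 3 fields: " + line);
        }
        return new Edge(
                Integer.parseInt(f[0].trim()),
                Integer.parseInt(f[1].trim()),
                Double.parseDouble(f[2].trim()));
    }

    /** Streams a gzip-compressed UTF-8 edges file into memory, skipping empty lines. */
    static List<Edge> readEdges(Path edgesGz) throws IOException {
        List<Edge> edges = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(edgesGz)),
                StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    edges.add(parseLine(line));
                }
            }
        }
        return edges;
    }
}
```

Because there is no header row, the first line is already data; a reader that expects a header would silently drop one edge.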

Invoke from the command line:


 java --module-path <...> --module org.jgrapht.osm/org.jgrapht.osm.GpkgRoadGraphPreprocessor \
     /path/to/region.gpkg \
     /path/to/region-edges.csv.gz
 

or programmatically via run(Path, Path) from any test or application code.

GPKG schema assumptions

Geofabrik free-tier extracts (e.g. https://download.geofabrik.de/europe/andorra-latest-free.gpkg.zip) ship a gis_osm_roads_free table whose columns include fclass, oneway (B / F / T, default B) and geom (GPKG-wrapped WKB LINESTRING). The preprocessor keeps only rows whose fclass is in the routable whitelist (motorway through service; see ROUTABLE_FCLASSES), parses each line-string into directed edges, computes strongly-connected components with KosarajuStrongConnectivityInspector (Kosaraju's algorithm, which yields the same components as Tarjan's), keeps the largest component, deduplicates parallel edges by keeping the shortest, and writes the result.
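Three of the steps above lend themselves to a short illustration: expanding a segment into directed edges from its oneway flag, weighting it by Haversine distance, and keeping only the shortest of several parallel edges. The sketch below is not the class's actual implementation. RoadEdgeSketch and its methods are hypothetical names, the Earth radius is an assumed constant (the source does not state which one is used), and the reading of the flags (F = along the line-string direction, T = against it, B or anything else = both) is an assumption based on Geofabrik's published format description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helpers illustrating the edge-building steps; not the documented API. */
final class RoadEdgeSketch {

    // Assumed mean Earth radius in metres; the original does not specify a value.
    static final double EARTH_RADIUS_M = 6_371_008.8;

    /** Great-circle distance in metres between two points given in decimal degrees. */
    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double sinHalfDPhi = Math.sin(Math.toRadians(lat2 - lat1) / 2);
        double sinHalfDLambda = Math.sin(Math.toRadians(lon2 - lon1) / 2);
        double a = sinHalfDPhi * sinHalfDPhi
                + Math.cos(phi1) * Math.cos(phi2) * sinHalfDLambda * sinHalfDLambda;
        // Clamp guards against rounding pushing sqrt(a) slightly above 1.
        return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /** Expands one undirected segment (a, b) into directed {from, to} pairs. */
    static List<int[]> directedPairs(String oneway, int a, int b) {
        List<int[]> out = new ArrayList<>();
        if ("F".equals(oneway)) {
            out.add(new int[] {a, b});          // assumed: forward only
        } else if ("T".equals(oneway)) {
            out.add(new int[] {b, a});          // assumed: reverse only
        } else {
            out.add(new int[] {a, b});          // B (the default): both directions
            out.add(new int[] {b, a});
        }
        return out;
    }

    /** Packs a directed (src, dst) pair into a single map key. */
    static long edgeKey(int src, int dst) {
        return ((long) src << 32) | (dst & 0xFFFFFFFFL);
    }

    /** Records an edge, keeping the smaller weight if (src, dst) was already seen. */
    static void keepShortest(Map<Long, Double> best, int src, int dst, double weightM) {
        best.merge(edgeKey(src, dst), weightM, (x, y) -> Math.min(x, y));
    }
}
```

Keying the deduplication map on the ordered (src, dst) pair means that a two-way street still yields two entries, one per direction, which matches the "one directed edge per line" output format.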

The class depends only on the GPKG / SQLite layout, not on any specific region, so it works against any free-tier Geofabrik download.

Author:
Shai Eilat
  • Field Details

    • ROUTABLE_FCLASSES

      public static final Set<String> ROUTABLE_FCLASSES
      Geofabrik fclass values considered routable road segments.
    • COORD_PRECISION

      public static final double COORD_PRECISION
      Multiplier for snapping (lon, lat) pairs to integer keys before deduping shared endpoints between adjacent line-strings. A multiplier of 1e7 gives a key resolution of 1e-7 degrees, roughly 1.1 cm at the equator: coarse enough to merge the endpoints of touching segments, yet fine enough to keep nearby intersections distinct.
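As an illustration of the snapping idea, here is a minimal sketch that rounds each coordinate to an integer, packs the pair into one long, and hands out consecutive vertex ids per distinct key. CoordKeySketch, snapKey and vertexId are hypothetical names, and both the packing scheme and the id assignment are assumptions; the source only states that coordinates are multiplied by COORD_PRECISION and snapped to integer keys.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical endpoint deduplicator; not the documented implementation. */
final class CoordKeySketch {

    // Same value as the documented constant: 1e-7 degree resolution.
    static final double COORD_PRECISION = 1e7;

    private final Map<Long, Integer> idsByKey = new HashMap<>();

    /** Snaps (lon, lat) to integers and packs them into a single long key. */
    static long snapKey(double lon, double lat) {
        // |lon * 1e7| <= 1.8e9 and |lat * 1e7| <= 0.9e9, so each fits in 32 signed bits.
        long x = Math.round(lon * COORD_PRECISION);
        long y = Math.round(lat * COORD_PRECISION);
        return (x << 32) | (y & 0xFFFFFFFFL);
    }

    /** Returns the id for this coordinate, assigning the next free id on first sight. */
    int vertexId(double lon, double lat) {
        long key = snapKey(lon, lat);
        Integer id = idsByKey.get(key);
        if (id == null) {
            id = idsByKey.size();
            idsByKey.put(key, id);
        }
        return id;
    }
}
```

Under this sketch, two line-strings whose endpoints differ only by floating-point noise below about half of 1e-7 degrees usually resolve to the same vertex id, which is what makes adjacent segments connect in the resulting graph.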
  • Method Details

    • main

      public static void main(String[] args) throws Exception
      CLI entry point.
      Parameters:
      args - two arguments: <input.gpkg> and <output-edges.csv.gz>
      Throws:
      Exception - on I/O, SQL, or parse failure
    • run

      public static GpkgRoadGraphPreprocessor.Result run(Path gpkgPath, Path edgesOutPath) throws IOException, SQLException
      Library entry point.
      Parameters:
      gpkgPath - path to the input Geofabrik GPKG
      edgesOutPath - destination for the edges CSV; the nodes CSV is written alongside with suffix .nodes.csv.gz
      Returns:
      summary statistics
      Throws:
      IOException - on I/O failure
      SQLException - on GPKG / SQLite read failure