
I haven't been able to find a solution with the source, so I am trying to work on the output file. I have also asked this question on SO.

I have a large (200MB) GeoJSON file that contains many complex polygons and multipolygons. A heavily truncated example is at https://gist.github.com/jinky32/81f61e1fc118822ba103?short_path=d16949b

As you can see, this file is composed of polygons and multipolygons that have a String property of either 1 or 2. Below is an example of how these shapes look on mapshaper.org when highlighting a multipolygon of either value in the same tile (roughly 90% or more of this tile is covered by a multipolygon with one value or the other).

Same tile with different values selected

I do not need to differentiate between these values, so polygons and multipolygons with a String value of either 1 or 2 can be merged together, which I hope will reduce the file size.

Can anyone advise how I can achieve this, preferably with a CLI tool?

Stuart Brown
  • You are hoping to reduce the file size by removing the "String": "1" or "String": "2" from "properties": { "Float": -118.000000, "String": "2" or are you saying you can drop the entire feature if there is both a string 1 or string 2 with the same geometry? – John Powell Mar 15 '16 at 09:48
  • Thanks @JohnBarça I'm saying that polygons / multipolygons with value 1 or 2 that share a boundary can be merged together because I do not need to distinguish between them. I hope that this will also save me some file size – Stuart Brown Mar 15 '16 at 10:05
  • That share a boundary. In that case, have you looked into TopoJSON? There is a nodejs package too. – John Powell Mar 15 '16 at 10:07
  • Thanks. I had looked at topojson.merge based on the US state example. I'll give it a go. I seem to recall my laptop ran out of memory.... – Stuart Brown Mar 15 '16 at 10:23
  • Have you checked this question? - http://gis.stackexchange.com/questions/149959/dissolve-polygons-based-on-attributes-with-python-shapely-fiona – sema Mar 15 '16 at 11:10

1 Answer


That's easy to do with ogr2ogr (http://www.gdal.org/ogr2ogr.html) and the GDAL SQLite dialect (http://www.gdal.org/ogr_sql_sqlite.html).

An example using your sample data:

ogr2ogr -f "GeoJSON" -dialect sqlite -sql "select st_union(geometry) as geometry from OGRGeoJSON where string in ('1','2')" gj_union_test.json geojsontest.json

Check the result with ogrinfo:

ogrinfo gj_union_test.json -al -so
INFO: Open of `gj_union_test.json'
      using driver `GeoJSON' successful.

Layer name: OGRGeoJSON
Geometry: Multi Polygon
Feature Count: 1
Extent: (50600.010000, 301849.995000) - (653900.010000, 576205.560000)
Layer SRS WKT:
GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.0174532925199433,
        AUTHORITY["EPSG","9122"]],
    AUTHORITY["EPSG","4326"]]

As you can see, there is now only one MultiPolygon feature. Also note that if your GeoJSON doesn't use WGS84 coordinates, you should add the CRS object http://geojson.org/geojson-spec.html#coordinate-reference-system-objects.
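The extent reported above is clearly not in degrees, so the data is probably in a projected system. As a minimal sketch of adding the named CRS object described in the linked spec, the following uses only the Python standard library. The helper name add_named_crs is hypothetical, and the EPSG code 27700 is purely an assumption for illustration; use whatever code actually matches your data.

```python
import json


def add_named_crs(collection: dict, epsg_code: int) -> dict:
    """Return a copy of a GeoJSON FeatureCollection with a named "crs" member.

    The member follows the "named CRS" form of the 2008 GeoJSON spec.
    epsg_code is supplied by the caller; nothing here detects it from the data.
    """
    tagged = dict(collection)  # shallow copy: features are shared, not modified
    tagged["crs"] = {
        "type": "name",
        "properties": {"name": f"urn:ogc:def:crs:EPSG::{epsg_code}"},
    }
    return tagged


if __name__ == "__main__":
    # Tiny stand-in for the real file; 27700 is an assumed code, not a known fact.
    sample = {"type": "FeatureCollection", "features": []}
    print(json.dumps(add_named_crs(sample, 27700), indent=2))
```

For a 200MB file, json.load will hold the whole document in memory, so this is only practical if the machine has enough RAM.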

user30184
  • thanks @user30184 I ran that but get an essentially empty file: { "type": "FeatureCollection", "features": [ { "type": "Feature", "properties": { "geometry": null }, "geometry": null } ] }. I did get an error: GEOS error: TopologyException: Input geom 1 is invalid: Self-intersection at or near point 437499.98999999999 376200 at 437499.98999999999 376200 – Stuart Brown Mar 15 '16 at 12:16
  • running info returned Layer name: OGRGeoJSON Geometry: Unknown (any) Feature Count: 1 Layer SRS WKT: GEOGCS["WGS 84", DATUM["WGS_1984", SPHEROID["WGS 84",6378137,298.257223563, AUTHORITY["EPSG","7030"]], AUTHORITY["EPSG","6326"]], PRIMEM["Greenwich",0, AUTHORITY["EPSG","8901"]], UNIT["degree",0.0174532925199433, AUTHORITY["EPSG","9122"]], AUTHORITY["EPSG","4326"]] geometry: String (0.0) i.e. no Extent value – Stuart Brown Mar 15 '16 at 12:17
  • With the data you put into https://gist.github.com/jinky32/81f61e1fc118822ba103?short_path=d16949b Odd, because I used just that in my test after saving the data on disk with name "geojsontest.json". – user30184 Mar 15 '16 at 12:19
  • OK, so you have a topology error in the data. Fix the errors or skip faulty geometries by adding AND ST_IsValid=1 into the SQL. If that does not help, it may be that the ridiculous picometer-level precision of the coordinates (http://www.simetric.co.uk/siprefix.htm), as in 437499.98999999999, is causing trouble. – user30184 Mar 15 '16 at 12:23
  • thanks again. Updating query to "select st_union(geometry) as geometry from OGRGeoJSON where string in ('1','2') AND ST_IsValid=1" gives an error no such column: ST_IsValid. Re fixing the errors, is there a makeValid or somesuch option? – Stuart Brown Mar 15 '16 at 12:31
  • Sorry, the syntax is naturally ST_IsValid(geometry). You can check syntax of Spatialite functions from https://www.gaia-gis.it/gaia-sins/spatialite-sql-latest.html as well as which functions exist. Notice that GDAL may not have the latest Spatialite and some functions may be missing. – user30184 Mar 15 '16 at 13:16
  • thanks for all your help @user30184! We are trying to run ogr2ogr -f "GeoJSON" -dialect sqlite -sql "select ST_UNION(SanitizeGeometry(geometry)) as geometry from OGRGeoJSON where String in ('1','2')" LTE800-indoor-union.json LTE800-indoor-G100.json i.e. using SanitizeGeometry to try and fix the faulty geometry but still get the same issue. Should we try some other function to fix? – Stuart Brown Mar 15 '16 at 14:24
  • also running ogr2ogr -f "GeoJSON" -dialect sqlite -sql "select ST_UNION(ST_IsValid(geometry)) as geometry from OGRGeoJSON where String in ('1','2') AND ST_IsValid(geometry) = 1" LTE800-indoor-union.json 1800-indoor96.geojson produces a few errors such as GEOS warning: Self-intersection at or near point 438300 376200 and then results in an empty file – Stuart Brown Mar 15 '16 at 14:30
  • I suppose that you used MakeValid in the first place because IsValid would not work there. Anyway this is going off-topic. You know what to do with proper data. Make another question about how to fix the topology of GeoJSON. – user30184 Mar 15 '16 at 15:10
  • agreed. I have marked yours as the answer – Stuart Brown Mar 15 '16 at 15:19
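The comments above suggest that the excessive coordinate precision (values such as 437499.98999999999) may be part of the problem, and it certainly adds to the file size. A minimal sketch of trimming that precision with the Python standard library might look like the following. The function names and the default of two decimal places are assumptions for illustration, not something proposed in the thread, and rounding does not repair self-intersections (it can even introduce new ones in very narrow shapes).

```python
import json


def round_coords(node, ndigits=2):
    """Recursively round every number in a nested GeoJSON coordinates array."""
    if isinstance(node, (int, float)):
        return round(node, ndigits)
    return [round_coords(child, ndigits) for child in node]


def reduce_precision(collection, ndigits=2):
    """Round the coordinates of every feature in place and return the collection.

    Features with a null geometry, or geometries without a "coordinates"
    member (e.g. GeometryCollection), are left untouched.
    """
    for feature in collection["features"]:
        geometry = feature.get("geometry")
        if geometry and "coordinates" in geometry:
            geometry["coordinates"] = round_coords(geometry["coordinates"], ndigits)
    return collection


if __name__ == "__main__":
    sample = {
        "type": "FeatureCollection",
        "features": [{
            "type": "Feature",
            "properties": {"String": "1"},
            "geometry": {"type": "Point",
                         "coordinates": [437499.98999999999, 376200]},
        }],
    }
    # Compact separators also shave some bytes off the written file.
    print(json.dumps(reduce_precision(sample), separators=(",", ":")))
```

The rounded file can then be passed to the same ogr2ogr command as before.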