4

I'm looking for a process to convert ASCII gridded data (in this case 60min/1 degree gridded ASCII population data (GPW) from SEDAC: https://sedac.ciesin.columbia.edu/data/set/gpw-v4-population-count-adjusted-to-2015-unwpp-country-totals-rev11/data-download) to a CSV with three columns: lon, lat, and value using open source tools.

This is the data header:

ncols         360
nrows         180
xllcorner     -180
yllcorner     -90
cellsize      1.0000000000001
NODATA_value  -9999

so the data is structured as space-separated values, with each value occupying a position within 360 columns and 180 rows that represents its coordinates via that position:

-9999 -9999 -9999 -9999 -9999 -9999 0.000936705 0.002529013 0.001377391 0.001723122 0.0004472999 ... 

I only want to include positive values in the lon/lat CSV.

I've been using GDAL for other tasks and have been looking in the documentation, but I'm not seeing this type of data manipulation in its scope.

snowman2
  • 7,321
  • 12
  • 29
  • 54
interwebjill
  • 307
  • 3
  • 16

2 Answers2

10

You can do this with rioxarray:

Note: I used data for the column name as values is a pandas specific attribute.

import rioxarray
import pandas

rds = rioxarray.open_rasterio(
    "gpw_v4_population_count_adjusted_to_2015_unwpp_country_totals_rev11_2020_1_deg.tif",
)
rds = rds.squeeze().drop("spatial_ref").drop("band")
rds.name = "data"
df = rds.to_dataframe().reset_index()
df[df.data>=0.0].to_csv("out.csv", index=False)
snowman2
  • 7,321
  • 12
  • 29
  • 54
  • This solution backtracked on my process a bit by using the GeoTiff downloaded from SEDAC instead of the ASCII file. It's a great solution...much better than the brute force python method I was writing for the ASCII file. I wonder if I should change the title of the question to make it more general so others can find it. – interwebjill Apr 12 '20 at 20:01
  • Glad to hear that it worked - it should work the same with an ASCII file. – snowman2 Apr 12 '20 at 21:45
  • Title could be: Convert raster to CSV with lat, lon, and value columns – snowman2 Apr 12 '20 at 21:47
  • Will rioxarray translate between raster and geojson point features as well? I didn't see that in the documentation. On a lark I tried df[df.data>0.0].to_geojson("out.geojson", index=False) (didn't work). – interwebjill Apr 13 '20 at 01:31
  • This may help: https://gis.stackexchange.com/questions/220997/pandas-to-geojson-multiples-points-features-with-python – snowman2 Apr 13 '20 at 01:34
  • There's also geopandas: https://geopandas.org/gallery/create_geopandas_from_pandas.html#creating-a-geodataframe-from-a-dataframe-with-coordinates – snowman2 Apr 13 '20 at 01:38
  • Then, to geojson: https://geopandas.org/io.html#writing-spatial-data – snowman2 Apr 13 '20 at 01:40
  • Wow! super helpful, certainly saved me couple of days – Hrushi Sep 25 '22 at 11:23
3

Even easier using GDAL from the command line, nothing to install!

  1. Create a text file with your longitudes and latitudes, call it xy.txt:
-7, 41
-0.5, 51.4
  1. Create the header of your shiny CSV file
echo Longitude,Latitude,Pop > final.csv
  1. Then pipe this through to gdallocationinfo:
cat xy.txt | tr -d , | gdallocationinfo -wgs84 -valonly \
           gpw_v4_population_count_rev11_2010_2pt5_min.tif > z.txt
  1. Paste xy.txt and z.txt together and append to output final.csv
paste -d,  xy.txt z.txt >> final.csv

You can then use ogr2ogr to convert the CSV to GeoJSON (see here)

ogr2ogr -f GeoJSON out.geojson final.csv \
         -oo X_POSSIBLE_NAMES=Longitude  \
         -oo Y_POSSIBLE_NAMES=Latitude \
         -oo KEEP_GEOM_COLUMNS=NO

From ogrinfo -al out.geojson I get:

INFO: Open of `out.geojson'
      using driver `GeoJSON' successful.

Layer name: final
Geometry: Point
Feature Count: 2
Extent: (-7.000000, 41.000000) - (-0.500000, 51.400000)
Layer SRS WKT:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]
Data axis to CRS axis mapping: 2,1
Pop: String (0.0)
OGRFeature(final):0
  Pop (String) = 4.11418008804321
  POINT (-7 41)

OGRFeature(final):1
  Pop (String) = 12097.869140625
  POINT (-0.5 51.4)

Jose
  • 3,332
  • 20
  • 21