
I am using Fiona to load a set of links. Then I put every record in the rows of a pandas.DataFrame. Then I get the centroid of each LineString using Shapely. Afterwards I copy the schema of the original links.shp and change its geometry to 'Point'. Then I write the records from the DataFrame, changing only the coordinates. Take a look:

import pandas as pd
import fiona
from shapely.geometry import LineString
def links_centroider(links, name_for_centroids):
    collection = fiona.open(links)
    records = []
    for i in range(len(collection)):
        records.append( next(collection))
    collection.close()
    geoDF = pd.DataFrame({'type': [i['geometry']['type'] for i in records],
                          'properties': [i['properties'] for i in records],
                          'coordinates': [i['geometry']['coordinates'] for i in records]},
                         index = [i['id'] for i in records])
    geoDF['centroid'] = geoDF.coordinates.apply(lambda x: LineString(x).centroid.coords[:][0])
    with fiona.open(links) as source:
        source_driver = source.driver
        source_crs = source.crs
        source_schema = source.schema
        source_schema['geometry'] = 'Point'
    with fiona.open(name_for_centroids,
                    'w',
                    driver=source_driver,
                    crs=source_crs,
                    schema=source_schema) as collection:
        print(len(collection))
        for i in geoDF.index:
            a_record = {'geometry':{'coordinates':geoDF.loc[i].centroid,
                                    'type': 'Point'},
                        'id': str(i),
                        'properties': geoDF.loc[i].properties,
                        'type': 'Feature'}
            collection.write(a_record)
        print(len(collection))
    print(collection.closed)

The output shapefile has no info for string properties. For numbers in those properties I get mostly zeros. In other words, values in the fields of the original attribute table of links.shp are not being written to the centroids.shp I want to get.

Any idea about how to solve this?

Jaqo

1 Answer


Then I put every record in the rows of a pandas.DataFrame

Why?

If you only want to copy the original attributes (LineString) to the new shapefile (Points), after computing the centroid, you don't need Pandas:

import fiona
from shapely.geometry import shape, mapping 
with fiona.open("polyline.shp") as input:
    # change only the geometry of the schema: LineString -> Point
    input.schema['geometry'] = "Point"
    # write the Point shapefile
    with fiona.open('centroid.shp', 'w', 'ESRI Shapefile', input.schema.copy(), input.crs) as output:
        for elem in input:
            # GeoJSON to shapely geometry
            geom = shape(elem['geometry'])
            # shapely centroid to GeoJSON
            elem['geometry'] = mapping(geom.centroid)
            output.write(elem)
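
To make the idea concrete without any GIS dependency, here is a minimal pure-Python sketch of what happens to a single record: the properties are copied unchanged and only the geometry is swapped for the centroid. The helper names `linestring_centroid` and `to_centroid_feature` and the sample record are hypothetical, written for illustration only. The centroid is computed as the length-weighted mean of the segment midpoints, which is how a line centroid is usually defined; Python 3 and 2D coordinates are assumed.

```python
import math


def linestring_centroid(coords):
    """Length-weighted centroid of a polyline given as a list of (x, y) tuples."""
    total = 0.0
    cx = 0.0
    cy = 0.0
    for p, q in zip(coords, coords[1:]):
        seg = math.hypot(q[0] - p[0], q[1] - p[1])
        # Each segment contributes its midpoint, weighted by its length.
        cx += seg * (p[0] + q[0]) / 2.0
        cy += seg * (p[1] + q[1]) / 2.0
        total += seg
    if total == 0.0:
        # Zero-length line: all vertices coincide, so return the first one.
        return (coords[0][0], coords[0][1])
    return (cx / total, cy / total)


def to_centroid_feature(feature):
    """Return a new GeoJSON-like feature with the same id and properties
    but a Point geometry located at the centroid of the input line."""
    return {
        "type": "Feature",
        "id": feature.get("id"),
        "properties": dict(feature["properties"]),  # attributes copied as-is
        "geometry": {
            "type": "Point",
            "coordinates": linestring_centroid(feature["geometry"]["coordinates"]),
        },
    }


# Hypothetical sample record, shaped like the records Fiona yields.
link = {
    "type": "Feature",
    "id": "0",
    "properties": {"name": "Main St", "lanes": 2},
    "geometry": {
        "type": "LineString",
        "coordinates": [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0)],
    },
}

point = to_centroid_feature(link)
print(point["properties"])
print(point["geometry"])
```

In real code, `shape(...)` and `.centroid` from Shapely do this work; the sketch only shows that the attribute dictionary travels with the record untouched.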

If you absolutely want to use Pandas, use GeoPandas, which combines Pandas, Fiona and Shapely.

import geopandas as gp
input = gp.read_file('polyline.shp')
print(type(input))
# <class 'geopandas.geodataframe.GeoDataFrame'> -> a GeoDataFrame
print(input['geometry'])
# 0  LINESTRING (266351.05107 161433.039507, 266362...
# ....
# only change the geometry of the dataframe
input['geometry'] = input['geometry'].centroid
print(input['geometry'])
# 0    POINT (266369.1881962401 161457.6017265563)
# ....
# save resulting shapefile
input.to_file("centroids.shp")
gene
  • Thanks @gene. Using pandas has become a habit. I prefer to do operations this way rather than looping. It also allows me to do several other things, such as database-like operations. I am working with networks, and the number of features is generally large. Perhaps I am just avoiding PostGIS. You are right though: for the main purpose of this question, pandas is not essential. In your example, when you refer to c.schema.copy() and c.crs), do you mean input.schema.copy() and input.crs()? – Jaqo Oct 16 '14 at 16:01
  • Yes, sorry, a mistake; corrected. I have added another solution with GeoPandas (= Geospatial Pandas) without "apparent" looping. – gene Oct 16 '14 at 16:16
  • Could it be that the opening of the parenthesis in input.crs() is still missing? – Jaqo Oct 16 '14 at 16:22
  • Also, I have the impression that the term input might be reserved for something else in Python. Your suggestion is clear, but naming the input collection differently, like input_shp, will lead to an even better answer for people reading this post in the future. – Jaqo Oct 16 '14 at 16:31
  • It is fiona.open('centroid.shp', ..., input.crs), and the name of the variable input cannot be confused with the function input() of Python 3.x (see "What's the difference between raw_input() and input() in python3.x?"). – gene Oct 16 '14 at 16:41
  • And in Python 2.x, input = ... is different from s = input(), and the name is not reserved. – gene Oct 16 '14 at 16:55
  • Thanks for your support @gene. Your lean fiona + shapely implementation did the work. I have not tried geopandas since I have not installed it... yet. – Jaqo Oct 16 '14 at 17:03
  • Just one more thing: I initially posted this question in Stack Overflow, and it is still there without an answer. What should I do? – Jaqo Oct 16 '14 at 17:07
  • A link to this post? – gene Oct 16 '14 at 17:35
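
On the naming question raised in the comments: a quick way to check whether a name such as `input` is reserved is to ask the standard library. The sketch below assumes Python 3 (where the module is called `builtins`); the function `demo` is hypothetical and only illustrates that assigning to `input` shadows the built-in inside that scope without breaking it elsewhere.

```python
import builtins
import keyword

# "input" is not a reserved keyword, so it may legally be used as a variable name.
print(keyword.iskeyword("input"))

# It is, however, the name of a built-in function.
print(hasattr(builtins, "input"))


def demo():
    # Inside this function, the local name shadows the built-in.
    input = "polyline.shp"
    return input


shadowed = demo()

# Outside the function, the built-in is still reachable and callable.
still_callable = callable(builtins.input)
print(shadowed, still_callable)
```

So the answer's code is valid as written; a name like `input_shp` is simply a readability choice that avoids hiding the built-in.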