2

I am trying to intersect some points with a polygon in GeoPandas.

First I create a Point GeoDataFrame from a text file.

import geopandas as gp
import numpy as np
import pandas as pd
from shapely.geometry import Point, Polygon

tile_index = pd.read_csv('../tile_index.dat', names=['tile', 'x', 'y'], sep=' ')
geometry = [Point(r.x, r.y) for ix, r in tile_index.iterrows()]
tile_index = gp.GeoDataFrame(tile_index, geometry=geometry)
tile_index = tile_index[(tile_index.x.between(-10, 30)) & (tile_index.y.between(-10, 30))]
print tile_index.head()

Which produces:

     tile      x       y                                        geometry
202   226 -9.226  -0.724  POINT (-9.226000000000001 -0.7240000000000001)
203   227 -9.226   9.276                POINT (-9.226000000000001 9.276)
204   228 -9.226  19.276               POINT (-9.226000000000001 19.276)
205   229 -9.226  29.276               POINT (-9.226000000000001 29.276)
219   243  0.774  -0.724               POINT (0.774 -0.7240000000000001)

Then I create a Polygon to intersect:

poly = Polygon([(0, 0), (0, 20), (20, 20), (20, 0)])
P = gp.GeoDataFrame(data=[1, poly]).T
P.columns = ['ID', 'geometry']
print np.any(tile_index.within(P))

Which returns False but at least 4 points should be within.

enter image description here

If I create a point using the coordinates of point 237 it does intersect.

pt = Point(10.774, 9.276)
p = gp.GeoDataFrame([[1, pt]])
p.columns = ['ID', 'geometry']
print p.within(P)

Which returns True.

If I do P.contains(tile_index.geometry) then all are False but if I do np.any([P.contains(row.geometry)[0] for ix, row in tile_index.iterrows()]) then this returns True.

user2856
  • 65,736
  • 6
  • 115
  • 196
kungphil
  • 373
  • 5
  • 18

2 Answers2

6

geopandas aligns frames on the index, and then performs within or contains element-wise. There's a fuller description in this issue, the most relevant part of which is the example excerpted here:

>>> p1 = shapely.geometry.Polygon([(0,0), (1,0), (1,1), (0,1)])

>>> p2 = shapely.geometry.Polygon([(1,1), (2,1), (2,2), (1,2)])  
>>> p1.intersects(p2)
True
>>> s1 = geopandas.GeoSeries([p1])
>>> s2 = geopandas.GeoSeries([p2])
>>> s1.intersects(s2)
0    True
dtype: bool

# giving it another index, s1 and s2 will now first align (and in this case return False
# for each index, as there are no polygons that intersect and have matching index)
>>> s2 = geopandas.GeoSeries([p2], index=[1])
>>> s1.intersects(s2)
0    False
1    False
dtype: bool

This behavior is definitely not evident, and would be great to document more clearly in geopandas. That said, depending on your application, a spatial join might be the operation you're looking for.

jdmcbr
  • 904
  • 1
  • 6
  • 10
1

@jdmcbr's explanation helped me solve the issue but I did so in a different way because my input data were lines, not points. A spatial join would not have been sufficient because I wanted to know which lines were totally contained by the polygon, not just which ones intersected.

I used a lambda function, applying over the lines GeoDataFrame. This will probably only work if you have a single polygon (or multipolygon) in your geodataframe, to get this you can use GeoDataFrame.dissolve()

# lines_gdf = GeoDataFrame containing lines
# poly_gdf = GeoDataFrame containing a polygon or multipolygon

lines_gdf["within"] = lines_gdf["geometry"].apply(lambda geom: poly_gdf["geometry"].contains(geom)) selected_gdf = lines_gdf[lines_gdf["within"]]

wfgeo
  • 3,538
  • 2
  • 24
  • 47