
Filter A GeoPandas Dataframe Within A Polygon And Remove From The Dataframe The Ones Who Are Not There

I have a .csv file which contains some points (longitude, latitude). I converted it to a DataFrame and from DataFrame to a GeoDataFrame with this code: CSV file: Date;User ID;Longi

Solution 1:

The spatial predicate within is needed to identify whether a point geometry is located inside a polygon geometry. The code below performs all the steps needed to identify the points that fall within a polygon (here, Ecuador). As a final step, a plot is created to visualize and check the result.

import pandas as pd
import geopandas
from shapely.geometry import Point  #Polygon

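df = pd.read_csv('ecuador_data.csv', sep=';', low_memory=False, decimal='.')
# note: geopandas.datasets was removed in GeoPandas 1.0; on newer versions,
# download the Natural Earth countries file and pass its path to read_file()
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
ecuador = world[world.name == 'Ecuador']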

# add new column to df
df['withinQ'] = ""

withinQlist = []
for lon,lat in zip(df.Longitude, df.Latitude):
    pt = Point(lon, lat)
    withinQ = pt.within(ecuador['geometry'].values[0])
    #print( withinQ )
    withinQlist.append(withinQ)

# update the values in that column, values: True/False
df['withinQ'] = withinQlist

# uncomment next line to see content of `df`
#print(df)

#          Date  User_ID  Longitude  Latitude  withinQ
# 0  2020-01-02   824664   -79.8832   -2.1811     True
# 1  2020-03-01   123456    80.8832    2.1811    False
# 2  2020-01-15   147835   -80.7804   -1.4845     True

# select points within ecuador, assign to `result_df` dataframe
result_df = df[df.withinQ]
# select points outside ecuador, assign to `xresult_df` dataframe
xresult_df = df[~df.withinQ]

# for checking/visualization, create a plot of relevant geometries
ax1 = ecuador.plot(color='pink')
ax1.scatter(result_df.Longitude, result_df.Latitude, s=50, color='green')
#ax1.scatter(xresult_df.Longitude, xresult_df.Latitude, s=30, color='red')

The plot:

[Figure: Ecuador drawn in pink, with the points that fall inside it in green]

For the resulting dataframe result_df, its content will look like this:

         Date  User_ID  Longitude  Latitude  withinQ
0  2020-01-02   824664   -79.8832   -2.1811     True
2  2020-01-15   147835   -80.7804   -1.4845     True

Solution 2:

For future reference, you can use the documentation in this link; I found it very helpful!

The process you are looking for is called Point in Polygon (PIP) and, as the other answer mentions, you can use the .within() method.
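To make the idea of a point-in-polygon test concrete, here is a minimal pure-Python sketch of the classic ray-casting approach. It is only an illustration of the concept, not how shapely or GeoPandas implement .within(); the function name point_in_polygon and the rectangular ring (a rough box around Ecuador, not its real border) are assumptions made for the example, and points lying exactly on the boundary are not treated carefully.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the polygon given by `ring`?

    `ring` is a list of (x, y) vertices; the last vertex is implicitly
    connected back to the first one.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > lat) != (y2 > lat):
            # x-coordinate at which this edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            # Count the crossing if it lies to the right of the point.
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical rectangle roughly covering Ecuador (NOT the real border).
rough_box = [(-81.5, -5.0), (-75.0, -5.0), (-75.0, 1.5), (-81.5, 1.5)]

in_box = point_in_polygon(-79.8832, -2.1811, rough_box)
out_of_box = point_in_polygon(80.8832, 2.1811, rough_box)
```

An odd number of crossings means the point is inside, an even number means it is outside. In practice you would let shapely or GeoPandas do this for you, since they also handle holes, multipolygons and boundary cases.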

Now, with what you already have I would do:

#find point in polygon
#code below returns a series with boolean values
#if value is True it means the point in that index location is within the polygon we are evaluating

pip = gdf.within(ec.loc[0, 'geometry'])

#creating a new geoDataFrame that will have only the intersecting records

ec_gdf = gdf.loc[pip].copy()

#resetting index(optional step if you don't need to keep the original index values)
ec_gdf.reset_index(inplace=True, drop=True)
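Because gdf and ec come from the question's code, which is not reproduced here, the following is a self-contained sketch of the same vectorized filter. The sample rows are the three points shown in the first answer; the polygon is a hypothetical bounding box standing in for Ecuador's real geometry, and the names area, mask, inside and outside are made up for the example.

```python
import pandas as pd
import geopandas
from shapely.geometry import box

# Sample data: the three points used in the first answer.
df = pd.DataFrame({
    "Date": ["2020-01-02", "2020-03-01", "2020-01-15"],
    "User_ID": [824664, 123456, 147835],
    "Longitude": [-79.8832, 80.8832, -80.7804],
    "Latitude": [-2.1811, 2.1811, -1.4845],
})

# Build point geometries from the longitude/latitude columns.
gdf = geopandas.GeoDataFrame(
    df,
    geometry=geopandas.points_from_xy(df.Longitude, df.Latitude),
    crs="EPSG:4326",
)

# Assumption: a rough rectangle around Ecuador, used instead of the real border.
area = box(-81.5, -5.0, -75.0, 1.5)

# One boolean per row: True where the point lies within the polygon.
mask = gdf.within(area)

inside = gdf[mask].reset_index(drop=True)    # rows to keep
outside = gdf[~mask].reset_index(drop=True)  # rows that were filtered out
```

With a real country polygon you would pass that geometry to within() in place of area; the rest of the filtering stays the same.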

