Python geopandas - replace append with concat

Question

Python-beginner here. I'm trying to fix a deprecation warning in:

df = gpd.GeoDataFrame(columns=['location', 'geometry'])
for dir, subdir, files in os.walk(StartDir):
    for fname in files:
        if fname.endswith(".tif"):
            df = df.append({'location': fname, 'geometry': getBounds(os.path.join(dir+"/", fname))}, ignore_index=True)

by replacing the append line with:

df = gpd.pd.concat(df,{'location': fname, 'geometry': getBounds(os.path.join(dir+"/", fname))}, ignore_index=True)

which leads to this error message:

TypeError: first argument must be an iterable of pandas objects, you passed an object of type "GeoDataFrame"

What am I missing?

concat is expecting a list of (geo)dataframes as the first argument. See the docs about creating a GeoDataframe from your data. This GIS SE thread might also be useful. — Matt, Oct 14 '22 at 23:48
Please note the "Python-beginner here" in my question. I'm still struggling especially with data types and their names. — Stefan Gofferje, Oct 15 '22 at 08:08

Matt · Accepted Answer · 2022-10-15T11:00:41.243

In order for concat to work, its first argument must be a list of dataframes. You can make a temporary dataframe in each iteration of your loop from the dictionary {} you already have by passing it to the pd.DataFrame constructor, like so:

import os
import geopandas as gpd

df = gpd.GeoDataFrame(columns=['location', 'geometry'])
for dir, subdir, files in os.walk(StartDir):
    for fname in files:
        if fname.endswith(".tif"):
            # create a temporary df with the desired values; it is necessary to specify the index
            df_to_append = gpd.pd.DataFrame({'location': fname, 'geometry': getBounds(os.path.join(dir+"/", fname))}, index=[0])

            # here the temporary dataframe is appended to the original dataframe each iteration of the loop
            # by passing a list [] of the original dataframe and the temporary one to `pd.concat()`
            # importantly, the index is now ignored to renumber each row sequentially
            df = gpd.pd.concat([df, df_to_append], ignore_index=True)
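
A possible alternative to concatenating inside the loop is to collect one dictionary per file in a plain list and build the frame once at the end. The sketch below is only an illustration under assumptions: it uses plain pandas so it runs without geopandas, `get_bounds` is a hypothetical stand-in for the question's `getBounds`, and `collect_tifs` is a name made up for this example. With geopandas installed, passing the same list to `gpd.GeoDataFrame` should work the same way, but that is not tested here.

```python
import os
import tempfile

import pandas as pd


def get_bounds(path):
    # Hypothetical stand-in for the question's getBounds(); the real one
    # presumably returns a geometry describing the raster's extent.
    return f"bounds of {os.path.basename(path)}"


def collect_tifs(start_dir):
    rows = []  # plain Python list of dicts, one per matching file
    for dirpath, _subdirs, files in os.walk(start_dir):
        for fname in files:
            if fname.endswith(".tif"):
                rows.append({
                    "location": fname,
                    "geometry": get_bounds(os.path.join(dirpath, fname)),
                })
    # Build the frame once; passing columns keeps both columns present
    # even when no file matched.
    return pd.DataFrame(rows, columns=["location", "geometry"])


# Small demonstration on a throwaway directory with two .tif files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.tif", "b.tif", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    df = collect_tifs(tmp)

print(sorted(df["location"]))
```

Building the frame once avoids copying the growing dataframe on every iteration, which is what each `concat` call in the loop does.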

That works, thanks. I think, I understand it also. The main point seems to be that Python or the library does not understand that the object is a DataFrame without creating an object of the type DataFrame and assigning the data to it. Is that a Python thing or specific to the library? — Stefan Gofferje, Oct 15 '22 at 19:33
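
On the follow-up question: as far as I can tell this comes from the library, not from Python itself. Python never converts a dict into a DataFrame on its own, and `concat` checks the types of what it receives and raises a `TypeError` when they are not pandas objects. A minimal sketch with plain pandas (the module the thread reaches as `gpd.pd`); the variable names are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"location": ["a.tif"]})
row = {"location": "b.tif"}  # a plain dict, not a pandas object

# Case 1: a bare DataFrame instead of a list of frames is rejected.
try:
    pd.concat(df)
    bare_frame_rejected = False
except TypeError as exc:
    bare_frame_rejected = True
    print(exc)

# Case 2: a list is accepted, but a dict inside it is rejected.
try:
    pd.concat([df, row], ignore_index=True)
    dict_rejected = False
except TypeError as exc:
    dict_rejected = True
    print(exc)

# Wrapping the dict in a DataFrame first makes it a valid list element.
combined = pd.concat([df, pd.DataFrame([row])], ignore_index=True)
print(combined["location"].tolist())
```

Both failures are ordinary exceptions raised by pandas code, so the rule "a list of Series or DataFrame objects" is part of how `concat` is written rather than a language rule.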
