Making a coordinate transformation on a csv file using pyproj?

Question

I mostly got my answer from this original posting, "How to reproject 500 CSV files efficiently and easily using QGIS?", using the code from response 7 by blord-castillo.

My problem is that the code, I suspect, is from earlier versions and I am unclear how to update it for pyproj 2.4.1 and python 3.7.6. I'm using Spyder 4.0.1 in Anaconda. I only have one csv file, but I do not think that is really an issue.

I have a csv file with a list of long,lat and I want to make a coordinate transformation to another CRS and save the output to a new csv file with x,y.

I updated the script to my file and projections. Here:

import csv
import pyproj
from functools import partial
from os import listdir, path

#Define some constants at the top
#Obviously this could be rewritten as a class with these as parameters

lon = 'LONGITUDE' #name of longitude field in original files
lat = 'LATITUDE' #name of latitude field in original files
f_x = 'x' #name of new x value field in new projected files
f_y = 'y' #name of new y value field in new projected files
in_path = r'myinput' #input directory
out_path = r'myoutput' #output directory
input_projection = 'epsg:4326' #WGS84
output_projecton = 'epsg:3395' #World mercator

#Get CSVs to reproject from input path
files= [f for f in listdir(in_path) if f.endswith('.csv')]

#Define partial function for use later when reprojecting
project = partial(
    pyproj.transform,
    pyproj.Proj(init=input_projection),
    pyproj.Proj(init=output_projecton))

for csvfile in files:
    #open a writer, appending '_project' onto the base name
    with open(path.join(out_path, csvfile.replace('.csv','_project.csv')), 'wb') as w:
        #open the reader
        with open(path.join( in_path, csvfile), 'rb') as r:
            reader = csv.DictReader(r, dialect='excel')
            #Create new fieldnames list from reader
            # replacing lon and lat fields with x and y fields
            fn = [x for x in reader.fieldnames]
            fn[fn.index(lon)] = f_x
            fn[fn.index(lat)] = f_y
            writer = csv.DictWriter(w, fieldnames=fn)
            #Write the output
            writer.writeheader()
            for row in reader:
                x,y = (float(row[lon]), float(row[lat]))
                try:
                    #Add x,y keys and remove lon, lat keys
                    row[f_x], row[f_y] = project(x, y)
                    row.pop(lon, None)
                    row.pop(lat, None)
                    writer.writerow(row)
                except Exception as e:
                    #If coordinates are out of bounds, skip row and print the error
                    print (e)

I get an error:

Traceback (most recent call last):

  File "file.py", line 43, in <module>
    fn = [x for x in reader.fieldnames]

  File "mypath\csv.py", line 98, in fieldnames
    self._fieldnames = next(self.reader)

Error: iterator should return strings, not bytes (did you open the file in text mode?)

I have tried to compare the current documentation to the code, but I am a novice here and not entirely sure what I need to change. I have already checked that everything works up to this point by adding a line "print (continue)" in a few places.

In your answers, please remember, novice here.

You are opening the file in binary mode with 'rb' (with open(path.join(in_path, csvfile), 'rb') as r:) instead of text mode (with open(path.join(in_path, csvfile), 'rt') as r:, or simply with open(path.join(in_path, csvfile)) as r:), which causes the error. – gene, Feb 17 '20 at 17:01

score 1 · Answer 1 · edited Mar 11 '22 at 15:40

I would recommend either:

pandas and pyproj.Transformer for the most efficient method:

import pandas
df = pandas.read_csv(...)

Then see: https://gis.stackexchange.com/a/334307/144357
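
As a rough illustration of the pandas and pyproj.Transformer approach, here is a minimal sketch. It assumes the column names LONGITUDE and LATITUDE from the question and uses a small in-memory sample (hypothetical data) in place of the real file. Passing always_xy=True is an assumption about the desired axis order: it keeps the input as (longitude, latitude) and the output as (x, y).

```python
import io

import pandas
from pyproj import Transformer

# Hypothetical sample data standing in for the real CSV file
sample_csv = io.StringIO("LONGITUDE,LATITUDE\n0.0,0.0\n90.0,0.0\n")
df = pandas.read_csv(sample_csv)

# Build the transformer once; always_xy=True keeps (lon, lat) -> (x, y) order
transformer = Transformer.from_crs("EPSG:4326", "EPSG:3395", always_xy=True)

# transform() accepts whole arrays, so no Python-level loop over rows is needed
df["x"], df["y"] = transformer.transform(
    df["LONGITUDE"].values, df["LATITUDE"].values
)

# Write the result to an in-memory buffer; pass a file path here for real output
output = io.StringIO()
df.drop(columns=["LONGITUDE", "LATITUDE"]).to_csv(output, index=False)
```

With a real file, replace sample_csv with the input path and output with the path of the new CSV file.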

Use geopandas with the .to_crs() method.

https://geopandas.readthedocs.io/en/latest/projections.html

import geopandas
import pandas
df = pandas.read_csv(...)
gdf = geopandas.GeoDataFrame(df, geometry=geopandas.points_from_xy(df.LONGITUDE, df.LATITUDE), crs="EPSG:4326")
projected_df = gdf.to_crs("EPSG:3395")
df["x"] = projected_df.geometry.x
df["y"] = projected_df.geometry.y
df.to_csv(...)

answered Feb 17 '20 at 16:53 by snowman2 · edited Mar 11 '22 at 15:40 by Community

Thanks! I will try this in the future. – lisa Feb 18 '20 at 09:52

score 0 · Answer 2 · answered Feb 17 '20 at 16:44

Just a shot in the dark here as I am not able to reproduce the code you have provided at the moment.

The error suggests you should try to open the file in text mode. From the Python documentation on the open() function:

The first argument is a string containing the filename. The second argument is another string containing a few characters describing the way in which the file will be used. mode can be 'r' when the file will only be read, 'w' for only writing (an existing file with the same name will be erased), and 'a' opens the file for appending; any data written to the file is automatically added to the end. 'r+' opens the file for both reading and writing. The mode argument is optional; 'r' will be assumed if it’s omitted.

On Windows, 'b' appended to the mode opens the file in binary mode, so there are also modes like 'rb', 'wb', and 'r+b'. Python on Windows makes a distinction between text and binary files; the end-of-line characters in text files are automatically altered slightly when data is read or written. This behind-the-scenes modification to file data is fine for ASCII text files, but it’ll corrupt binary data like that in JPEG or EXE files. Be very careful to use binary mode when reading and writing such files. On Unix, it doesn’t hurt to append a 'b' to the mode, so you can use it platform-independently for all binary files.

In Python 3, the csv module expects files opened in text mode, but you are opening both files (reader and writer) in binary mode. Try opening them in text mode instead. Change:

with open(path.join(out_path, csvfile.replace('.csv','_project.csv')), 'wb') as w:

to

with open(path.join(out_path, csvfile.replace('.csv','_project.csv')), 'w') as w:

and

with open(path.join( in_path, csvfile), 'rb') as r:

to

with open(path.join( in_path, csvfile), 'r') as r:
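
To see the difference between the two modes in isolation, here is a small sketch that uses only the standard library. The file name points.csv and its contents are hypothetical stand-ins for the real input. It also passes newline='', which the csv module documentation recommends for files handed to csv readers and writers; on Windows this avoids blank lines between rows in the output.

```python
import csv
import os
import tempfile

# Hypothetical sample file standing in for the real input CSV
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "points.csv")
with open(src, "w", newline="") as f:
    f.write("LONGITUDE,LATITUDE\n10.0,60.0\n")

# Binary mode: the reader fails as soon as it tries to read the header row
binary_error = None
try:
    with open(src, "rb") as r:
        csv.DictReader(r).fieldnames
except csv.Error as e:
    binary_error = str(e)

# Text mode: each row comes back as a mapping of column name to string
with open(src, "r", newline="") as r:
    rows = list(csv.DictReader(r))

# Writing also needs text mode; newline="" leaves line endings to the csv module
dst = os.path.join(tmp_dir, "points_project.csv")
with open(dst, "w", newline="") as w:
    writer = csv.DictWriter(w, fieldnames=["LONGITUDE", "LATITUDE"])
    writer.writeheader()
    writer.writerows(rows)
```

Here binary_error holds the same "iterator should return strings, not bytes" message as in the traceback from the question, while the text-mode read and write complete normally.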

Let me know if this helps.

@lisa I'm glad it worked. Feel free to take a look at What should I do when someone answers my question? for future questions. — Marcelo Villa, Feb 18 '20 at 23:20
