2

I'm trying to access data that is stored in a MSSQL spatial database and it seems to me that OGR is having some problems with accents. I'm not sure if maybe there is an encoding parameter or something else I could try before I report an issue on github.

Under Windows 10 and chcp 1252, using this command ogrinfo --debug on MSSQL:"server=server;database=db;trusted_connection=yes;tables=schema.table" -sql "select distinct categorie from schema.table", all distinct values are returned as expected. Among these are a couple that contain french accents:

OGRFeature(SELECT):4
  categorie (String) = Véhicule électrique en recharge

OGRFeature(SELECT):5 categorie (String) = Véhicule électrique

OGRFeature(SELECT):6 categorie (String) = Régulier

OGRFeature(SELECT):8 categorie (String) = Réservé véhicule électrique

If I add this to the SQL query in the original command where categorie in ('Véhicule électrique en recharge', 'Véhicule électrique', 'Réservé véhicule électrique'), then OGR returns a Feature Count: 0 because the accents are not sent properly: MSSQLSpatial: ExecuteSQL(select distinct categorie from schema.table where categorie in ('Véhicule électrique en recharge', 'Véhicule électrique', 'Réservé véhicule électrique')) called.

If I change chcp to 65001, then the accents are sent properly MSSQLSpatial: ExecuteSQL(select distinct categorie from schema.table where categorie in ('Véhicule électrique en recharge', 'Véhicule électrique', 'Réservé véhicule électrique')) called., yet I still get a Feature Count: 0.

So with chcp still to 65001, I execute the original command again (only select distinct categorie) and then I get:

OGRFeature(SELECT):4
  categorie (String) = Vhicule lectrique en recharge

OGRFeature(SELECT):5 categorie (String) = Vhicule lectrique

OGRFeature(SELECT):6 categorie (String) = Rgulier

OGRFeature(SELECT):8 categorie (String) = Rserv vhicule lectrique

The "é" character is simply removed.

I can bypass this problem in both chcp by using where categorie like '%lectrique%', but I'd like to know if there is some way to send my query with accents and values with accents are returned.

mikibok
  • 47
  • 6
  • 1
    Have a try by keeping chcp 1252 and save the SQL into a text file as UTF-8. Then use the text file as -sql @stored_sql.txt. I have not tried that myself, just thinking that it could work. – user30184 Mar 06 '23 at 21:39
  • So with a file in UTF-8, the same problem happens: the "é" character is still being sent as "é". I then opened the file with the Windows 1252 encoding, and the "é" characted changed in the file for "é". Saved it in 1252, tried ogr again and still no features. Then in the same file, replaced the "é" for "é", saved it in 1252, and it works! So if you post your comment as an answer I will accept it as it is a good workaround solution. – mikibok Mar 07 '23 at 20:31

1 Answers1

1

Both ogrinfo and ogr2ogr support reading the SQL statements from a text file. https://gdal.org/programs/ogr2ogr.html#cmdoption-ogr2ogr-sql

-sql <sql_statement>

SQL statement to execute. The resulting table/layer will be saved to the output. Starting with GDAL 2.1, the @filename syntax can be used to indicate that the content is in the pointed filename.

The SQL file option can be used as a workaround. Save the statements into text file and experiment with what character encoding the text file must be saved so that the characters with accents and other special characters gets forwarded correctly to ogrinfo or ogr2ogr.

user30184
  • 65,331
  • 4
  • 65
  • 118