I am trying to import a Delimited Text File Layer from a .csv file in QGIS. Because this caused a lot of errors, I posted a question here and was pointed toward using ASCII non-printable control characters as delimiters. These are described on Wikipedia and in this blog post.

Since I am working with tweets, for which a convenient delimiter is very hard to find because nearly every character occurs in them, I really want to give this a shot. However, I cannot figure out how to import a text file into QGIS with these characters specified as delimiters. I tried the different notations given on Wikipedia, both as custom delimiters and as regular expressions:

0x31
0x1F
^_

None of these work. Interestingly, the lines are detected correctly - only the record delimiters are a problem.

Any ideas?

anjuta

1 Answer

Have you thought about using a combination delimiter? For instance, if you know $, ^, and # may be used in your string, make the delimiter #&^. I am not sure whether QGIS supports multi-character delimiters directly. You can always call Python's split("#&^") to split the string manually and handle the individual values that way.
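As a rough sketch of the manual split: the sample line and the helper name `split_record` below are made up for illustration, and the approach assumes the exact sequence #&^ never occurs inside a field value.

```python
DELIMITER = "#&^"  # the combination delimiter suggested above


def split_record(line, delimiter=DELIMITER):
    """Split one line of text into field values on a multi-character delimiter."""
    # Strip the trailing line break first so it does not end up in the last field.
    return line.rstrip("\r\n").split(delimiter)


# Hypothetical record: id, tweet text, longitude, latitude.
# The text contains #, $ and ^ on their own, but never the full "#&^" sequence.
sample = "42#&^Loving the #sunset at $5 a ticket ^_^#&^13.40#&^52.52\n"
fields = split_record(sample)
print(fields)
```

The split only happens where all three characters appear in order, so single occurrences of #, $ or ^ in the tweet text are left alone.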

We ran into that issue and ended up using ¬ as a delimiter because we knew it would never be collected in a shapefile. We kept the same standard encoding but used an Alt-key character.
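A minimal sketch of reading such a file outside QGIS with Python's csv module, which accepts any single character as the delimiter. The sample data, the helper name `read_rows`, and the choice of UTF-8 are assumptions for illustration; the important part is that the encoding is stated explicitly so the ¬ character is decoded the same way it was written.

```python
import csv
import io


def read_rows(stream, delimiter="¬"):
    """Read delimiter-separated rows from a text stream.

    QUOTE_NONE keeps quotation marks inside the text as literal characters
    instead of treating them as field quoting.
    """
    return list(csv.reader(stream, delimiter=delimiter, quoting=csv.QUOTE_NONE))


# Hypothetical file content, encoded as UTF-8 (an assumption about the export).
raw = '1¬He said "hi", then left¬13.40\n'.encode("utf-8")

# Decode with the same encoding that was used to write the bytes.
stream = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8", newline="")
rows = read_rows(stream)
print(rows)
```

For a file on disk, `open(path, encoding="utf-8", newline="")` would take the place of the in-memory stream.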

Branco
  • I did exactly that, but there were a lot of import errors and I got the hint that this might be the cause. QGIS seems to support it, but using multiple delimiters sometimes does not work, so I thought I'd rather not get used to it. – anjuta Jun 11 '15 at 20:39
  • One problem we ran into with an Alt-key character was that certain GIS programs would sometimes interpret it wrongly and mess up the encoding. Basically, using ¬ has led to issues when trying to use the OLEDB driver in .NET to access the DBF, because the encoding gets all screwy. I couldn't figure out a way around it, but I am sure I just didn't set the proper encoding for reading the DBF. – Branco Jun 12 '15 at 12:21