I am collecting start and end points for a large polyline feature class in a file geodatabase. I am using arcpy, ArcGIS 10.3, Python 2.7.3 64-bit and a 10.3 compressed file geodatabase with just one feature class (and no other objects).

The geometry is read through the SHAPE@ token, which returns a geometry object for each line and exposes its firstPoint and lastPoint properties.

import arcpy

# fc: path to the feature class; query: where clause (both defined elsewhere)
with arcpy.da.SearchCursor(fc, ["OID@", "SHAPE@"], query) as search_cur:
    allLinesList = [[row[0],
                     (row[1].firstPoint.X, row[1].firstPoint.Y),
                     (row[1].lastPoint.X, row[1].lastPoint.Y)]
                    for row in search_cur]

Reading 1 million line features (note that my lines consist of only two vertices, start and end) takes about 5 minutes. I have tried iterating the cursor with a for loop instead of a list comprehension, and reading the features in chunks of several hundred thousand, with no significant difference in time.

Do you know of any faster way to read the XY coordinates of the start and end points of the lines?

PolyGeo
Alex Tereshenkov
  • Do you need to use a compressed geodatabase? – PolyGeo Feb 27 '15 at 11:34
  • @PolyGeo, no I don't - compressed file geodatabase might provide faster read-only access that is why I am compressing it – Alex Tereshenkov Feb 27 '15 at 11:46
  • I was thinking the opposite but have not tested to see which is faster. – PolyGeo Feb 27 '15 at 11:57
  • I'm on iPhone so going from memory but if lines all have just start and end points and no intervening vertices could you convert them to points and then use a tool to add XY fields and read those fields any quicker overall? – PolyGeo Feb 27 '15 at 12:18
  • if your lines are segments, just calling the cursor could do the job, but I don't understand exactly what output you want from your list. listLines = arcpy.da.SearchCursor(fc, ["OID@", "SHAPE@WKT"],query) – radouxju Feb 27 '15 at 12:38
    300 seconds doesn't seem outrageously long to access 1m lines through a cursor. I would suggest you try with an uncompressed FGDB as well. On what kind of disk is the file? – Vince Feb 27 '15 at 12:56
  • @PolyGeo, thanks for the tip. That seems likely to take even longer, but I may be wrong. I have to keep the feature ID too, so this is going to add a lot of overhead. @radouxju, interesting idea with SHAPE@WKT. I should test whether slicing the XY coordinates out of the WKT string is faster than accessing the first and last points of the geometry, thanks for the idea. @Vince, yes, it is OK to wait, I was just wondering whether there is a faster way that I have missed. I've tried with an uncompressed file geodatabase and got roughly the same result, within ±2% in time. I am on an SSD drive. – Alex Tereshenkov Feb 27 '15 at 13:09
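
The SHAPE@WKT idea from the comments could be sketched roughly as follows. This is only an illustration and has not been timed against the SHAPE@ version. It assumes the cursor returns well-known text of the form LINESTRING (...) or MULTILINESTRING ((...)), and the helper name endpoints_from_wkt is made up for this example. The arcpy call is left as a comment so that the parsing part runs on its own.

```python
def endpoints_from_wkt(wkt):
    """Return ((x1, y1), (x2, y2)) for a LINESTRING or MULTILINESTRING WKT string.

    Hypothetical helper, not part of arcpy. Raises ValueError for
    empty geometries such as "MULTILINESTRING EMPTY".
    """
    # Keep only the coordinate text, starting at the first parenthesis.
    body = wkt[wkt.index("("):]
    # Dropping the parentheses flattens multipart lines into one vertex list.
    vertices = body.replace("(", "").replace(")", "").split(",")
    first = vertices[0].split()
    last = vertices[-1].split()
    # Only X and Y are used; any Z or M values after them are ignored.
    return ((float(first[0]), float(first[1])),
            (float(last[0]), float(last[1])))


# Intended use with the cursor from the question (needs arcpy, so shown as a comment):
# with arcpy.da.SearchCursor(fc, ["OID@", "SHAPE@WKT"], query) as cur:
#     allLinesList = [[oid] + list(endpoints_from_wkt(wkt)) for oid, wkt in cur]

print(endpoints_from_wkt("MULTILINESTRING ((0 0, 10 5))"))
```

Whether this beats reading firstPoint and lastPoint depends on how expensive building the WKT string is compared with building the geometry object, so it would need to be measured on the real data.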

1 Answer

Write a proxy class in managed C++ if this is an option for you. Realistically, you may need to perform the kind of operation that is slow in .NET or arcpy (large iterations of small operations). In that case you can write a proxy class in managed C++ and place in it the ArcObjects code that makes heavy use of the interop layer.

Once you expose this code as public methods, you can call the methods from C# or VB.NET, and even from arcpy using comtypes. Since the interop layer is only crossed when you call the methods of this proxy class, the execution time can be greatly reduced. For example, a method that creates a line feature class with 1 million records takes 30 seconds with .NET but 5 seconds with managed C++ (6 times faster).

A proxy class is the option that lets you get the most out of ArcObjects. This is what the arcpy data access module does to improve performance.

Farid Cheraghi