
GDAL is known for being one of the fastest libraries for geospatial data processing. I'm using it in a Python script and I'm running into performance problems with one specific task: erasing one layer from another, which takes far too long!

I first used the built-in Erase method:

        inputLayer.Erase(maskLayer, ResultLayer)

This works fine for a layer with an average number of features, but on a layer with 200,000 features it takes forever!

So I wrote this loop instead:

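        # selfLayer is the layer being updated in my script
        for featureI in inputLayer:
            # Clone so the running result is independent of the feature
            geomI = featureI.GetGeometryRef().Clone()
            # Only visit mask features that may intersect this feature
            maskLayer.SetSpatialFilter(geomI)
            for featureM in maskLayer:
                # Subtract each mask geometry from the running result
                geomI = geomI.Difference(featureM.GetGeometryRef())
            # Write the feature back once, after all masks are removed
            featureI.SetGeometry(geomI)
            selfLayer.SetFeature(featureI)
        maskLayer.SetSpatialFilter(None)  # clear the filter afterwards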

But this would run for about 1,200 minutes (20 hours), which is too long for my case!

Is there any way to speed things up? Do you have any suggestions?

I'm using Python 3.5 with GDAL 2.
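
One possible refinement of the per-feature loop, offered as an untimed sketch rather than a measured fix: merge the mask geometries that pass the spatial filter and call `Difference` once per input feature. The helper name `erase_geometry` is hypothetical; it only relies on the OGR methods already used above (`SetSpatialFilter`, `GetGeometryRef`, `Difference`) plus `Clone` and `Union`.

```python
def erase_geometry(geom, mask_layer):
    """Return geom minus all mask geometries selected by the spatial filter.

    geom is an ogr.Geometry and mask_layer an ogr.Layer (assumed inputs).
    The input geometry is not modified; a new geometry is returned.
    """
    # Restrict the mask layer to features that may intersect geom
    mask_layer.SetSpatialFilter(geom)
    merged = None
    for mask_feature in mask_layer:
        mask_geom = mask_feature.GetGeometryRef()
        if mask_geom is None:
            continue  # skip features without geometry
        # Build one combined mask geometry for this input feature
        if merged is None:
            merged = mask_geom.Clone()
        else:
            merged = merged.Union(mask_geom)
    # Clear the filter so later reads of mask_layer see every feature
    mask_layer.SetSpatialFilter(None)
    if merged is None:
        return geom.Clone()  # nothing to erase
    return geom.Difference(merged)
```

Whether this is faster than subtracting the masks one by one depends on the data, so it is worth timing on a small subset first.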

lambertj
Zeus
  • No, it's not! The solution in the other post is my problem! Can you read both posts so that you have a better understanding? – Zeus Nov 17 '17 at 10:23
  • If you're interested in performance, consider coding in C++ or C#/VB.NET; GDAL/OGR bindings are available for these languages. You could be experiencing overhead from the back-and-forth between the Python interpreter and the compiled C libraries. I can personally attest that GDAL/OGR in C++ is significantly faster than in Python, so much so that I now only use GDAL in Python for very simple or small tasks. – Michael Stimson Nov 17 '17 at 11:19
  • Python as a language isn't built for speed; its main focus was and is ease of use. To get any real kind of performance, a compiled language is necessary. After all, you don't jump on a bicycle and expect to keep up with the traffic on the freeway. – Michael Stimson Nov 17 '17 at 11:32
  • I see. Is there any way I could use ogr2ogr for this task? I can save the mask and input layers as shapefiles and run a command line, if one exists. What do you think? – Zeus Nov 17 '17 at 11:43
  • I've used ogr2ogr with -clipsrc and it's faster than Esri (by nearly 2x), but I'm not sure about erasing with it. I'll have a look at the docs and see if I can work out how to erase. – Michael Stimson Nov 17 '17 at 11:46
  • There's this post https://gis.stackexchange.com/questions/151699/can-ogr2ogr-reverse-clip-or-clip-out-or-erase-or-difference-one-shapef but I'm not sure it is of much help. It seems that -clipsrc will only retain features within the clip geometry. You could try subsetting the features in the layer and running them over many cores; I would need to know much more about your features and their storage type to guide you in this. – Michael Stimson Nov 17 '17 at 12:09
  • Thanks for your effort! I already checked that topic. I'm only using shapefiles, which I load into memory and save back as shapefiles (I'm open to suggestions). – Zeus Nov 17 '17 at 13:05
  • You could try selecting the features completely within the extent to be erased, switching the selection, and erasing only the remainder: the overlapping features and those completely outside. That would at the very least reduce the feature count to be erased. Apart from that, I can't offer any more suggestions. – Michael Stimson Nov 17 '17 at 13:19
  • I resolved my performance issue by using parallel processing for now. Next time I build a batch job, I'll consider using a compiled programming language. Thanks, Michael. – Zeus Nov 29 '17 at 09:48
  • You're welcome. Can you post an answer to your own question with a basic code skeleton that worked for you? Parallel processing can be confusing, and there are a few traps; a working GIS solution would be great to see. – Michael Stimson Nov 29 '17 at 21:06
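
No skeleton was posted in the thread, so the following is only a minimal sketch of how the parallel approach mentioned above might look; it is not the code Zeus actually used. It assumes both layers are shapefiles on disk (so feature IDs run from 0 to n-1) and that each worker process opens its own handles, since OGR objects cannot be shared between processes. The names `chunk_ranges`, `erase_chunk`, and `parallel_erase`, and the path arguments, are hypothetical.

```python
import multiprocessing as mp


def chunk_ranges(n_features, n_chunks):
    """Split indices 0..n_features-1 into contiguous (start, stop) ranges."""
    size, extra = divmod(n_features, n_chunks)
    ranges, start = [], 0
    for i in range(n_chunks):
        stop = start + size + (1 if i < extra else 0)
        if stop > start:  # drop empty ranges when there are few features
            ranges.append((start, stop))
        start = stop
    return ranges


def erase_chunk(args):
    """Worker: erase the mask from one FID range, return (fid, WKB) pairs."""
    # Imported and opened inside the worker: OGR handles are not picklable.
    from osgeo import ogr
    input_path, mask_path, start, stop = args
    input_ds = ogr.Open(input_path)
    mask_ds = ogr.Open(mask_path)
    input_layer = input_ds.GetLayer(0)
    mask_layer = mask_ds.GetLayer(0)
    results = []
    for fid in range(start, stop):
        feature = input_layer.GetFeature(fid)
        if feature is None or feature.GetGeometryRef() is None:
            continue
        geom = feature.GetGeometryRef().Clone()
        mask_layer.SetSpatialFilter(geom)
        for mask_feature in mask_layer:
            mask_geom = mask_feature.GetGeometryRef()
            if mask_geom is not None:
                geom = geom.Difference(mask_geom)
            if geom is None or geom.IsEmpty():
                break  # nothing left of this feature
        if geom is not None and not geom.IsEmpty():
            # WKB bytes can be pickled and sent back to the parent process
            results.append((fid, bytes(geom.ExportToWkb())))
    return results


def parallel_erase(input_path, mask_path, n_features, n_workers=4):
    """Run erase_chunk over n_workers processes and merge the results."""
    tasks = [(input_path, mask_path, a, b)
             for a, b in chunk_ranges(n_features, n_workers)]
    with mp.Pool(n_workers) as pool:
        parts = pool.map(erase_chunk, tasks)
    return [item for part in parts for item in part]
```

A caller would pass `n_features` from `layer.GetFeatureCount()` and write the returned geometries to the result layer in the parent process. On Windows, the call to `parallel_erase` needs to sit under an `if __name__ == "__main__":` guard.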

0 Answers