1

I have a 3rd party application that I cannot change that runs a "bad" select query that can run very slowly in some circumstances (we have killed it after 6 hrs on occasion). This is after we've done what we could to speed it up via indexing, which is something we can control.

    SELECT *
    FROM findetail 
    WHERE fintransact is null 
    and (validation is null or validation <= ' ' or validation = ' Zero value transaction;') 
    and (exists (select * 
                 from counterparty 
                 where counterparty.counterparty = findetail.counterparty 
                 and counterparty.status = 'ACTIVE')  
          or findetail.counterparty is null) 
    and (findetail.acctstatus = 'FINAL' or findetail.acctstatus = 'PROVISIONAL') 
    and (findetail.transactiontype = 'AP' or findetail.transactiontype = 'AR') 
    and ((findetail.position not in (select position.position 
                                     from position 
                                     where position.positionmode = 'FINANCIAL' 
                                     and position.exchange is not null)
                                     ) 
          or (findetail.fee in (select fee 
                                from fee, feetype 
                                where findetail.fee = fee.fee 
                                and fee.feetype = feetype.feetype 
                                and feetype.excludemargin = 1 )
             )
       )
    and ( findetail.counterparty = 'ACPTY' )

It is easy to re-write the query to be at least somewhat "better":

SELECT *
FROM findetail 
WHERE fintransact is null 
and (validation is null or validation <= ' ' or validation = ' Zero value transaction;') 
and (exists (select * 
             from counterparty 
             where counterparty.counterparty = findetail.counterparty 
             and counterparty.status = 'ACTIVE')  
       or findetail.counterparty is null) 
and (findetail.acctstatus = 'FINAL' or findetail.acctstatus = 'PROVISIONAL') 
and (findetail.transactiontype = 'AP' or findetail.transactiontype = 'AR') 
and (not exists (select position.position 
                 from position 
                 where position.positionmode = 'FINANCIAL' 
                 and position.exchange is not null
                 and position = findetail.position)
      or exists (select fee 
                 from fee, feetype 
                 where findetail.fee = fee.fee 
                 and fee.feetype = feetype.feetype 
                 and feetype.excludemargin = 1 
                 and findetail.fee = fee)
   )
AND ( findetail.counterparty = 'ACPTY' )

This one is still not perfectly optimal but runs in about 8s which is much preferable to multiple hours. It also keeps the rough structural format of the original which is important because a last-ditch option is to try and edit the query string (which is visible but broken into sections as it’s dynamically built up using some “if” statements in the code) in one of the third party application’s .dll’s.

Aside: When I completely re-wrote the query to be "even better" (down to 3s runtime) using proper joins, SQL server would no longer recognize this "even better" execution plan as valid for the "bad" query when I tried to use it via USE PLAN query hint. Plus, it wouldn’t be possible to edit that one back into the .dll if I end up trying that.

So I captured the execution plan of the "better" query, and tried to force it onto the "bad" query using the USE PLAN hint. SQL Server accepts the forced plan as valid (at least, doesn't complain like it did for the "even better" plan), but it also doesn't actually use the "better" plan. Show estimated execution plan shows the same bad plan as before, and actually executing results in the query running for at least 10 minutes before I killed it.

I've tried this in SSMS with both a USE PLAN hint at the end, and through creating a stored query plan. Either way, SQL Server when the query is run continues to use the "bad" plan. Any ideas on why if it thinks the plan is valid it’s still not using it?

Any help is appreciated, Thanks!

Guest
  • 23
  • 3
  • 2
    I think everyone should watch SQL Query Optimization: Why is it so hard to get right. Estimated and Actual Execution plans will give you different results with the latter being the accurate one. If the plan is in still in the plan cache, it is still valid and thus could be used. – S3S Oct 24 '19 at 20:51
  • I have cleared the plan cache using DBCC FREEPROCCACHE and it doesn't appear to make any difference, re-running afterwards still does not use the optimized plan. – Guest Oct 24 '19 at 21:05
  • If you cleared the proc cache, there is no plan for it to use. It has to generate a new one. The plan it chooses to create is up to the optimizer and the video explains why that is hard. – S3S Oct 24 '19 at 21:14
  • ok, but isn't the USE PLAN hint supposed to supply the plan, rather than the optimizer trying to generate one? Isn't that the whole point? – Guest Oct 24 '19 at 21:33
  • I see what you are saying now and yes it should – S3S Oct 24 '19 at 21:35
  • Do you have both plans available to share? – MJH Oct 24 '19 at 21:48
  • Yes, but I am unsure how to get them here... Character limit prevents posting in-line, and I don't see a way to attach them. My company firewall blocks all dropbox-like sites. – Guest Oct 24 '19 at 22:03
  • 2
    Pastetheplan is a good site to use – S3S Oct 24 '19 at 23:25
  • SQL server need to be able to come up with that plan on its own. That is how USE PLAN works. This is so you can't change the semantics of a query using a plan guide. My guess is that this is where it goes wrong for some reason. Perhaps it takes too long for it to do that, so it uses whatever it came up with. Seeming that plans might give us more info (perhaps a timeout), but those links provided assumed these are some sound files. Consider posting at pastetheplan. – Tibor Karaszi Oct 25 '19 at 08:17
  • Bad Plan: https://www.brentozar.com/pastetheplan/?id=H1CuKKl9H – Guest Oct 25 '19 at 14:41
  • 2
    The "bad plan" is a SELECT * but the "good plan" is a SELECT COUNT(*) so they are not the same queries. – MJH Oct 25 '19 at 15:21
  • My apologizes, I uploaded the wrong file (obviously sql server would complain of an invalid plan if I was trying to use that one!). Here is the correct "better" one: https://www.brentozar.com/pastetheplan/?id=SkrlD5x9S – Guest Oct 25 '19 at 15:39

2 Answers2

1

For the purposes of answering, let's label these as Query A and Query B. Query A and Query B are not identical (even if they produce equivalent results).

You cannot force SQL to use a plan generated for Query B when asking it to execute Query A.

USE PLAN is for running a specific plan when multiple plans exist for a single query. https://www.brentozar.com/archive/2018/03/why-multiple-plans-for-one-query-are-bad/

Rob Pearson
  • 126
  • 4
  • Ok, thanks, that makes sense... damn, no way to "fix" the bad query then, short of doing "bad" things" to the .dll. Argh. Thanks everyone for the help. – Guest Oct 28 '19 at 14:45
  • I've never tried to do what you're looking to do, but there are a fair number of tuning improvements that you can typically make to a SQL install. There's also crazier ideas like building an interception program to listen in between SQL and your problem program and substitute the better query. Is the 3rd party defunct and so you can't ask them to fix it? – Rob Pearson Oct 28 '19 at 17:00
  • Also, could you try downloading sentryone plan explorer and posting the graphic it generates of the bad plan? I'd like to take a look as a second pair of eyes on the indexing. – Rob Pearson Oct 28 '19 at 17:02
  • Thanks Rob, appreciate you taking the time! I'll see about sentryone, we may have it already, I'll ask our DBA. The vendor is not defunct, but I reported this to them about 3 years ago (and poke them about it once in a while) and it's still on their list of "enhancement requests to take into consideration for future releases" list. Also, it's a large entrenched system that will take about 3-6 months to test and release a new version into our Prod, so it's a once every couple of years sort of thing. – Guest Oct 28 '19 at 17:12
  • 1
    Hey Rob: posted as an answer below as the answer allowed image post, but comments don't seem to. – Guest Oct 28 '19 at 18:30
  • Thank you for posting that graphic. Makes it much easier for me for suggestions. If you look at the highlighted effort percentages you can see that the issue is pretty concentrated on one branch (but dang that is an ugly query plan). When I have some more time I'll try to come up with some indexing suggestions, but in the meantime that's where I recommend you focus your efforts. You could also download BrentOzar's Blitz script to help you identify common stability and performance consistency tweaks for your server. – Rob Pearson Oct 28 '19 at 19:13
  • So sp_blitz doesn't show anything amiss, and it shows that the query is hitting the indexes. The issue is that it runs the subquery on position (which by itself runs in 1 second, and always returns same results) for every record in the outer query (which is why it can be either fast or slow with the same execution plan, it depends on how many records exist for the counterparty in question). For relatively small sets of data, the query runs in seconds, for counterparties with lots of records in the outer table, it can take hours. Easy to fix if you can edit the query, but as-is, not so much. – Guest Oct 31 '19 at 16:22
  • As Blitz is focused on overall server configuration, BlitzCache and BlitzIndex would be the next steps for inspecting what sorts of things can be done here without a query rewrite. https://www.brentozar.com/blitzcache/eager-index-spools/ What's your TembDB configuration like (size / how many files / storage medium)? That Eager Index Spool is absolutely enormous at 250+ billion. – Rob Pearson Oct 31 '19 at 18:20
  • 1
    I ran Blitzindex the other day and it didn't find anything. However, after your suggestion I just ran blitzcache, and low-and-behold it suggested a new (and very lightweight) index that has sped up the query in my initial testing by 93%. That'll probably hold us for now if it works as well in Prod. Thanks for all your help, I'll report back if this doesn't work! – Guest Nov 01 '19 at 15:08
  • This recently came up in my news feed and reminded me of this question, specifically the high-cost eager index spool in your query plan. https://sqlperformance.com/2019/11/sql-performance/eager-index-spool-optimizer – Rob Pearson Nov 25 '19 at 23:55
0

enter image description here

Rob, here is the plan from sentryone plan explorer:

Guest
  • 23
  • 3