8

I have a deadlock graph from a deadlock where one process is doing a SELECT and one is doing an UPDATE. This seems like the classic case where the SELECT gets a NCI lock to perform a join and then a CI lock to retrieve all the data by lookup. And the UPDATE is using the CI lock to perform an update and then needs to lock an NCI because the update results in a status change and the NCI facilitates finding items by status.

The problem is that one of the locks the UPDATE wants is NOT on the table it is updating and I can't find why this is happening.

Here's the SELECT:

SELECT *,
       RIGHT(c.CC_NUMBER, 4) AS CC_LAST_4,
       DATEDIFF(ss, '1970-01-01', plan_started ) plan_started_epoch,
       DATEDIFF(ss, '1970-01-01', plan_expires ) plan_expires_epoch
FROM customers c, accounts a, parent_cos pc, htt_customers_overlay_ultra u
WHERE c.customer_id = a.customer_id
AND   u.customer_id = c.customer_id
AND   a.cos_id=pc.cos_id
AND   u.customer_id = 9300;

Here's the UPDATE:

UPDATE htt_customers_overlay_ultra SET plan_state = 'Active'  WHERE customer_id = 9300;

But according to the deadlock graph, the UPDATE is acquiring a lock on ACCOUNTS.ACCOUNT0, which is the PK (CI) of the ACCOUNTS table. There are no foreign keys in the overlay table. There are some default constraints which I don't currently have permission to see.

I've looked at the deadlock graph in SSMS and in SQL Sentry Plan Explorer Pro and am none the wiser.

Here are the execution plans:

For the SELECT

For the UPDATE

I'd like to find out why it is getting this lock, and then the best way to serialize these calls.

Things I am aware of which I have already advised the client which have bearing on the locks taken, but don't explain the seeming unrelated lock which is arising:

Remove * and identify the columns needed and alter the NCIs to become covering - this would potentially make the SELECT use fewer locks

Determine why the system is SELECTing the same data that another process is processing - this would potentially mitigate these two processes running at the same time at all

There is a table scan in the SELECT

Cade Roux
  • 6,602
  • 1
  • 31
  • 54
  • Could the update query have acquired the X key lock on "dbo.ACCOUNTS" from an earlier statement in the same transaction not shown in the graph? – Martin Smith Jan 19 '13 at 19:56
  • If I'm reading the graph right, the SELECT has a U lock on the PK index of htt_customers_overlay_ultra -- why? That process has 0 log used. – Jon Seigel Jan 19 '13 at 20:28
  • @JonSeigel - No the update (process589f948) has a U lock and is trying to convert it to an X lock but is blocked by an S lock held by the SELECT (process5240988) – Martin Smith Jan 19 '13 at 20:33
  • @Cade Roux: Assuming Martin Smith found the culprit, I guess the easiest way to avoid such problems is to enable read_committed_snapshot for db. – a1ex07 Jan 19 '13 at 22:17
  • Selects issued by scalar UDFs do hold locks, yet do not show up in execution plans. Maybe your update fires a trigger which uses a scalar UDF? – A-K Jan 19 '13 at 22:29
  • @a1ex07 the following may be a breaking change: "enable read_committed_snapshot ". Do that, and you may end up with lots of subtle bugs. Use SNAPSHOT instead. – A-K Jan 19 '13 at 22:30
  • @a1ex07 Yes, I've recommended them looking into either - right now snapshots are not enabled on that db. – Cade Roux Jan 19 '13 at 22:38
  • @AlexKuznetsov I don't think they use scalar UDFs and there are no triggers. – Cade Roux Jan 19 '13 at 22:39
  • @AlexKuznetsov: You might or might not be correct about subtle bugs - that, in my opinion, depends on application design. In any case, such bugs are fixable... – a1ex07 Jan 19 '13 at 22:44
  • Just to keep things in perspective, Alex also suggested to avoid SQL Server 2012 because there might be bugs... perhaps a little too much tin foil up top. :-) – Aaron Bertrand Jan 20 '13 at 01:52
  • Ah, I see now. I should have looked at the XML before commenting. Thanks, @Martin. – Jon Seigel Jan 20 '13 at 15:35

1 Answers1

6

The UPDATE query has an X lock on a key on "dbo.ACCOUNTS" blocking the SELECT from getting an S lock.

The SELECT query has an S lock on a key of htt_customers_overlay_ultra. The UPDATE query has a U lock on the same key and is blocked trying to convert that to an X lock.

The execution plan for the UPDATE doesn't feature Accounts at all so there is no obvious reason for it to have a key lock on Accounts. The Update transaction starts 0.01 seconds before the batch does. 2013-01-13 08:49:30.213 vs 2013-01-13 08:49:30.223.

Perhaps there was a preceding statement in a different batch (hence not shown in the deadlock graph) that actually started the transaction and acquired the mysterious X lock.

Martin Smith
  • 84,644
  • 15
  • 245
  • 333
  • I believe all their code is implicit transactions in single adhoc batches generated by PHP. I'm thinking there has to be some other constraint, but I don't see any foreign keys. There are some default constraints which I am trying to get the definitions for, – Cade Roux Jan 19 '13 at 21:34
  • @CadeRoux - How can the transaction start before the batch then? If the UPDATE to htt_customers_overlay_ultra had previously updated Accounts in the same transaction it would of course explain it (though it would have needed to be a non updating update as log used = 0) or maybe a SELECT with an explicit XLOCK hint. – Martin Smith Jan 19 '13 at 22:03
  • I don't think they are doing any transactions - I think these two queries fired simultaneously from different parts of the app (which seems odd since they are both restricted to a single account - or perhaps it's multiple components of the same page loading asynchronously with multiple requests) and they were unlucky enough (or perhaps it's common because the post of the change is running simultaneous to the load of data on the account for another part of the page) to deadlock. – Cade Roux Jan 19 '13 at 22:04
  • 2
    @Martin Smith: Your explanation seems very logical to me. One more thing that probably proves your answer - in the dump for second query(update) trancount="2". I believe it means that there were 2 begin transaction... That in turns can explain why dump doesn't have update accounts... – a1ex07 Jan 19 '13 at 22:05
  • @a1ex07 - Well spotted. If @@trancount = 2 there must be explicit begin tran somewhere AFAIK. – Martin Smith Jan 19 '13 at 22:10
  • And I wouldn't just believe them when they say they're not using explicit transactions. Capture their batches in profiler and look for BEGIN TRANSACTION, also check the application code for client-side transaction handling, e.g. SQLConnection.BeginTransaction()... – Aaron Bertrand Jan 20 '13 at 00:49
  • @AaronBertrand Yeah, they may have some stuff in their PHP handling layer. – Cade Roux Jan 20 '13 at 04:09
  • Got final verification that they are indeed calling a begin_transaction function in their PHP layer which is resulting in a SQL Server transaction being created. – Cade Roux Feb 08 '13 at 21:31