4

I have a table set up like this:

The DB has 10B rows.

CREATE TABLE [Tick].[X_H](
       [utcDT] [datetime2](7) NOT NULL,
       [Symbol] [nvarchar](50) NOT NULL,
       [Bid] [float] NULL,
       [Ask] [float] NULL,
       [BidSize] [float] NULL,
       [AskSize] [float] NULL
) ON [PRIMARY]

ALTER TABLE [Tick].[X_H] ADD CONSTRAINT [PK_Master] PRIMARY KEY CLUSTERED ( [utcDT] ASC, [Symbol] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = ON, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

I want to select a single day's worth of data for a particular Symbol. I am using:

SELECT *  
FROM [Tick].[X_H] 
WHERE [Symbol] = 'DONKEY' 
AND CONVERT(Date, [utcDT]) = CONVERT(Date,'2011-01-02');

This takes 3 times longer than reading the same data from a binary file and deserialising it. Binary read is 42ms and SQL read is 115ms.

What might I try to speed it up?


The suggestion was made to create a separate [Date] column, which I have and I added 2 indexes.

ALTER TABLE [Tick].[FX_HS] ADD [Date] DATE NULL;
UPDATE [Tick].[X_H] SET [Date] = CONVERT(Date,[utcDT])
CREATE INDEX I_Date ON [Tick].[X_H] ([Date]); 
CREATE INDEX I_SymbolDate ON [Tick].[X_H] ([Symbol],[Date]);

And I am using query:

SELECT *  FROM [Tick].[X_H] WHERE [Symbol] = 'DONKEY' AND [Date] = '2011-01-02';

But it takes approximately the same amount of time!


Execution plan

Paste the Plan

Query 1: Query cost(relative to batch): 100%
SELECT * FROM [Tick][X_H] WHERE [Symbol]=@1 AND [Date]=@2

enter image description here


As Martin Smith suggested:

SET STATISTICS TIME ON

DECLARE @utcDT DATETIME2(7) DECLARE @Symbol NVARCHAR(50) DECLARE @Bid FLOAT DECLARE @Ask FLOAT DECLARE @BidSize FLOAT DECLARE @AskSize FLOAT DECLARE @Date DATE

SELECT @utcDT = utcDT, @Symbol = Symbol, @Bid = Bid, @Ask = Ask, @BidSize = BidSize, @AskSize = AskSize, @Date = Date FROM Tick.X_H WHERE ( Symbol = 'DONKEY' ) AND ( Date = '2011-01-02' )

This takes:

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times: CPU time = 47 ms, elapsed time = 43 ms.

ManInMoon
  • 345
  • 1
  • 4
  • 9
  • 1
    The execution plan looks pretty optimal and the majority of the time is spent transferring the data to the client and waiting for the client to process it. Do you really need *? You already know the value for symbol as that is in the initial query and you are returning it in all 166,000 rows. – Martin Smith Mar 04 '17 at 18:05
  • @MartinSmith Actually, you put me on the right path! I was using LIKE to find [Date], when I changed to = the query ran very much faster. – ManInMoon Mar 07 '17 at 08:37

1 Answers1

8

Your where condition that checks for equality for utcdt converted to a date is SARGable, but uses a dynamic seek.

The problem is that these hidden implicit conversions can result in inaccurate cardinality and distribution estimations at any stage of the plan. So, even if you get a seek, the plan might be way off overall. - Dynamic Seeks and Hidden Implicit Conversions - Paul White

Try converting your where condition to an explicit range and see if there is any improvement:

select *
from Tick.X_H
where Symbol = 'donkey'
  and utcdt >= convert(datetime2(7),'20110102')
  and utcdt <  convert(datetime2(7),'20110103')

Based on this answer by Martin Smith the explicit range may reduce unnecessary reads.

References:

SqlZim
  • 2,420
  • 1
  • 11
  • 22