6

I have a simple query and table, and I would like to know which indexing is efficient on this kind of table and query.

In my table I have 3 columns

CREATE TABLE mYTable(ipFrom BIGINT, ipto BIGINT, url NVARCHAR(255))

and I am running this simple query.

SELECT url 
FROM MyTable 
WHERE ipto <= somevalue AND ipfrom >= somevalue

I have also created indexes on all 3 columns clustered on ipFrom and non-clustered on rest of 2. But this query giving me really poor performance in terms of CPU & reads.

Any suggestions.


What i have implemented is i like to redirect user based on the IP address. I have stored several IP ranges from different areas and states and redirect users on the basis of their IP to appropriate URL.

Yes i think i inserted the data in wrong manner in both columns(and will do column rename later), but point here is to minimize the CPU.

When i look into the execution plan there it converts data in where clause, i dont know why it is converting data in where clause. There is something like this

|--Clustered Index Seek(OBJECT:([T].[TC]), SEEK:([T].[C] > Convert([@V])

Jack Douglas
  • 39,869
  • 15
  • 101
  • 176
Ashish
  • 727
  • 2
  • 14
  • 25
  • 2
    What is the table's PK? Can there be more than one match for a given IP? Also shouldn't it be the other way round? ipto >= somevalue AND ipFrom <= somevalue – Martin Smith Mar 13 '12 at 07:46
  • 1
    Is this somevalue BETWEEN ipfrom and ipto (as per @MartinSmith or somevalue NOT BETWEEN ipfrom and ipto (as per your question)? – gbn Mar 13 '12 at 09:12
  • Can you give some idea about the distribution of (ipto-ipFrom) in your data, in other words, is the data random or are some range sizes more common that others? – Jack Douglas Mar 13 '12 at 09:41
  • What i have implemented is i like to redirect user based on the IP address. I have stored several IP ranges from different areas and states and redirect users on the basis of their IP to appropriate URL. Yes i think i inserted the data in wrong manner in both columns(and will do column rename later), but point here is to minimize the CPU. When i look into the execution plan there it converts data in where clause, i dont know why it is converting data in where clause. There is something like this |--Clustered Index Seek(OBJECT:([T].[TC]), SEEK:([T].[C] > Convert([@V]) – Ashish Mar 13 '12 at 10:51
  • It is because it is converting @V to bigint from int probably but that isn't the source of your problem. The problem is that the seek is only bounded on one end and so on average it will scan half the index testing every row meeting the >@V criteria to see if it meets the < criteria (presumably most of them won't). Jack's solution should work for you in minimising the number of rows scanned. – Martin Smith Mar 13 '12 at 10:54

2 Answers2

6

I'm assuming there is a typo in the question as per Martin's comment.

A problem with a single range seek on either ipFrom or ipTo is that on average it will need to seek half the table. An alternative approach is

1) Add a computed column, on something like:

 ALTER TABLE MyTable
  ADD granule 
AS CASE WHEN (ipTo-ipFrom)<16 THEN 1 WHEN (ipTo-ipFrom)<256 THEN 2 ELSE 3 END;

2) Index the computed column and ipFrom

CREATE ClUSTERED INDEX IX ON MyTable(granule, ipFrom ); 

3) Query like this:

SELECT url
FROM   MyTable
WHERE  granule = 1
       AND ipFrom BETWEEN @somevalue - 16 AND @somevalue
       AND ipTo >= @somevalue
UNION ALL
SELECT url
FROM   MyTable
WHERE  granule = 2
       AND ipFrom BETWEEN @somevalue - 256 AND @somevalue
       AND ipTo >= @somevalue
UNION ALL
SELECT url
FROM   MyTable
WHERE  granule = 3
       AND ipFrom <= @somevalue
       AND ipTo >= @somevalue;

Precisely how you define the 'granule' function will depend on the spead of ipto-ipfrom in your data.

Jack Douglas
  • 39,869
  • 15
  • 101
  • 176
1

This matches your current logic (untested), but see my comment on question.
What I've done is reverse the filter to exclude unwanted rows: now you can index ipto and ipfrom separately (of course you need url too in the index unless you have a PK that can substitute)

I'd suggest this for SQL Server 2005+

SELECT url 
FROM MyTable 

EXCEPT
SELECT url 
FROM MyTable 
WHERE ipto > somevalue

EXCEPT
SELECT url 
FROM MyTable 
WHERE ipfrom < somevalue

However, for SQL Server 2000

SELECT url 
FROM MyTable M
WHERE
   NOT EXISTS (SELECT *
        FROM MyTable M2
        WHERE ipto > somevalue AND M.url = M2.url)
   AND
   NOT EXISTS (SELECT *
        FROM MyTable M3
        WHERE pfrom < somevalue AND M.url = M3.url)
gbn
  • 69,809
  • 8
  • 163
  • 243