I'm testing different architectures for large tables and one suggestion that I've seen is to use a partitioned view, whereby a large table is broken into a series of smaller, "partitioned" tables.
In testing this approach, I've discovered something that doesn't make a whole lot of sense to me. When I filter on "partitioning column" on the fact view, the optimizer only seeks on the relevant tables. Additionally, if I filter on that column on the dimension table, the optimizer eliminates unnecessary tables.
However, if I filter on some other aspect of the dimension the optimizer seeks on the PK/CI of each base table.
Here are the queries in question:
select
od.[Year],
AvgValue = avg(ObservationValue)
from dbo.v_Observation o
join dbo.ObservationDates od
on o.ObservationDateKey = od.DateKey
where o.ObservationDateKey >= 20000101
and o.ObservationDateKey <= 20051231
group by od.[Year];
select
od.[Year],
AvgValue = avg(ObservationValue)
from dbo.v_Observation o
join dbo.ObservationDates od
on o.ObservationDateKey = od.DateKey
where od.DateKey >= 20000101
and od.DateKey <= 20051231
group by od.[Year];
select
od.[Year],
AvgValue = avg(ObservationValue)
from dbo.v_Observation o
join dbo.ObservationDates od
on o.ObservationDateKey = od.DateKey
where od.[Year] >= 2000 and od.[Year] < 2006
group by od.[Year];
Here's a link to the SQL Sentry Plan Explorer session.
I'm working on actually partitioning the larger table to see if I get partition elimination to respond in a similar fashion.
I do get partition elimination for the (simple) query that filters on an aspect of the dimension.
In the meantime, here's a stats-only copy of the database:
https://gist.github.com/swasheck/9a22bf8a580995d3b2aa
The "old" cardinality estimator gets a less expensive plan, but that's because of the lower cardinality estimates on each of the (unnecessary) index seeks.
I'd like to know if there's a way to get the optimizer to use the key column when filtering by another aspect of the dimension so that it can eliminate seeks on irrelevant tables.
SQL Server Version:
Microsoft SQL Server 2014 - 12.0.2000.8 (X64)
Feb 20 2014 20:04:26
Copyright (c) Microsoft Corporation
Developer Edition (64-bit) on Windows NT 6.3 <X64> (Build 9600: ) (Hypervisor)






CREATE STATISTICS [_WA_Sys_00000008_2FCF1A8A] ON [dbo].[Observation_2010]([StationStateCode]) WITH STATS_STREAM = 0x01000000010000000000000000000000D4531EDB00000000D5080000000000009508000000000000AF030000AF000000020000000000000008D000340000000007000000E65DE0007DA5000076F9780000000000867704000000000000000000ABAAAA3C0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000– Kin Shah Jan 06 '16 at 00:16ObservationDatestable. I'm not getting the same plan as Paul, even with 4199, and I think this is why. – Geoff Patterson Jan 06 '16 at 14:30however, as Kin noted, the last stats stream is corrupted :/
– swasheck Jan 06 '16 at 15:17ObservationDates. I ended up runningUPDATE STATISTICS ObservationDates WITH ROWCOUNT = 10000manually in order to get the plan that Paul demonstrated though. – Geoff Patterson Jan 06 '16 at 15:22ObservationDatesso i'm not sure what's going on with that. also, i'm not able to get the plan paul generated either. i'll try the update to see. – swasheck Jan 06 '16 at 15:28