Partitioning

In this post I would like to continue with partitioning series. Previous post we created partitioned view and look how it looks like in execution plan. In this post we will create partitioned table.

In case of simple partitioned table, we will need to do following:

  • Create table
  • Create partitioning function
  • Create partitioned schema
  • Relationship between partitioned schema and the table
  • Let’s check created partition
  • Let’s verify partition query in execution plan

As in the partitioned view post there will be created 3 partitions within the same filegroup (without separated disks per partitioned). This solution will not profit from parallel reading from partitions separated on different disks. For simple demonstration it will fit.

Let’s create the table first

DROP TABLE [dbo].[PartitionedTable]
CREATE TABLE [dbo].[PartitionedTable] (id INT NOT NULL , booking_date DATE NOT NULL, data  SYSNAME)

ALTER TABLE [dbo].[PartitionedTable] ADD PRIMARY KEY CLUSTERED ( [id] ASC, [booking_date] ASC)

Create partitioned function. It defines partition data according to following boundaries 

<
'2019-10-01' , >= '2019-10-01' <
'2019-11-01' , >= '2019-11-01' <
'2019-12-01', >='2019-12-01'.

The side of boundaries values is defined by LEFT or RIGHT key word.

CREATE PARTITION FUNCTION PartitionedFunction (DATE)
AS RANGE  RIGHT FOR VALUES ('2019-10-01', '2019-11-01', '2019-12-01');

Create partition scheme to specify which partition of table or index belongs to which filegroup. All partitions are defined for Primary filegroup in our case.

CREATE PARTITION SCHEME PartitionedScheme 
AS PARTITION PartitionedFunction ALL TO ([PRIMARY])

Now let’s create relationship between table and partitioned schema.

ALTER TABLE [dbo].[PartitionedTable] DROP CONSTRAINT PK_PartitionedTable__booking_date

ALTER TABLE [dbo].[PartitionedTable] ADD CONSTRAINT PK_PartitionedTable__booking_date PRIMARY KEY CLUSTERED ([booking_date] ASC) ON [PartitionedScheme]([booking_date])

Here we have few queries to check our partitioned objects.

Let’s check connection between partition schema and partitioned table.

SELECT * 
FROM sys.tables AS t
JOIN sys.indexes AS i ON t.[object_id] = i.[object_id] AND i.[type] IN (0,1)
JOIN sys.partition_schemes ps ON i.data_space_id = ps.data_space_id
WHERE t.name = 'PartitionedTable';

View to check created partitions.

SELECT t.name AS TableName,
i.name AS IndexName,
p.partition_number,
p.partition_id,
i.data_space_id,
f.function_id,
f.type_desc,
r.boundary_id,
r.value AS BoundaryValue
FROM      sys.tables AS t
JOIN      sys.indexes AS i           ON t.object_id = i.object_id
JOIN     sys.partitions  AS p      ON i.object_id = p.object_id AND i.index_id = p.index_id
JOIN      sys.partition_schemes AS s ON i.data_space_id = s.data_space_id
JOIN      sys.partition_functions AS f ON s.function_id = f.function_id
LEFT JOIN sys.partition_range_values AS r ON f.function_id = r.function_id and r.boundary_id = p.partition_number
WHERE t.name = 'PartitionedTable' AND i.type <= 1
ORDER BY p.partition_number;

You can check partitioned column as well with following query.

SELECT
t.[object_id] AS ObjectID
, t.name AS TableName
, ic.column_id AS PartitioningColumnID
, c.name AS PartitioningColumnName
FROM sys.tables         AS t
JOIN sys.indexes         AS i ON t.[object_id] = i.[object_id] AND i.[type] <= 1 -- clustered index or a heap
JOIN sys.partition_schemes AS ps      ON ps.data_space_id = i.data_space_id
JOIN sys.index_columns AS ic ON ic.[object_id] = i.[object_id] AND ic.index_id = i.index_id AND ic.partition_ordinal >= 1 -- because 0 = non-partitioning column
JOIN sys.columns         AS c ON t.[object_id] = c.[object_id] AND ic.column_id = c.column_id
WHERE t.name = 'PartitionedTable'
;

Check Execution plan

If we run following query check execution plan how it is look like.

SELECT * FROM [dbo].[PartitionedTable] WHERE booking_date >=
CAST ('2019-11-01' AS DATE) AND booking_date <= CAST ('2019-11-30' AS DATE)
Execution plan - Actual partition count
Execution plan – Actual partition count

We can see Actual Partition Count = 1 which is ok since we would like to get data just for one moth= one partition in our case.

Let’s notice RangePartitionNew function which contains ranges from all defined partitions. I would expect to see our predicate values, so why optimizer show this? The reason is that we use simple predicate in our query which leads to simple parametrization.

SELECT * FROM [dbo].[PartitionedTable] WHERE booking_date >= CAST ('2019-11-01' AS DATE) AND booking_date <= CAST ('2019-11-30' AS DATE)

Modify the query as follows

SELECT * FROM [dbo].[PartitionedTable] WHERE booking_date >= CAST ('2019-11-01' AS DATE) AND booking_date <= CAST ('2019-11-30'  AS DATE) AND 1<>2.

It eliminates simple parametrization and we get in execution plan what we expected, see plan bellow.

Execution plan  - Partition predicate
Execution plan – Partition predicate

Let’s try to modify query now u can see in execution plan that partition count value is 2.

SELECT * FROM [dbo].[PartitionedTable] WHERE booking_date >= CAST ('2019-11-01' AS DATE) AND booking_date < CAST ('2019-12-01' AS DATE)
Execution plan - Actual Partition Count
Execution plan – Actual Partition Count

It is important how the predicate is set. Since above mentioned query touches two partitions instead of 1.

That was very quick introduction to SQL Sever partitioning.

It would be nice to look at partitioning little bit deeper in one of my next post and try to compare some scenarios with partitioned views approach or how RangePartitionNew function works internally.

Stay tuned.

Leave a Reply

Your email address will not be published. Required fields are marked *