Paging is difficult in SQL Server 2000 for the following reasons:
-You cannot use a variable with TOP
-You cannot specify an offset with TOP
Here are a few common approaches:
1) Nested tops. Extract 1..slice_end ORD ASC, select 1..slice_size ORD DESC, select ORD ASC
-This results in sorting the data 3x, and requires dynamic generation.
2) select the top @slice_size, that aren't in the set (1..@slice_size-1)
-This performs badly because of the subselect, and requires dynamic generation
3) Select based on one or several sort columns.
-Set ROWCOUNT to @slice_start, record the values for each sort columns (must result in a unique entry)
-Set ROWCOUNT to @slice_size, and set the WHERE clause to use the values recorded in the previous step
-This solution performs well, but becomes difficult with dynamic sorting
4) Do a self-join and select based on the count() of rows that follow this one.
-Very inefficient for large datasets
A few useful things to remember:
- 'SET ROWCOUNT' does allow variables.
- You can use 'SET ROWCOUNT' to restrict the number of rows affected by INSERT, UPDATE, DELETE
With these things in mind I came up with the following strategies:
1) Strategy1:
For start 5, slice size 10, end 15:
Select the results from 1..15 into @paging_buffer
Assign a sort_idx to each row
Select the rows with a sort_idx > @slice_start
2) Strategy 2
For start 5, slice size 10, end 15:
Select the results from 1..15 into @paging_buffer
Set the ROWCOUNT to @slice_start -1
Delete from @paging_buffer
Strategy 1 took 18 seconds for my sample dataset of 5700 records, mostly due to the
iteration for assigning the sort_idx. There isn't a clean way to handle this that I found unfortunately.
Strategy 2 took 1 second for my sample dataset of 5700 records, and would be lower without some of the computation in the initial loop.
An important note with strategy 2 is that if the start_slice = 1 you do not want to set the ROWCOUNT to 0 before deletion, as this will purge all the records from the buffer. This is bad.
It's also recommended that you not handle this case with an IF clause, as this will cause recompilation for each run.
Do this instead:
DELETE FROM @paging_buffer where @discard_up_to > 0
Example implementation:
CREATE PROCEDURE [dbo].[paging_example]
@slice_size int = 0, -- ROWCOUNT 0 will return all results
@slice_start int = 1
AS
-- Create a buffer for the records from 1..@slice_end
declare @paging_buffer table(
user_id int,
firstname varchar(50),
lastname varchar(50)
);
declare @slice_end int
set @slice_end = @slice_start + @slice_size - 1
-- SELECT from 1 til the end of the slice into @paging_buffer
SET ROWCOUNT @slice_end
insert into @paging_buffer
SELECT
user_id, firstname,lastname
FROM Users
ORDER BY lastname
--This approach will delete the rows we don't need from the buffer
-- If we're starting at #1 then we don't need to do anything special
-- Delete will go in order, so if we delete up to the record before the start of the slice then only the desired records will remain
DECLARE @discard_up_to int
SET @discard_up_to = @slice_start - 1
SET ROWCOUNT @discard_up_to
-- If the start_slice is 1, then setting the rowcount to 0 will result in a complete delete - we don't want this
-- But we also do not want to prevent if with an IF clause, because when a stored procedure uses the 'IF' clause it will be recompiled before each run.
-- So instead we check "where @discard_up_to > 0" in the delete clause
DELETE FROM @paging_buffer where @discard_up_to > 0
SET ROWCOUNT 0
select * from @paging_buffer
GO
EXEC paging_example @slice_start = 50, @slice_size = 10