When Eliminataing Cursors Hurts Performance
Not every non-cursor so9lution is performant. I think there are at least two different situations:
1. You open a cursor and use it to modify the underlying table.
2. You open a cursor, do something, but you do not touch the underlying
at all.
I think that we need to distinguish between these two cases. I think
the popular rule of thumb that "cursors should be avoided whatever it
takes" not always applies in case 2.
Have a look at this simple script (Table Numbers stored numbers from 1
to 10000):
DROP PROCEDURE CursorTest
go
CREATE PROCEDURE CursorTest
AS
DECLARE @KeepLooping INT, @n INT, @d1 DATETIME, @d2 DATETIME
DECLARE Test1 CURSOR LOCAL FORWARD_ONLY
FOR SELECT Number
FROM dbo.Numbers
ORDER BY Number
SET @KeepLooping = 1
SET @d1 = GETDATE()
OPEN Test1
WHILE @KeepLooping = 1 BEGIN
FETCH Test1 INTO @n
IF @@FETCH_STATUS <> 0 BEGIN
SET @KeepLooping = 0
END
END
SET @d2 = GETDATE()
SELECT 'CursorTest' testname, DATEDIFF(ms, @d1, @d2) duration
CLOSE Test1
DEALLOCATE Test1
go
DROP PROCEDURE LoopTest
go
CREATE PROCEDURE LoopTest
AS
DECLARE @n INT, @d1 DATETIME, @d2 DATETIME
SET @n = -1
SET @d1 = GETDATE()
WHILE @n IS NOT NULL BEGIN
SELECT @n = MIN(Number)
FROM dbo.Numbers WHERE Number > @n
END
SET @d2 = GETDATE()
SELECT 'LoopTest' testname, DATEDIFF(ms, @d1, @d2) duration
go
EXEC LoopTest
EXEC CursorTest
EXEC LoopTest
EXEC CursorTest
EXEC LoopTest
EXEC CursorTest
testname duration
-------- -----------
LoopTest 30000
(1 row(s) affected)
testname duration
---------- -----------
CursorTest 106
(1 row(s) affected)
testname duration
-------- -----------
LoopTest 32190
(1 row(s) affected)
testname duration
---------- -----------
CursorTest 106
(1 row(s) affected)
testname duration
-------- -----------
LoopTest 29736
(1 row(s) affected)
testname duration
---------- -----------
CursorTest 106
(1 row(s) affected)
5 Comments:
Scott, set based solutions usually perform much better. In 2005 there is even less need to use cursors because some OLAP functions are finally available. Yet in some cases cursors are usefuls. More to the point, in some cases non-cursor solutions perform much worse. I did not need to craft an example, I saw that approach with a WHILE loop perform extremely slowly in practice. And yes, as the number of iterations increases, the WHILE loop performs worse and worse, at least to a certain point.
First, I agree... if a set based solution can be achieved without involving a triangular join, the set based solution will win both the performance race and the resource usage race.
Second, I'm not sure what type of machine Alex ran his code on nor what the condition of his numbers table actually is, but when I created a 100,000 row numbers table with a Clustered PK on the Numbers column, as it should be, the While loop beat the pants off the cursor so far as simple duration goes.
Here's the results...
testname duration
-------- -----------
LoopTest 2640
(1 row(s) affected)
testname duration
---------- -----------
CursorTest 3173
(1 row(s) affected)
testname duration
-------- -----------
LoopTest 2436
(1 row(s) affected)
testname duration
---------- -----------
CursorTest 3360
(1 row(s) affected)
testname duration
-------- -----------
LoopTest 2623
(1 row(s) affected)
testname duration
---------- -----------
CursorTest 3236
(1 row(s) affected)
I agree with Jeff: there was definitely something wrong with my Numbers table. When I built it from script and reran the benchmark my results were consistent with Jeff's. However, a slight modification (declaring the cursor as STATIC ) sped up the cursor solution to the first place:
CREATE PROCEDURE CursorTest
AS
DECLARE @KeepLooping INT, @n INT, @d1 DATETIME, @d2 DATETIME
DECLARE Test1 CURSOR STATIC LOCAL FORWARD_ONLY
FOR SELECT Number
FROM dbo.Numbers
ORDER BY Number
(snip)
testname duration
-------- -----------
LoopTest 783
(1 row(s) affected)
testname duration
---------- -----------
CursorTest 466
(1 row(s) affected)
testname duration
-------- -----------
LoopTest 750
(1 row(s) affected)
testname duration
---------- -----------
CursorTest 470
(1 row(s) affected)
testname duration
-------- -----------
LoopTest 763
(1 row(s) affected)
testname duration
---------- -----------
CursorTest 456
(1 row(s) affected)
Thanks for the feedback, Jeff!
Hi Alex,
In SQL 2005, if I have no set-based alternative to cursors, I would use a .net solution (i.e. CLR integration). I haven't tested it but I assume it would run faster than a cursor (when the cursor actually DOES something in the iteration), because procedural languages are more fit to do manipulations of this kind.
I don't think that your test results will make me change my mind regarding cursors because you replaced a cursor with an unuseful set operation. I would normally replace the cursor with a single query and an insert/update/delete based on it, rather than a query in a loop, which would definitely be bad as a cursor is.
Regards,
S. Neumann
Yes well this post was written in respoce to the following article:
"How to Perform SQL Server Row-by-Row Operations Without Cursors"
http://www.sql-server-performance.com/dp_no_cursors.asp
which was referrred in the foillowing thread:
http://groups.google.com/group/microsoft.public.sqlserver.programming/browse_frm/thread/80451379077d62f6/ff25f5b9ad785e99?lnk=gst&q=CursorTest+Kuznetsov&rnum=1#ff25f5b9ad785e99
The while loop as described in that article does not outperform a properly written cursor. So I objected to the WHILE loop being described as a better alternative than a cursor. I was not comparing cursors to other set based solutions.
Post a Comment
<< Home