Tuesday, September 19, 2006

When Eliminataing Cursors Hurts Performance

Not every non-cursor so9lution is performant. I think there are at least two different situations:

1. You open a cursor and use it to modify the underlying table.
2. You open a cursor, do something, but you do not touch the underlying
at all.


I think that we need to distinguish between these two cases. I think
the popular rule of thumb that "cursors should be avoided whatever it
takes" not always applies in case 2.
Have a look at this simple script (Table Numbers stored numbers from 1
to 10000):


DROP PROCEDURE CursorTest
go
CREATE PROCEDURE CursorTest
AS
DECLARE @KeepLooping INT, @n INT, @d1 DATETIME, @d2 DATETIME
DECLARE Test1 CURSOR LOCAL FORWARD_ONLY
FOR SELECT Number
FROM dbo.Numbers
ORDER BY Number


SET @KeepLooping = 1


SET @d1 = GETDATE()


OPEN Test1


WHILE @KeepLooping = 1 BEGIN
FETCH Test1 INTO @n
IF @@FETCH_STATUS <> 0 BEGIN
SET @KeepLooping = 0
END
END


SET @d2 = GETDATE()
SELECT 'CursorTest' testname, DATEDIFF(ms, @d1, @d2) duration


CLOSE Test1
DEALLOCATE Test1
go
DROP PROCEDURE LoopTest
go
CREATE PROCEDURE LoopTest
AS
DECLARE @n INT, @d1 DATETIME, @d2 DATETIME


SET @n = -1
SET @d1 = GETDATE()


WHILE @n IS NOT NULL BEGIN
SELECT @n = MIN(Number)
FROM dbo.Numbers WHERE Number > @n
END
SET @d2 = GETDATE()
SELECT 'LoopTest' testname, DATEDIFF(ms, @d1, @d2) duration
go
EXEC LoopTest
EXEC CursorTest
EXEC LoopTest
EXEC CursorTest
EXEC LoopTest
EXEC CursorTest


testname duration
-------- -----------
LoopTest 30000


(1 row(s) affected)


testname duration
---------- -----------
CursorTest 106


(1 row(s) affected)


testname duration
-------- -----------
LoopTest 32190


(1 row(s) affected)


testname duration
---------- -----------
CursorTest 106


(1 row(s) affected)


testname duration
-------- -----------
LoopTest 29736


(1 row(s) affected)


testname duration
---------- -----------
CursorTest 106


(1 row(s) affected)

5 Comments:

At 8:19 PM, Blogger Alex Kuznetsov said...

Scott, set based solutions usually perform much better. In 2005 there is even less need to use cursors because some OLAP functions are finally available. Yet in some cases cursors are usefuls. More to the point, in some cases non-cursor solutions perform much worse. I did not need to craft an example, I saw that approach with a WHILE loop perform extremely slowly in practice. And yes, as the number of iterations increases, the WHILE loop performs worse and worse, at least to a certain point.

 
At 7:38 PM, Anonymous Anonymous said...

First, I agree... if a set based solution can be achieved without involving a triangular join, the set based solution will win both the performance race and the resource usage race.

Second, I'm not sure what type of machine Alex ran his code on nor what the condition of his numbers table actually is, but when I created a 100,000 row numbers table with a Clustered PK on the Numbers column, as it should be, the While loop beat the pants off the cursor so far as simple duration goes.

Here's the results...
testname duration
-------- -----------
LoopTest 2640

(1 row(s) affected)

testname duration
---------- -----------
CursorTest 3173

(1 row(s) affected)

testname duration
-------- -----------
LoopTest 2436

(1 row(s) affected)

testname duration
---------- -----------
CursorTest 3360

(1 row(s) affected)

testname duration
-------- -----------
LoopTest 2623

(1 row(s) affected)

testname duration
---------- -----------
CursorTest 3236

(1 row(s) affected)

 
At 6:12 AM, Blogger Alex Kuznetsov said...

I agree with Jeff: there was definitely something wrong with my Numbers table. When I built it from script and reran the benchmark my results were consistent with Jeff's. However, a slight modification (declaring the cursor as STATIC ) sped up the cursor solution to the first place:

CREATE PROCEDURE CursorTest
AS
DECLARE @KeepLooping INT, @n INT, @d1 DATETIME, @d2 DATETIME
DECLARE Test1 CURSOR STATIC LOCAL FORWARD_ONLY
FOR SELECT Number
FROM dbo.Numbers
ORDER BY Number

(snip)

testname duration
-------- -----------
LoopTest 783

(1 row(s) affected)

testname duration
---------- -----------
CursorTest 466

(1 row(s) affected)

testname duration
-------- -----------
LoopTest 750

(1 row(s) affected)

testname duration
---------- -----------
CursorTest 470

(1 row(s) affected)

testname duration
-------- -----------
LoopTest 763

(1 row(s) affected)

testname duration
---------- -----------
CursorTest 456

(1 row(s) affected)

Thanks for the feedback, Jeff!

 
At 2:16 AM, Anonymous Anonymous said...

Hi Alex,

In SQL 2005, if I have no set-based alternative to cursors, I would use a .net solution (i.e. CLR integration). I haven't tested it but I assume it would run faster than a cursor (when the cursor actually DOES something in the iteration), because procedural languages are more fit to do manipulations of this kind.

I don't think that your test results will make me change my mind regarding cursors because you replaced a cursor with an unuseful set operation. I would normally replace the cursor with a single query and an insert/update/delete based on it, rather than a query in a loop, which would definitely be bad as a cursor is.

Regards,
S. Neumann

 
At 5:48 PM, Blogger Alex Kuznetsov said...

Yes well this post was written in respoce to the following article:
"How to Perform SQL Server Row-by-Row Operations Without Cursors"

http://www.sql-server-performance.com/dp_no_cursors.asp

which was referrred in the foillowing thread:

http://groups.google.com/group/microsoft.public.sqlserver.programming/browse_frm/thread/80451379077d62f6/ff25f5b9ad785e99?lnk=gst&q=CursorTest+Kuznetsov&rnum=1#ff25f5b9ad785e99

The while loop as described in that article does not outperform a properly written cursor. So I objected to the WHILE loop being described as a better alternative than a cursor. I was not comparing cursors to other set based solutions.

 

Post a Comment

<< Home