We have all been taught the evils of using cursors, but do cursors have a place in our T-SQL arsenal?
Are cursors always a bottleneck and is it simply better to replace a cursor with a while loop? Clearly, the performance problems are because the code is not doing things in a set based manner, but rather by processing row by agonizing row (RBAR) one at a time using a dreaded cursor. Of course this approach is not the way to go when writing SQL code; this cursor should eliminated and replaced with a cleaner, more efficient (and more socially acceptable!) solution.
So this begs the question “How we do optimize cursor operations?” The commonly suggested approach is to eliminate the CURSOR by replacing it with a WHILE loop. Therefor instead of declaring a CURSOR to loop through the table, we now are using “set-based” code and our problem solved, and obviously it will perform much better. The cursor is gone, the code looks much cleaner, its been tested it and it works properly, so off to production it goes. Another cursor has been busted!, RIght?
Wrong, In reality this does not guarantee success, or even vastly improved performance. Eliminating cursors is not about syntax and replacing cursors with WHILE loops that do the same thing. Its about optimizing your process. Optimizing and\or replacing cursors involves much more then a simple find and replace Optimization involves analyzing the code and understanding exactly what is happening when each row is processed. You need to find out what the “DoSomething” is actually doing.
Suppose the DoSomething procedure is generating a report and sending an email to the “Product Manager” for each product that contains status information, and then logging this email message into a table somewhere. In this scenario, what is gained by replacing the CURSOR with a While Loop?
NOT MUCH! This is due to the fact that the task at hand, we may very well need to process rows in the Product table one-by-one to send our emails and generate the report, and the bottleneck here is not the cursor code at all, but rather the report generation and maybe sending the email. Eliminating the cursor code probably gains us nothing here. If you need to process rows one at a time, go ahead and use a cursor — that’s what they are there for! Replacing a perfectly fine, simple cursor with a WHILE loop might even make your code longer, or more confusing, or even less efficient depending on circumstances.
For example, what if we need to process the Products ordered by Region, then Product Name, for whatever reason. Our cursor code is simple:
set @Products = cursor for
order by Region, ProductName
All that we needed to change was our ORDER BY clause. Now, how would we write this as a WHILE loop? Is it possible? Sure. Will it be as simple and clean as using a cursor? No, it won’t. (Though ROW_COUNT() makes this much easier than it used to be)
Now, I am not here to say that cursors are “good”, but if you really need to process rows one by one, go ahead and proudly use a cursor. Replacing cursors isn’t about processing rows one-by-one in a different way (i.e., using a WHILE loop instead), it is about not processing rows one-by-one at all!
Let’s consider another scenario: What if the DoSomething stored procedure is checking to see if the Product’s ExpireDate is greater than today’s date, and if so, it is updates the Status column for that Product to ‘X’.
In that situation, what have we gained by rewriting ProcessProducts without a cursor, and using a WHILE loop instead? The answer is, once again: nothing! In fact, we potentially have once again made our code more confusing or even less efficient than a cursor might be! Remember, the bottleneck isn’t the cursor syntax — it is the fact that we are processing rows one at a time. Replacing the cursor with the WHILE loop didn’t solve this problem, did it?
So, looking now at both of the scenarios I presented for the DoSomething stored procedure, it should be clear that we did not fix anything by replacing the cursor in either case simply by writing a WHILE loop. If that’s all you are doing, don’t bother replacing the cursor at all. You haven’t optimized anything.
As I said before, the art of replacing a cursor is not a find-and-replace syntax change operation — it is a fundamental change in how you process your data. As in the Product report generation and email example, it may be that we simply need to process rows one by one, and thus no further optimization is possible from a SQL point of view. In situations like updating the Product table, however, we do not need to process the rows individually — we can do everything in one single UPDATE statement. Thus, in order to determine how to optimize the ProcessProducts stored procedure, we needed to dig deeper into entire process as a whole, which included examining the DoSomething stored procedure and determining the full scope of exactly what this “ProcessProducts” stored procedure is doing.
So, if “DoSomething” is updating the Products table as specified, we now know that a good replacement for our cursor code doesn’t result in a WHILE loop and calling a separate stored procedure over and over at all — it results a true, set-based solution:
create procedure ProcessProducts
Update Products set Status=’X’ where ExpireDate > getdate()
And THAT is how you optimize a cursor! No loops, no calling of another stored procedure for each row in a table, no “find-and-replace” cursor code removal. You need to examined the entire process, and optimize the entire process, to get it done quicker and shorter and faster without cursors or loops.
Cursor optimization is not about replacing a cursor with a while loop and rewriting syntax, it is about redesigning your algorithm.