Thursday, November 17, 2011

Insert 1 Million Rows - How to improve performance

One day a colleague asked me about a performance issue in a stored procedure. She wanted to insert 1 million rows into a table using the procedure, but the insert was much slower than she expected.
So here is a T-SQL sample that inserts 1 million rows. To keep the example simple, I used plain T-SQL statements rather than a stored procedure.

-- MyTable is not defined in the original post; this shape is an assumption
-- inferred from the VALUES list below:
-- CREATE TABLE MyTable (Id int, Name nvarchar(10), City nvarchar(50),
--                       Zip nvarchar(10), Descr nvarchar(50), CategoryId int)

SET NOCOUNT ON   -- suppress the "1 row affected" message for each INSERT

DECLARE @i int
DECLARE @max int
DECLARE @start datetime
DECLARE @end datetime

SET @max = 1000000
SET @i = 1
SELECT @start = GETDATE()

WHILE @i <= @max   -- <= so that exactly 1 million rows are inserted
BEGIN
   INSERT MyTable VALUES (@i, N'ABC', N'Seattle', N'98144', N'Desc', 101)
   SET @i = @i + 1
END

SELECT @end = GETDATE()
SELECT CONVERT(varchar, @end - @start, 108)   -- elapsed time as hh:mm:ss

When I ran this script on my machine, it took about 2 minutes. So how can we improve its performance?
I added BEGIN TRAN and COMMIT TRAN statements before and after the WHILE loop:
BEGIN TRAN
WHILE @i <= @max
BEGIN
   INSERT MyTable VALUES (@i, N'ABC', N'Seattle', N'98144', N'Desc', 101)
   SET @i = @i + 1
END
COMMIT TRAN


Now the run took about 18 seconds. That is a pretty big difference.
The reason is that in the first case SQL Server runs each INSERT in its own autocommit transaction, so it creates, commits, and hardens a transaction to the log for every single row. In the second case the transaction is created once and reused until the final COMMIT. When the insertion count is low this overhead is not a big deal, but for a large number of inserts it matters. A lot.
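If a single million-row transaction is too large for your transaction log, or you want to limit how long locks are held, a common middle ground is to commit in batches. This is a sketch I'm adding beyond the original example, and the batch size of 10,000 is an arbitrary assumption you would tune for your own workload:

```sql
-- Sketch: commit every @batch rows instead of once at the very end.
-- @batch = 10000 is an assumed value, not from the original post.
DECLARE @i int = 1, @max int = 1000000, @batch int = 10000

BEGIN TRAN
WHILE @i <= @max
BEGIN
   INSERT MyTable VALUES (@i, N'ABC', N'Seattle', N'98144', N'Desc', 101)
   IF @i % @batch = 0   -- finished a batch: commit it and open a new transaction
   BEGIN
      COMMIT TRAN
      BEGIN TRAN
   END
   SET @i = @i + 1
END
COMMIT TRAN   -- commit whatever remains of the last partial batch
```

You still get most of the speedup of a single transaction, because the per-row commit overhead is amortized over each batch rather than paid on every INSERT.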