Insert exec

I was thinking about difference between INSERT and INSERT-EXEC statement from performance perspective since I saw lots of post announcing that you should avoid the second one mentioned (INSERT-EXEC). I decided to make few tests and look at this problematic from more angles to have better imagination about how these commands behave.

Testing scenarios:

  • Execution plan
  • IO stats and time stats
  • Extended events – statement completed event – row count
  • Transaction log behaviour
  • Transaction scope

Let’s create data structures for our tests:

First create tables for data insert

 CREATE TABLE dbo.InsertTable1 (id INT , [data] VARCHAR(255))
CREATE TABLE dbo.InsertTable2 (id INT , [data] VARCHAR(255))

Create two persistent temporary tables we will fill inside procedures we create:

CREATE TABLE dbo.TempTable1 (id INT , [data] VARCHAR(255))
CREATE TABLE dbo.TempTable2 (id INT, [data] VARCHAR(255))

Create objects we use to test:

The first one is stored procedure inserting data to InsertTable1 table – insert statement is part of stored procedure definition.

 CREATE PROCEDURE dbo.InsertData
   AS
INSERT INTO dbo.TempTable1
SELECT TOP 100000 a.object_id,REPLICATE('a',10 ) a
  FROM sys.objects a
  JOIN sys.objects b ON 1=1

 INSERT INTO dbo.TempTable2
 SELECT TOP 100000 a.object_id,REPLICATE('a',10 ) a
  FROM sys.objects a
  JOIN sys.objects b ON 1=1 

  INSERT INTO dbo.InsertTable1
  SELECT * FROM dbo.TempTable1
  UNION ALL
  SELECT * FROM dbo.TempTable2

The second one batch inserts data to InsertTable2 table – insert is realized by INSERT – EXEC statement. 

EXEC ('INSERT INTO dbo.TempTable1 SELECT TOP 100000 a.object_id,REPLICATE('a',10 ) a 
FROM sys.objects a
JOIN sys.objects b ON 1=1

INSERT INTO dbo.TempTable2

SELECT TOP 100000  a.object_id,REPLICATE('a',10 ) a
FROM sys.objects a
JOIN sys.objects b ON 1=1

INSERT INTO dbo.InsertTable1
SELECT * FROM dbo.TempTable1
UNION ALL
SELECT * FROM dbo.TempTable2
')

Execute satements to fill the data:

exec  dbo.InsertData

INSERT EXEC ('…') /*put exec part above*/

Cleaning statements – call it before statements executions

TRUNCATE TABLE dbo.InsertTable1
TRUNCATE TABLE dbo.InsertTable2
TRUNCATE TABLE dbo.TempTable1
TRUNCATE TABLE dbo.TempTable2
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
CHECKPOINT

Switch IO stats/processing time /actual execution plan in studio on. Or put following commands:

SET STATISTICS IO ON
SET SHOWPLAN_XML ON
SET STATISTICS TIME ON

Run following extended event with result to file.

CREATE EVENT SESSION GetRowsCount ON SERVER
ADD EVENT transaction_log 
( ACTION (sqlserver.sql_text, sqlserver.tsql_stack, sqlserver.database_id,sqlserver.username)
)
,
ADD EVENT sql_transaction 
( ACTION (sqlserver.sql_text, sqlserver.tsql_stack, sqlserver.database_id,sqlserver.username)
)
,
ADD EVENT sql_batch_completed 
( ACTION (sqlserver.sql_text, sqlserver.tsql_stack, sqlserver.database_id,sqlserver.username)
)
ADD TARGET package0.event_file(
SET filename='C:\outputfile\outputfile2.xel')
ALTER EVENT SESSION GetRowsCount ON SERVER STATE = START
ALTER EVENT SESSION GetRowsCount ON SERVER STATE = STOP

DROP EVENT SESSION GetRowsCount ON SERVER

Statement completed event – counters

Queries touched 400000 rows. 100000 for each from the two temptables and 200000 for final insert. But in case of insert we can see from batch completed event that query touched 600000 instead of 400000. Look at table below to check other counters.

SELECT CAST (EVENT_data AS XML),
       CAST (EVENT_data AS XML).value('(/event/data[@name="batch_text"])[1]','NVARCHAR(100)')     AS batch_text,
       CAST (EVENT_data AS XML).value('(/event/data[@name="duration"])[1]','NVARCHAR(100)')       AS duration,
       CAST (EVENT_data AS XML).value('(/event/data[@name="physical_reads"])[1]','NVARCHAR(100)') AS physical_reads,
       CAST (EVENT_data AS XML).value('(/event/data[@name="logical_reads"])[1]','NVARCHAR(100)') AS logical_reads,
     CASt
(EVENT_data AS
XML).value('(/event/data[@name="row_count"])[1]','NVARCHAR(100)')
AS row_count
FROM sys.fn_xe_file_target_read_file('C:\outputfile\outputfile1*.xel',
'C:\outputfile\outputfile1*.xem',
null,null) 
WHERE object_name = 'sql_batch_completed' ;

 

Counter Insert Insert – exec
Writes 1350 2194
Duration 9149345 11984022
Row count 400000 600000
Logical Reads 471973 962908
Physical Rads 662 1238

As you can see INSERT – EXEC statements consume more resources than normal insert on same set of data.

How it is possible than INSERT – EXEC generates additional 200000 row counts and finally touched 600000 rows and we made inserts with 400000 rows at total?

IO stats

Let’s check IO stats and execution plan to see the difference.

Worktable in IO stats
Worktable in IO stats

 

On picture above you can see that with INSERT-EXEC statement Worktable is created, means that insert-exec uses tempdb to store result- set before final insert. So, there we see that is an additional impact on tempdb and tempdb transaction log too.

Execution plan

Execution plan - INSERT EXEC
Execution plan – INSERT EXEC

 

The difference in execution plan is as follow. With INSERT – EXEC you get two statements. The one for the result-set of EXEC and the second one for inserting the data.

Execution plan INSERT
Execution plan INSERT

Transaction log

Another perspective is the transaction scope behaviour. Let’s get data from transaction_log event to temporary tables first.

SELECT     CAST (EVENT_data AS XML).value('(/event/data[@name="log_record_size"])[1]','INT') logsize,
CAST (EVENT_data AS XML).value('(/event/@timestamp)[1]','datetime2') timestamp,
CAST (EVENT_data AS XML).value('(/event/data[@name="transaction_start_time"])[1]','datetime2') date_time,
CAST (EVENT_data AS XML).value('(/event/data[@name="database_id"])[1]','INT')    database_id,
CAST (EVENT_data AS XML).value('(/event/data[@name="transaction_id"])[1]','INT')    transaction_id,
CAST (EVENT_data AS XML).value('(/event/action[@name="sql_text"])[1]','VARCHAR(1000)')    sql_text ,
CAST (EVENT_data AS XML).value('(/event/data[@name="operation"])[1]','VARCHAR(1000)')    operation
INTO #t1
FROM sys.fn_xe_file_target_read_file('C:\outputfile\outputfile1*.xel', 'C:\outputfile\outputfile1*.xem', null, null)
WHERE object_name = 'transaction_log' ;

SELECT CAST (EVENT_data AS XML).value('(/event/data[@name="log_record_size"])[1]','INT') logsize,
CAST (EVENT_data AS XML).value('(/event/@timestamp)[1]','datetime2') timestamp,
CAST (EVENT_data AS XML).value('(/event/data[@name="transaction_start_time"])[1]','datetime2') date_time,
CAST (EVENT_data AS XML).value('(/event/data[@name="database_id"])[1]','INT')    database_id,
CAST (EVENT_data AS XML).value('(/event/data[@name="transaction_id"])[1]','INT')    transaction_id,
CAST (EVENT_data AS XML).value('(/event/action[@name="sql_text"])[1]','VARCHAR(1000)')    sql_text ,
CAST (EVENT_data AS XML).value('(/event/data[@name="operation"])[1]','VARCHAR(1000)')    operation
INTO #t2
FROM sys.fn_xe_file_target_read_file('C:\outputfile\outputfile2*.xel', 'C:\outputfile\outputfile2*.xem', null, null)
WHERE object_name = 'transaction_log' 
;

 Comparing following query outputs, we can see that insert-exec is scoped by one transaction against multiple individual transactions with normal insert.

SELECT COUNT(1) row_count,SUM(logsize) logsize,transaction_id,database_id FROM #t1 WHERE operation = '2LOP_INSERT_ROWS' GROUP BY transaction_id ,database_id

SELECT COUNT(1) row_count,SUM(logsize) logsize,transaction_id,database_id FROM #t2 WHERE operation = '2LOP_INSERT_ROWS' GROUP BY transaction_id, database_id
Transaction log output
Transaction log output

Transaction scope

In case of INSERT-EXEC statement it should rollback all insert statements inside of EXEC statement when error occurs, because INSERT – EXEC is scoped by one transaction. In case of individual transactions in stored procedure, each insert is taken like separate transaction, so rollbacked will be insert resulting with error. Let’s try:

Change type of inserting value to INT column in second insert.

INSERT INTO dbo.TempTable2
SELECT TOP 100000 REPLICATE('a',10 ) REPLICATE('a',10 ) a
FROM sys.objects a
JOIN sys.objects b ON 1=1

Run testing queries again. As you can see in case of INSERT-EXEC statement there are no rows inserted in tables since rollback appears.

Conclusion:

While INSERT-EXEC statement still takes place in some scenarios, you should be aware of mentioned circumstances.

NOTICE: I would like to check that transaction log of temporary database was filled with connection to the worktable created by INSERT-EXEC statement. But I cannot see any insert lop insert operation trough extended events in temporary database. I just see extent allocation

SELECT * FROM #t2 WHERE database_id = 2

 If you have any idea whether worktables are logged and it is possible to trace them, please write a comment.

 

Transaction log – multi-statement table valued function

Someday I got interesting question in my job. Does SELECT query from multi-statement table valued function have impact on transaction log?

Because of result-set, there is a table variable declaration in multi-statement table valued function, we could expect the same behavior as described in post.

It would be interesting to make some tests, because I didn’t think about that simple select from function could affect transaction log of tempdb database. Results were surprising for me, follow reading this post, if you are insterested in. Just to remind, run bellow published scripts on testing environment only. I am using local installation of SQL on my laptop.

I would use similar approach as in post, In short.

  1. Clean up tempdb transaction log, and set the size of the log to the minimum
  2. Create multivalued function and select data from it
  3. Run simple select from the function
  4. In another query window run undocumented function sys.fn_dblog to get data from transaction log
  5. In another query window run allocation unit query extended to locking info (I will explain later)
  6. Check how tempdb transaction log file grew up

Firstly, set tempdb transaction log at the minimum size, to see if there is an impact on the transaction log. And call CHECKPOINT operation to clean-up transaction log. With the query bellow check list of allocation units with allocated pages in tempdb.

select request_type, request_mode,sa.total_pages,so.name, so.object_id, sp.index_id, sp.partition_id, sp.hobt_id, sa.allocation_unit_id, sa.type_desc
, l.*
FROM sys.objects so
JOIN sys.partitions sp on so.object_id = sp.object_id
JOIN sys.allocation_units sa on sa.container_id = sp.hobt_id
LEFT JOIN sys.dm_tran_locks l ON so.object_id =l.resource_associated_entity_id
WHERE total_pages >0

In my environment the query did not return any temporary tables with allocated pages.  So we can start test.

Run script bellow to create testing function.

CREATE FUNCTION [dbo].[fn_GetData]()
RETURNS
@TableTest TABLE (id INT, testdata INT
)
AS
BEGIN
 INSERT INTO @TableTest
SELECT a.object_id
     , a.object_id
FROM sys.objects a
JOIN sys.objects b ON 1=1
JOIN sys.objects c ON 1=1
JOIN sys.objects d ON 1=1
JOIN sys.objects e ON 1=1
RETURN
END
GO

Query to get data from created function.

SELECT * FROM dbo.fn_GetData() ;

In another SQL query window run query returning allocation units, listed above. After querying allocation units from tempdb we can see that there is  temporary table with allocated pages. Run query few times to see that count of allocated pages is increasing. See picture bellow.

Allocation units
Picture 01 Allocation units

Lets check tempdb transaction log file size. We can see that it was increased rapidly.

Transaction log file size
Picture 02 Transaction log file size

It is interesting that even when using simple SELECT from multi-statement table valued function, the tempdb transaction log can be affected. The Table variable declared for result-set inside the function is physically placed in tempdb. It is very similar behavior as in the post.

From sys.fn_dblog function we can see operations on Allocation unit (temporary table) we got from queries above.

Transaction log
Picture03 Transaction log

This time I had problem with DBCC PAGE to check inserted data. I was not able to access data as in post. That was reason why I extended script querying allocation units to locking info, where you can see that temporary table has exclusive lock, so other processes cannot read data from it. In my previous post where I tested impact of  Table variable on transaction log, there was BU (bulk load) lock mode set, so I was able to access data from sys.fn_dblog function.

Conclusion. Not only Table variable could impact transaction log, but multi-statement table valued function can affect it too. It is very interesting, and here raises another reason why you should be careful using these SQL features with respect to query performance. Both object types are physically created in tempdb database, not in memory. I found one difference between the table valued function and Table variable and it is lock escalation on these objects. While in case of Table variable, there was created temporary table in tempdb database with BU request mode, temporary table created by querying the multi-statement function was locked by X (exclusive lock). There would be probably more differrencies but maybe next time, and in another post.


Table variable myths

There are lots of myths regarding Table variable . You can find lots of theories that Table variable has no impact on transaction log since it is out of scope of transaction. You can find   lots of articles that Table variable is stored in memory too. I decided to do some tests to see if Table variable could have impact on transaction log and if it is physically created in tempdb .

Important note at the beginning. Try bellow mentioned queries on your test environment only! Do not run it on production. I used SQL server 2017 installed on my laptop locally. It is better that I could eliminate possible impact of other processes running on SQL Server.

In my sample I created simple Table variable filling with lots of data in while cycle. In another query window I will use undocumented sys.fn_dblog function to read what is happening in transaction log.

I sized transaction log file of temporary table at very low value – 4MB. We will see if insert to Table variable can increase its file size.

Log size
Picture 01 Log size

Let’s clean up transaction log of temporary table first. Look at script where sys.fn_dblog function is called, to see how the transaction log looks like. There are only three records returned.

CHECKPOINT 
SELECT  * FROM sys.fn_dblog(NULL, NULL)

empty tempdb transaction log
Piture 02 empty tempdb transaction log

Execute script with insert rows to Table variable and do it in neverending while cycle, like I do. The main issue was to have query still running while getting data from log. Since the query was stopped, I was not able to get needed data from the log function. So it is important that the query with table variable you would like to analyze will be still running while getting data from transaction log in another query window.

SELECT [Transaction Name], [spid], [Xact ID], f.[Page ID], f.[parent transaction id], [Transaction ID],[Transaction Name] ,AllocUnitName,*
FROM sys.fn_dblog(NULL,NULL) f WHERE [SPID]=56

Get SPID of inserting query and run above mentioned query in new query window with SPID you get.  In output you can see name of Temporary table we created as name of transaction. We are lookingn for Transaction Name wiht AllocPages.

 Get log data based on parent transaction
Picture 04 Get log data based on parent transaction

Take Parent Transaction ID from the row where column Transaction Name = AllocPages and change predicate to select records base on Transaction ID

SELECT [Transaction Name], [spid], [Xact ID], f.[Page ID], f.[parent transaction id], [Transaction ID],[Transaction Name] ,AllocUnitName,* FROM sys.fn_dblog(NULL,NULL) f WHERE [Transaction ID]='0000:00002622'

Here we can see temporary table name in Allocation unit name column. The name of Allocation unit uses the same convention like for local temporary table. It really seems that  Table variable is physically created in tempdb as Temporary table.

Result from log filtered by transaction ID
Picture 05 Result from log filtered by transaction ID

Another view could be made with Allocation units, where you can check how many pages were used in our transaction in Table variable.

SELECT so.name, so.object_id, sp.index_id, sp.partition_id, sp.hobt_id, sa.allocation_unit_id, sa.type_desc, sa.total_pages
FROM sys.objects so
JOIN sys.partitions sp on so.object_id = sp.object_id
JOIN sys.allocation_units sa on sa.container_id = sp.hobt_id

Get data from allocation units and partitions
Picture 06 Get data from allocation units and partitions

Now we verify that the temporary table is connected to our Table variable and if so, that the data are stored in tempdb. We check that data inserting to Table variable can be found in pages of tempdb data files. I took first Page ID from row with Operation of LOP_MODIFY_ROW type.  It is highlighted on picture 05 – 0005:0000bd90. First number 0005 corresponds to tempdb file ID, the second number converted from hex to dec 48528 is page ID.  Use DBCC command  bellow to get SGAM page to get info where pages with data are placed.

DBCC PAGE(2,5, 48528 , 3) WITH TABLERESULTS;

We put 2 as first parameter meaning tempdb database ID. The second  parameter 5 is database file ID, 48528 number of page, and last parameter 3 output style.  Bellow we get list of ranges where pages are allocated.

List allocated pages
Picture 07 List allocated pages

Let’s choose one page from above listed allocated pages range- highlighted. I choose page 48537 using DBCC command again.

DBCC PAGE(2,5, 48537 , 3) WITH TABLERESULTS;

Look at details from DBCC output bellow. At Field column we can find id, testdata column name defined in our Table variable. In VALUES column get data already inserted.

page detail
Picture 08 page detail

We verified that Temporary table dbo.#BC836344 is Table variable we declared in our testing queryNow look at transaction log size. We see that tempdb log file size  increased.

Transaction log size
Picture 09 Transaction log size

When stopped the query inserting data to table variable we can see that the temporary table  disappeared from Transaction log.

Transaction log
Picture 10 Transaction log

Finally we checked that

  • Table variable is actually temporary table created in  tempdb , persisted during query run.
  • we could get inserted data by accessing pages from tempdb
  • DML operation on Table variable have impact on transaction log of tempdb
  • It can even cause unexpected increase of transaction log size

What is not still clear to me, or maybe I dont see it, why it is implemented this way. Table variable is defined like out of transaction scope table by Microsoft. Why there is a need to write data to transaction log, it seems useless to me. 

With this post I proved that Table variable is actually Temporary table, created in tempdb with some specific behavior. Next time could be insteresting to compare above mentioned sample with local Temporary table to see the differences in transaction log and pages allocation. Stay tuned.