Query compilation and optimization consume a significant amount of resources, CPU in particular, and these costs grow with the complexity of the query. Depending on the design, even a simple query can consume 20 milliseconds of CPU time, and you can see how this quickly adds up and affects overall system performance. This is where query plans come in, along with SQL Server's ability to cache compiled query plans for reuse. The process is fairly straightforward: an incoming query is checked against the plan cache; if a matching plan is found it is reused, and if not the query is parsed and compiled and the resulting plan is stored in the cache. The more plans that can be reused, the more CPU is saved by avoiding the overhead of parsing and compiling.

Plan reuse is extremely important on high-volume OLTP systems, where queries are relatively small and static; in such cases the cost of compiling a query can exceed the cost of executing it. In OLTP environments plan reuse is often based on keys rather than on the exact value or values used as predicates in the query. In OLAP or data-warehousing environments, by contrast, plan reuse usually depends on the predicates in the query. There, execution is often much more costly than compilation, so the workload may benefit greatly from ad hoc SQL that lets the query optimizer search for the optimal plan for each set of values, since the optimal plan is more likely to change with the predicates. Using parameterized queries or stored procedures for OLTP-based applications substantially increases the chance of reusing a cached plan. Parameterization can be enabled at the database level via the PARAMETERIZATION FORCED database option, or at the query level via a query hint.
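As a minimal sketch of the two approaches (the database name is a placeholder), forced parameterization can be switched on for the whole database, or individual queries can be parameterized explicitly from the client with sp_executesql so that all literal values share one cached plan:

```sql
-- Database-wide: eligible queries get their literals parameterized automatically
-- (database name is a placeholder)
ALTER DATABASE [YourDatabase] SET PARAMETERIZATION FORCED;

-- Client-side alternative: parameterize explicitly with sp_executesql so every
-- value of @id reuses the same cached plan
EXEC sp_executesql
    N'SELECT * FROM sys.databases WHERE database_id = @id',
    N'@id int',
    @id = 1;
```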
How to find query compilations:
Of course there is an "app for that"; actually, it's a DMV. When a query is compiled it is given an ID, or "signature", represented as the query_hash column in the sys.dm_exec_requests and sys.dm_exec_query_stats DMVs. In showplan XML it is represented by the QueryHash attribute. Queries that vary only by their literal values will have the same query hash. For example, the first and second queries below generate the same query hash, while the third does not:
```sql
SELECT * FROM sys.databases WHERE database_id = 1
SELECT * FROM sys.databases WHERE database_id = 12
SELECT * FROM sys.databases WHERE name = 'master'
```
The query hash is computed from the tree structure of the query produced during the compilation process. The query hash ignores whitespace, differences between an explicit column list and * in the select list, and fully qualified vs. unqualified object names. An easy way to detect applications that submit lots of ad hoc queries is to group on the sys.dm_exec_query_stats.query_hash column as follows:
```sql
select q.query_hash,
       t.text as sample_query,
       p.query_plan as sample_plan
from (select top 20 query_hash,
             count(*) as number_of_entries,
             min(sql_handle) as sample_sql_handle,
             min(plan_handle) as sample_plan_handle
      from sys.dm_exec_query_stats
      group by query_hash
      having count(*) > 1
      order by count(*) desc) as q
cross apply sys.dm_exec_sql_text(q.sample_sql_handle) as t
cross apply sys.dm_exec_query_plan(q.sample_plan_handle) as p
```
If you see number_of_entries in the hundreds or thousands, the query is a candidate for parameterization. Multiplying the CompileTime and CompileCPU values from the sample_plan by number_of_entries gives a rough estimate of the time and CPU that parameterization would save.
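As a sketch of that arithmetic (the XQuery path assumes the standard showplan XML namespace, and CompileTime / CompileCPU are reported in milliseconds), you can pull the compile cost out of a sample plan and scale it by the number of cache entries:

```sql
;with xmlnamespaces (default 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
duplicates as (
    select query_hash,
           count(*) as number_of_entries,
           min(plan_handle) as sample_plan_handle
    from sys.dm_exec_query_stats
    group by query_hash
    having count(*) > 1
)
select d.query_hash,
       d.number_of_entries,
       -- rough estimate of the cost of the redundant compiles
       d.number_of_entries *
           p.query_plan.value('(//QueryPlan/@CompileTime)[1]', 'bigint') as est_compile_time_ms,
       d.number_of_entries *
           p.query_plan.value('(//QueryPlan/@CompileCPU)[1]', 'bigint') as est_compile_cpu_ms
from duplicates as d
cross apply sys.dm_exec_query_plan(d.sample_plan_handle) as p;
```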
When a query is submitted to SQL Server, the server checks the validity and correctness of the cached query plan before it begins executing. If one of these checks fails, the batch is compiled again to produce a different query plan; this is known as a recompilation. Recompilations are generally necessary to ensure correctness, and are also performed when the optimizer determines that a better plan may be available. Because compilation is CPU intensive by nature, excessive recompilation can drive high CPU utilization on the system.
Recompilation can happen in response to various conditions, such as:
- Schema changes
- Statistics changes
- Deferred compilation
- SET option changes
- Temporary table changes
- Stored procedures created with the WITH RECOMPILE option, or queries using the OPTION (RECOMPILE) hint
Recompiles can be detected in Performance Monitor under SQL Statistics by looking at the ratio of SQL Re-Compilations/sec to Batch Requests/sec.
These are the key data counters:
- SQL Server: SQL Statistics: Batch Requests/sec
- SQL Server: SQL Statistics: SQL Compilations/sec
- SQL Server: SQL Statistics: SQL Re-Compilations/sec
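If you prefer T-SQL to Performance Monitor, the same counters are exposed through the sys.dm_os_performance_counters DMV. A sketch (note these rate counters are cumulative since server start, so two snapshots are needed to derive an actual per-second rate):

```sql
-- Cumulative values of the batch/compile/recompile counters
select object_name, counter_name, cntr_value
from sys.dm_os_performance_counters
where object_name like '%SQL Statistics%'
  and counter_name in ('Batch Requests/sec',
                       'SQL Compilations/sec',
                       'SQL Re-Compilations/sec');
```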
How much time does SQL spend optimizing:
The sys.dm_exec_query_optimizer_info DMV gives a good idea of how much time SQL Server spends optimizing. If you take two snapshots of this DMV, you can get a good feel for the time spent optimizing in the given period, in particular from the elapsed time counter:
```sql
select * from sys.dm_exec_query_optimizer_info
```
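A sketch of the two-snapshot approach (the temp table name and the five-minute window are illustrative; the value column holds an average per occurrence, so the cumulative figure is occurrence * value):

```sql
-- First snapshot
select counter, occurrence, value
into #optimizer_snapshot
from sys.dm_exec_query_optimizer_info;

waitfor delay '00:05:00';  -- measurement window

-- Diff against a second snapshot; the 'elapsed time' counter row shows
-- optimization time spent during the interval
select s2.counter,
       s2.occurrence - s1.occurrence as occurrences_in_interval,
       (s2.occurrence * s2.value) - (s1.occurrence * s1.value) as total_value_in_interval
from #optimizer_snapshot as s1
join sys.dm_exec_query_optimizer_info as s2
  on s1.counter = s2.counter;

drop table #optimizer_snapshot;
```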
Another good DMV to look at is sys.dm_exec_query_stats.
The data columns of interest are:
- total_worker_time
- plan_generation_num
- statement_start_offset
In particular, plan_generation_num indicates the number of times the plan for a query has been recompiled. The following query returns the top 25 statements that have recompiled the most:

```sql
select top 25 qs.plan_generation_num, qs.execution_count, sql_text.text
from sys.dm_exec_query_stats as qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) as sql_text
order by qs.plan_generation_num desc;
```
To reduce recompilations:
- Avoid changing SET options within stored procedures; it is better to set them at the connection level, and to ensure that SET options are not changed during the lifetime of the connection.
- To avoid recompilations that are due to changes in statistics, specify the KEEPFIXED PLAN query hint.
- To avoid recompilations due to temporary table usage you can change the temporary tables to table variables or use the KEEP PLAN query hint.
- Turning off automatic statistics updates for indexes and statistics defined on a table or indexed view prevents recompilations due to statistics changes on that object. Just make sure you replace the automatic updates with a periodic maintenance job that refreshes statistics on a set schedule.
- Use qualified object names, i.e. schema.tablename, to avoid compilations due to ambiguity between objects.
- Periodically run the Database Engine Tuning Advisor (DTA) to determine any index optimizations that may be needed; again, use it with caution and common sense.
- Check whether stored procedures were created with the WITH RECOMPILE option or use the RECOMPILE query hint. If so, you may be able to use a statement-level OPTION (RECOMPILE) hint on the particular statement rather than forcing a recompile of the entire stored procedure.
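As a sketch of the last point (the procedure and table names are hypothetical), a statement-level hint confines the recompile to the one statement that benefits from it:

```sql
create procedure dbo.usp_GetOrdersByStatus
    @status int
as
begin
    -- Only this statement recompiles on each execution; the rest of the
    -- procedure keeps its cached plan, unlike CREATE PROCEDURE ... WITH RECOMPILE
    select order_id, order_date
    from dbo.Orders
    where order_status = @status
    option (recompile);
end;
```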