In one of our project, we had a lot of critical jobs and more over some of them were depending on each other and also had the dependency on to finish the jobs within a specific time otherwise that might be caused of other jobs failure. So, while we started migrating all the jobs from a very old environment to new, we had to mainly focus on the reason of job failures and performance both at the same time.
To troubleshoot SQL Server performance problem, one of the powerful tool to use is Profiler. This tool allows you us to collect number of information that is on SQL Server for analysis and troubleshooting. The problem with using Profiler is that it is a client tool and unless it is run on the server itself the connection may be lost and our trace stops. This usually happens right before the problem we are trying to troubleshoot and don’t end up collecting that valuable information we do need.
One alternative approach we found to use Profiler is to run a Server Side Trace. This process runs on the server and collects trace statistics pretty much the same way that we used to do using Profiler. We implemented this approach and used to share the custom trace information on regular basis with Customer and they are also happy with that level of good information. We can find the relevant implementation in MSDN but I thought of sharing with you and also just wanted to make it handy and crispy. Let see how it can be implemented:
Here is the trace definition we will have to execute and does not need to be run unless the SQL Servers are restarted.
-- Create a Queue
DECLARE @rc INT
DECLARE @TraceID INT
DECLARE @maxfilesize BIGINT
SET @maxfilesize = 250
EXEC @rc = sp_trace_create @TraceID OUTPUT, 2, N'C:Sql TracesTraceV1', @maxfilesize, NULL
IF (@rc != 0) GOTO ERROR
-- Set the events
DECLARE @on BIT
SET @on = 1
EXEC SP_TRACE_SETEVENT @TraceID, 16, 15, @on
EXEC SP_TRACE_SETEVENT @TraceID, 16, 8, @on
EXEC SP_TRACE_SETEVENT @TraceID, 16, 12, @on
EXEC SP_TRACE_SETEVENT @TraceID, 16, 9, @on
EXEC SP_TRACE_SETEVENT @TraceID, 16, 13, @on
EXEC SP_TRACE_SETEVENT @TraceID, 16, 6, @on
EXEC SP_TRACE_SETEVENT @TraceID, 16, 10, @on
EXEC SP_TRACE_SETEVENT @TraceID, 16, 14, @on
EXEC SP_TRACE_SETEVENT @TraceID, 16, 11, @on
EXEC SP_TRACE_SETEVENT @TraceID, 33, 31, @on
EXEC SP_TRACE_SETEVENT @TraceID, 33, 8, @on
EXEC SP_TRACE_SETEVENT @TraceID, 33, 1, @on
EXEC SP_TRACE_SETEVENT @TraceID, 33, 9, @on
EXEC SP_TRACE_SETEVENT @TraceID, 33, 6, @on
EXEC SP_TRACE_SETEVENT @TraceID, 33, 10, @on
EXEC SP_TRACE_SETEVENT @TraceID, 33, 14, @on
EXEC SP_TRACE_SETEVENT @TraceID, 33, 11, @on
EXEC SP_TRACE_SETEVENT @TraceID, 33, 12, @on
EXEC SP_TRACE_SETEVENT @TraceID, 33, 20, @on
EXEC SP_TRACE_SETEVENT @TraceID, 61, 31, @on
EXEC SP_TRACE_SETEVENT @TraceID, 61, 8, @on
EXEC SP_TRACE_SETEVENT @TraceID, 61, 1, @on
EXEC SP_TRACE_SETEVENT @TraceID, 61, 9, @on
EXEC SP_TRACE_SETEVENT @TraceID, 61, 6, @on
EXEC SP_TRACE_SETEVENT @TraceID, 61, 10, @on
EXEC SP_TRACE_SETEVENT @TraceID, 61, 14, @on
EXEC SP_TRACE_SETEVENT @TraceID, 61, 11, @on
EXEC SP_TRACE_SETEVENT @TraceID, 61, 12, @on
EXEC SP_TRACE_SETEVENT @TraceID, 20, 31, @on
EXEC SP_TRACE_SETEVENT @TraceID, 20, 8, @on
EXEC SP_TRACE_SETEVENT @TraceID, 20, 12, @on
EXEC SP_TRACE_SETEVENT @TraceID, 20, 1, @on
EXEC SP_TRACE_SETEVENT @TraceID, 20, 9, @on
EXEC SP_TRACE_SETEVENT @TraceID, 20, 6, @on
EXEC SP_TRACE_SETEVENT @TraceID, 20, 10, @on
EXEC SP_TRACE_SETEVENT @TraceID, 20, 14, @on
EXEC SP_TRACE_SETEVENT @TraceID, 20, 11, @on
EXEC SP_TRACE_SETEVENT @TraceID, 72, 8, @on
EXEC SP_TRACE_SETEVENT @TraceID, 72, 12, @on
EXEC SP_TRACE_SETEVENT @TraceID, 72, 9, @on
EXEC SP_TRACE_SETEVENT @TraceID, 72, 6, @on
EXEC SP_TRACE_SETEVENT @TraceID, 72, 10, @on
EXEC SP_TRACE_SETEVENT @TraceID, 72, 14, @on
EXEC SP_TRACE_SETEVENT @TraceID, 72, 11, @on
EXEC SP_TRACE_SETEVENT @TraceID, 71, 8, @on
EXEC SP_TRACE_SETEVENT @TraceID, 71, 12, @on
EXEC SP_TRACE_SETEVENT @TraceID, 71, 9, @on
EXEC SP_TRACE_SETEVENT @TraceID, 71, 6, @on
EXEC SP_TRACE_SETEVENT @TraceID, 71, 10, @on
EXEC SP_TRACE_SETEVENT @TraceID, 71, 14, @on
EXEC SP_TRACE_SETEVENT @TraceID, 71, 11, @on
-- Set the Filters
DECLARE @intfilter int
DECLARE @bigintfilter bigint
EXEC SP_TRACE_SETFILTER @TraceID, 1, 0, 7, N'Warning: Null value is eliminated by an aggregate or other SET operation.'
EXEC SP_TRACE_SETFILTER @TraceID, 1, 0, 7, N'Changed language setting%'
EXEC SP_TRACE_SETFILTER @TraceID, 1, 0, 7, N'Changed database context%'
EXEC SP_TRACE_SETFILTER @TraceID, 1, 0, 7, N'Login failed for user ''NT AUTHORITYSYSTEM''%'
EXEC SP_TRACE_SETFILTER @TraceID, 1, 0, 1, NULL
-- Set the trace status to start
EXEC sp_trace_setstatus @TraceID, 1
-- display trace id for future references
SELECT TraceID=@TraceID
GOTO finish
ERROR:
SELECT ErrorCode=@rc
FINISH:
GO
sp_trace_create – This procedure helps us to create the trace
sp_trace_setevent – This procedure helps us to specifie what event to capture and what column to capture
· TraceID – the ID of the trace
· EventID – the ID of the event you want to capture
· ColumnID – the ID of the column you want to capture
· On – whether you want to turn this event on or off
Event number | Event name | Description |
16 | Attention | Detects all activity by users connected to SQL Server before the trace started. |
33 | Exception | Indicates that an exception has occurred in SQL Server. |
61 | OLE DB Errors | Indicates that an OLE DB error has occurred. |
20 | Login Failed | Indicates that a login attempt to SQL Server from a client failed. |
72 | Exec Prepared SQL | ODBC, OLE DB, or DB-Library has executed a prepared Transact-SQL statement or statements. |
71 | Prepare SQL | ODBC, OLE DB, or DB-Library has prepared a Transact-SQL statement or statements for use. |
Column number | Column name | Description |
1 | TextData | Text value dependent on the event class that is captured in the trace. |
6 | NTUserName | Microsoft Windows NT® user name. |
8 | ClientHostName | Name of the client computer that originated the request. |
9 | ClientProcessID | ID assigned by the client computer to the process in which the client application is running. |
10 | ApplicationName | Name of the client application that created the connection to an instance of SQL Server. This column is populated with the values passed by the application rather than the displayed name of the program. |
11 | SQLSecurityLoginName | Server Process ID assigned by SQL Server to the process associated with the client. |
12 | SPID | Server Process ID assigned by SQL Server to the process associated with the client. |
13 | Duration | Amount of elapsed time (in milliseconds) taken by the event. This data column is not populated by the Hash Warning event. |
14 | StartTime | Time at which the event ended. This column is not populated for starting event classes, such as SQL:BatchStarting orSP:Starting. It is also not populated by the Hash Warning event. |
15 | EndTime | Time at which the event ended. This column is not populated for starting event classes, such as SQL:BatchStarting orSP:Starting. It is also not populated by the Hash Warning event. |
20 | Severity | Severity level of an exception. |
31 | Error | Error number. |
sp_trace_setfilter – This procedure specifies the filters to set. This determines whether you include or exclude data
· TraceID – the ID of the trace
· ColumnID – the ID of the column you want to set the filter on
· LogicalOperator – specifies whether this is an AND or OR operation
· ComparisonOperator – specify whether the value is equal, greater then, less the, like, etc.
· Value – the value to use for your comparison
sp_trace_setstatus –
· TraceID – the ID of the trace
· Status – stop, start or close a trace
Now I just wanted to view all the errors happened from 3 months back. Here is the SQL statement we can execute to view the error trace
SELECT HostName
, ApplicationName
, LoginName
, EventClass
, TextData
, starttime
FROM ::fn_trace_gettable('C:Sql TracesTraceV1.trc', default)
WHERE
EventClass in (162)
AND ApplicationName != 'Microsoft SQL Server'
AND ApplicationName != 'Microsoft SQL Server Management Studio'
AND ApplicationName != 'Microsoft ® Windows Script Host'
AND ApplicationName != 'Microsoft SQL Server Management Studio - Query'
AND ApplicationName != 'SQLAgent - Job Manager'
AND ApplicationName != 'SQLCMD'
AND ApplicationName != 'SQLAgent - TSQL JobStep (Job <Step Name> : Step 1)'
AND loginname not like '%tarun%'
AND TEXTDATA NOT LIKE '%Job Target Servers:%'
AND TEXTDATA NOT LIKE '%Job Steps:%'
AND TEXTDATA NOT LIKE 'Core Job Details:%'
AND TEXTDATA NOT LIKE 'backup%'
AND TEXTDATA NOT LIKE 'Need to wait longer.'
AND TEXTDATA NOT LIKE 'auto statistics internal'
AND TEXTDATA NOT LIKE '%result%'
AND TEXTDATA NOT LIKE 'no floor%'
AND TEXTDATA NOT LIKE 'The ''sp_ProcessResponse''
AND TEXTDATA NOT LIKE 'Database ''TARUN-PC%'
AND TEXTDATA NOT LIKE '%percent processed.%'
AND textdata not like 'Job Schedules:%'
AND textdata not like '%started successfully%'
AND textdata not like '%Processed%pages%'
AND starttime > GETDATE() - 90
Happy Coding,
Tarun Kumar Chatterjee
Leave a comment