Well, "Performance Optimization" in NAV & SQL is some sort of "Cornucopia of issues" – starting from Hardware Sizing, Platform Configuration, Index Tuning, SIFT Optimization, etc. … and – last but not least – the question: "How to prevent and solve blocking conflicts?"
At the "Directions EMEA 2008" Event in Paris I tried to give some feasible, practical advices to answer this question. The following should give some further explanations, in addition to the "Session Package" one could download from my website: http://www.stryk.info/directionsemea_blockdetection.zip
(Based on NAV 4.00 and SQL Server 2005)
### Changes 12.05.2008 ### >>>
Due to the great response I got on this BLOG I could further improve the scripts, thus the processes are more reliable, especially when it is about "translating" from "waitresource" into "table name". Additionally, now the relevant Index Names are shown, too.(Many thanks to Legosz (Mark) for his inspiration and support with this!)
### Changes 12.05.2008 ### <<<
General
"Locking" is actually some sort of reservation of a system-resource as a record, table, etc.. "Locking" is crucial to grant the consistency of data, and every database MUST lock at certain time at a certain extend. When we're talking about "locking problems" we actually mean "blocking problems"; means if two (or more) processes want to lock the same resource and a conflict is generated. Thus, we are encountering problems when a "lock" results in a "block". Hence, it is a matter of probability if/how/when such a conflict could occur. To reduce "blocking conflicts" it is necessary to reduce the probability of "locking conflicts" …
Lock Granularity
To minimize the risk of blocks it is necessary to keep the "lock granularity" (locking-level) as small as possible. Lock granules are for example Row, Range, Page, Index, Key, Table, etc.. The lower the granularity, the lower the probability of encountering blocks. For example, the probability that two processes at the same time want to work with the "Sales Header" table is much higher than the probability that these two processes want to work with the same record (the same document). Thus, if the "Sales Header" table would be locked, the risk of getting blocked if far higher than if just a single record was locked!
Usually SQL Server sets the optimal lock granularity, but with NAV this could be overruled (which is mostly a bad idea). It is a common misunderstanding that the "Always Rowlock" feature in NAV is required to lock on row-level. Regardless of this setting, SQL Server will ALWAYS start locking on the lowest possible granularity, e.g. row-level. If "Always Rowlock" is enabled, then NAV will add the query hint ROWLOCK which prevents "Lock Escalation": If SQL Server recognizes that the administration of multiple Row-Locks is costly (threshold is short before 1000 row-locks) it will escalate the locking level, means it will replace the many Row-Locks by e.g. few Range-Locks or Page-Locks, etc..
The ROWLOCK hint prevents this escalation and forces SQL Server to keep the Row-Lock granularity. Hence, "Always Rowlock" is indeed reducing blocking conflicts – due to the forced low granularity – but this could be at cost of the overall performance: all these locks have to be maintained and require a sufficient amount of RAM. I recommend to disable "Always Rowlock" and investigate and solve the arising blocking conflicts specifically, not wasting precious hardware resources.
With NAV before version 4.00 the lock granularity could also be set:
CREATE TABLE [$ndo$dbconfig] (config VARCHAR(1024))
GRANT SELECT ON [$ndo$dbconfig] TO [public]
GO
INSERT INTO [$ndo$dbconfig] VALUES ('DefaultLockGranularity=Yes')
(See also the "Application's Designers Guide")
The "Lock Granularity" is also influenced by the so called "Transaction Isolation Level" – the degree of how much a transaction is isolated from other transactions. With NAV the default level is READUNCOMMITED, means even data written by transactions that have not yet committed could be read from other transactions (aka "Dirty Reads").
With READUNCOMMITTED the risk of encountering blocks is minimal. The highest isolation level is SERIALIZABLE – here a transaction is completely isolated from others, hence it has the highest risk of generating blocking conflicts. With the NAV commands Record.LOCKTABLE or Record.FINDSET(TRUE) the TSQL statement SET TRANSACTION ISOLATION LEVEL is sent, thus the transaction is serialized. Further, after serialization NAV sends an UPDATELOCK hint which transforms the formerly set "Shared Locks" (S) into "Exclusive Locks" (X) (see "Books Online" for details).
Hence, serializing transactions could cause trouble, so it should be only done where it is really necessary, e.g. to avoid "Dirty Reads" (which is the case in all standard posting routines when Ledger Entries are created).
Automatic Block Detection
Before Blocks could be resolved it is necessary to know precisely the "Who is Who" about the conflicts; means "Who is blocking whom?", "When?" and "On which Resource?" – and of course: "How often does a conflict occur?". Solving blocking conflicts basically has to be done within the C/AL source code – and this task could be very complex and time consuming! Thus it is crucial to know about this "Who is Who" to fix the worse problems first, not spending time on singular events. The following describes one solution – there might be plenty – which I consider quite simple but smart:
Step 1 – Block Log Table
To collect the data about blocks we need a table ("ssi_BlockLog") for storage:
USE [Navision]
GO CREATE TABLE [dbo].[ssi_BlockLog]
(
[entry_no] BIGINT IDENTITY CONSTRAINT [ssi_BlockLog$pk_ci] PRIMARY KEY CLUSTERED,
[timestamp] DATETIME,
[waitresource] VARCHAR(128),
[table_name] VARCHAR(128) COLLATE database_default,
[index_name] VARCHAR(128) COLLATE database_default,
[waittime] BIGINT,
[lastwaittype] VARCHAR(128),
[spid] INT,
[loginame] VARCHAR(128) COLLATE database_default,
[hostname] VARCHAR(128) COLLATE database_default,
[cmd] NVARCHAR(255) COLLATE database_default,
[status] VARCHAR(128) COLLATE database_default,
[cpu] BIGINT,
[blocked by] INT,
[loginame 2] VARCHAR(128) COLLATE database_default,
[hostname 2] VARCHAR(128) COLLATE database_default,
[cmd 2] NVARCHAR(255) COLLATE database_default,
[status 2] VARCHAR(128) COLLATE database_default,
[cpu 2] BIGINT,
[db] VARCHAR(128) COLLATE database_default,
)
GO
Step 2 – Stored Procedure to save the data
A "Stored Procedure" "ssi_blockdetection" is used to extract the block-information from various DMV (Dynamic Management Views) and save it into table ssi_BlockLog (from step 1).
CREATE PROCEDURE [dbo].[ssi_blockdetection] AS
IF EXISTS (SELECT * FROM sys.dm_exec_requests WHERE [blocking_session_id] <> 0) BEGIN
SET STATISTICS IO OFF
/* Saving initial data */
INSERT INTO [dbo].[ssi_BlockLog]
(
[timestamp]
,[waitresource]
,[table_name]
,[index_name]
,[waittime]
,[lastwaittype]
,[spid]
,[loginame]
,[hostname]
,[cmd]
,[status]
,[cpu]
,[blocked by]
,[loginame 2]
,[hostname 2]
,[cmd 2]
,[status 2]
,[cpu 2]
,[db]
)
SELECT [timestamp] = GETDATE(),
s1.[wait_resource],
null,
null ,
s1.[wait_time],
s1.[last_wait_type],
session1.[session_id],
session1.[login_name],
session1.[host_name],
s1.[command],
session1.[status],
session1.[cpu_time],
[blocked by] = s1.[blocking_session_id],
[loginame 2] = session2.[login_name],
[hostname 2] = session2.[host_name],
[cmd 2] = s2.[command],
[status 2] = session2.[status],
[cpu 2] = session2.[cpu_time],
[db] = DB_NAME(s1.[database_id])
FROM sys.dm_exec_requests (NOLOCK) s1
LEFT JOIN sys.dm_exec_requests (NOLOCK) s2 ON s2.[session_id] = s1.[blocking_session_id]
LEFT JOIN sys.dm_exec_sessions (NOLOCK) session1 ON session1.[session_id] = s1.[session_id]
LEFT JOIN sys.dm_exec_sessions (NOLOCK) session2 ON session2.[session_id] = s1.[blocking_session_id]
WHERE s1.[blocking_session_id] <> 0
/* Updating data, Parsing "wait_resource */
DECLARE @waitresource VARCHAR(128), @dbid INT, @objid BIGINT, @indid INT, @partid BIGINT, @idxname VARCHAR(128)
DECLARE @i_db INT, @i_tab INT, @i_idx INT, @i_part INT
DECLARE waitresource_cur CURSOR FOR SELECT LTRIM(RTRIM([waitresource])) FROM [dbo].[ssi_BlockLog] WHERE [table_name] IS NULL FOR UPDATE
OPEN waitresource_cur
FETCH NEXT FROM waitresource_cur INTO @waitresource
WHILE @@FETCH_STATUS = 0 BEGIN
SET @i_db = 0
SET @i_tab = 0
SET @i_idx = 0
SET @i_part = 0
SET @i_db = CHARINDEX(':', @waitresource)
SET @i_tab = CHARINDEX(':', @waitresource, @i_db + 1)
IF @i_tab <> 0
SET @i_idx = CHARINDEX(':', @waitresource, @i_tab + 1)
IF @i_idx = 0 AND @i_tab <> 0
SET @i_part = CHARINDEX('(', @waitresource, @i_tab + 1)
IF @i_db <> 0 AND @i_tab <> 0
SET @dbid = LTRIM(SUBSTRING(@waitresource, @i_db + 1, @i_tab - @i_db - 1))
IF @i_tab <> 0 AND @i_idx <> 0 BEGIN
SET @objid = LTRIM(SUBSTRING(@waitresource, @i_tab + 1, @i_idx - @i_tab - 1))
UPDATE [dbo].[ssi_BlockLog] SET [table_name] = OBJECT_NAME(@objid) WHERE CURRENT OF waitresource_cur
END
IF @i_idx <> 0 BEGIN
SET @indid = LTRIM(SUBSTRING(@waitresource, @i_idx + 1, LEN(@waitresource) - @i_idx))
SELECT @idxname = "name" FROM sys.indexes (NOLOCK) WHERE "object_id" = @objid AND "index_id" = @indid
UPDATE [dbo].[ssi_BlockLog] SET [index_name] = @idxname WHERE CURRENT OF waitresource_cur
END
IF @i_tab <> 0 AND @i_part <> 0 BEGIN
SET @partid = LTRIM(SUBSTRING(@waitresource, @i_tab + 1, @i_part - @i_tab - 1))
SELECT @objid = "object_id", @indid = "index_id" FROM sys.partitions (NOLOCK) WHERE "partition_id" = @partid
UPDATE [dbo].[ssi_BlockLog] SET [table_name] = OBJECT_NAME(@objid) WHERE CURRENT OF waitresource_cur
SELECT @idxname = "name" FROM sys.indexes (NOLOCK) WHERE "object_id" = @objid AND "index_id" = @indid
UPDATE [dbo].[ssi_BlockLog] SET [index_name] = @idxname WHERE CURRENT OF waitresource_cur
END
FETCH NEXT FROM waitresource_cur INTO @waitresource
END
CLOSE waitresource_cur
DEALLOCATE waitresource_cur
END
Step 3 – Job to save the data
A SQL Server Agent Job will then execute the procedure "ssi_blockdetection" (from step 2) .
USE [msdb]
GO BEGIN TRANSACTION
DECLARE @ReturnCode INT
SELECT @ReturnCode = 0
IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'STRYK System Improvement' AND category_class=1)
BEGIN
EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N'JOB', @type=N'LOCAL', @name=N'STRYK System Improvement'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
END
DECLARE @jobId BINARY(16)
EXEC @ReturnCode = msdb.dbo.sp_add_job @job_name=N'SSI: Block Detection',
@enabled=1,
@description=N'Automatic Block-Detection by STRYK System Improvement, http://www.stryk.info',
@category_name=N'STRYK System Improvement',
@owner_login_name=N'sa', @job_id = @jobId OUTPUT
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'blockdetection',
@step_id=1,
@cmdexec_success_code=0,
@on_success_action=1,
@on_fail_action=2,
@subsystem=N'TSQL',
@command=N'EXECUTE [dbo].[ssi_blockdetection]',
@database_name=N'Navision' -- Change Database Name on demand
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
COMMIT TRANSACTION
GOTO EndSave
QuitWithRollback:
IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
EndSave:
(Remark: the TSQL which extracts the data from the system-tables might be improved, it's just an example – please feel free to improve, I would appreciate your support!)
Step 4 – Alert to monitor "Processes Blocked"
To automatically keep track about the blocks a SQL Server Agent Alert ("SSI: Block Detection") could monitor the SQL Server Performance Counter "SQLServer::General Statistics – Processes blocked"; further it responds by executing the SQL Agent Job:
USE [msdb]
GO EXEC msdb.dbo.sp_add_alert @name=N'SSI: Block Detection',
@enabled=1,
@delay_between_responses=10,
@include_event_description_in=0,
@performance_condition=N'SQLServer:General Statistics|Processes blocked||>|0',
@job_name=N'SSI: Block Detection'
GO
Now, if the number of "Processes blocked" raises above 0 – a block was encountered – this Alert is triggered and executes the SQL Server Agent Job ("SSI: Block Detection" from Step 2) which logs the block-information - the "Who is Who of Blocks"!
THAT'S IT – AUTOMATIC BLOCK DETECTION ESTABLISHED!!!
Finally we could easily analyze this data, e.g. finding out who is cause or victim of blocks, which resources are affected, how long do the blocks take, etc.. For example:
USE [Navision]
GO
SELECT [waitresource], [table_name], [index_name], [blocked_login] = [loginame], [blocking_login] = [loginame 2], [count] = COUNT([entry_no]),[max_duration] = MAX([waittime])
FROM dbo.ssi_BlockLog
--WHERE [timestamp] BETWEEN '17.04.2008 00:00' AND '17.04.2008 23:59'
GROUP BY [waitresource], [table_name], [index_name], [loginame], [loginame 2]
ORDER BY COUNT([entry_no]) DESC
GO
SELECT * FROM ssi_BlockLog ORDER BY [entry_no] DESC
GO
(Remark: these examples a created for a database called "Navision" – please change on demand)
Solving and Preventing Blocks
Well, we have to be aware that it is impossible to solve all kinds of potential blocking conflicts, as it is impossible to predict all probably – and improbably – circumstances of conflicting processes. Means: there will always be blocks, we only can reduce the problems down to a certain degree. Due to the way how certain processes are designed in NAV – e.g. posting routines – the success is limited to the number of concurrent users running these transactions in parallel. To reduce blocks these optimizations are necessary:
- SIFT Optimization
(see my BLOG http://dynamicsuser.net/blogs/stryk/archive/2007/10/28/included-columns-vs-sift-tables.aspx for further details) When SIFT data is updated (aggregation) the system has to establish tight locks to make sure only committed data is included when summing up the figures. The more aggregation levels (buckets) exist, and the higher this aggregation happens, the higher is the risk of encountering blocks. (Remark: With NAV 5.0 SP1 – Indexes Views – this is void)
- Index Optimization
If the index structure in insufficient usually transactions take much longer because "Index Scans" are performed instead of "Index Seeks". "Scanning" indexes – reading the leaf node level – is time consuming, and has a high risk of getting blocked, e.g. if a write transactions locks a record within the "to-be-scanned" range. Optimizing indexes speeds up transactions and reduced blocking conflicts.
- Process Optimization
It is important to optimize the "length"/duration of processes and transactions in NAV. The longer a transaction takes, the more locks are engaged (lock escalation), the higher the risk of causing/encountering blocks. Also, the longer transactions take, the more system resources are occupied. Shorting transactions with a COMMIT – at the right time and place! – will release locks and resources.
- Increase Processing Speed
The faster a transaction could be processed, the lower is the risk of encountering blocks. So speeding up the system in any way (CPU, RAM, Disks, Network, etc.) could be quite feasible, too. When it is about "Index Optimization" especially the design of the Clustered Index could have major impact on blocks! The following should give few examples:
Example 1:
Table 357 "Document Dimension" Clustered Index (Standard): Table ID, Document Type, Document No., Line No., Dimension Code
Looking at the physical structure of the table, the records are actually stored in this way:
|
Process A |
Table ID |
Document Type |
Document No. |
Line No. |
Dimension Code |
Process B |
|
|
36 |
Order |
100 |
0 |
Dim1 |
|
|
|
36 |
Order |
100 |
0 |
Dim2 |
|
|
|
36 |
Order |
200 |
0 |
Dim1 |
|
|
|
36 |
Order |
200 |
0 |
Dim2 |
|
|
|
36 |
… |
… |
… |
… |
|
|
|
37 |
Order |
100 |
10000 |
Dim3 |
|
|
|
37 |
Order |
100 |
20000 |
Dim4 |
|
|
|
37 |
Order |
200 |
10000 |
Dim3 |
|
|
|
37 |
Order |
200 |
20000 |
Dim4 |
|
|
|
37 |
… |
|