NAV/SQL Performance - My Two Cents

Jörg Stryk (MVP - Dynamics NAV) about Performance Optimization & Troubleshooting with MS Dynamics NAV and MS SQL Server

Hi all,

Puuuuhh, that was quite a rush … originally my idea about the TA (Technical Airlift) was just to attend, but unfortunately one of the speakers got sick, and fortunately MS asked me if I could step in …
<big grin> Of course I was available! </big grin>
The pity was that I had just half a day to prepare my session, but according to the great feedback I have received so far, it was very much appreciated!

Edit 10.02.2010: Finally - the Video (German): https://training.partner.microsoft.com/learning/app/management/LMS_ActDetails.aspx?UserMode=0&ActivityId=569708

As at "Directions EMEA 2008" some time ago, I followed my idea of NOT presenting the usual "general-performance-issue-bla-bla" (well, I just talked briefly about the most important things) BUT focusing on a specific topic – this time:

Optimizing Indexes

  • Understanding Indexes
  • SQL Server Profiling (Basics)
  • Designing Indexes
  • Maintaining Indexes

 

I apologize, but currently I don't have the time to BLOG the complete session content, so please consider this a "preliminary posting".

OK, here we go – as promised, here is the full session content – except the "basics" – plus a little bit more; actually I had prepared more demos and had more things to say, but I ran out of time … again, I did not have much time to prepare … next time I'll skip the general stuff completely :-)

Understanding Indexes

First of all it is necessary to distinguish between a "Key" and an "Index" – in native C/SIDE we're always just talking about "Keys", but with SQL Server it is crucial to understand that these are quite different things:

A "Key" is a constraint. A sort of rule for a record or field.
The most famous "Key" is the "Primary Key" (PK) – here the constraint (rule) is, that every PK value must be unique. There are other "Keys" in SQL Server, e.g. "Foreign Keys" (FK), which do not apply to NAV out-of-the-box:
In NAV there are just logical "Table Relations", actually defined in table 2000000041 "Field", but these are not real constraints (from a SQL Server perspective).
With enabling the NAV database property "Maintain Relations" the logical "Table Relations" are converted into real "Foreign Key" relations on SQL Server. This could have advantages regarding data integrity, but the downside is, that all these FK have to be checked with all write transactions – and this could degrade performance.
I recommend NOT to use the "Maintain Relations" features as this affects the whole database. If real "FK" are required, then those should be deployed manually only where necessary.
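For illustration, deploying such a single FK manually could look like this (just a sketch – the constraint name is made up, and the tables simply reflect the standard NAV "Table Relation" between "Sales Line" and "Sales Header"):

ALTER TABLE "dbo"."CRONUS 403$Sales Line"
ADD CONSTRAINT "FK_ssi_SalesLine_SalesHeader"
FOREIGN KEY ("Document Type", "Document No_")
REFERENCES "dbo"."CRONUS 403$Sales Header" ("Document Type", "No_")
GO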

An "Index" is an internal structure (Balanced Tree, B-Tree) to help to retrieve data.

While in native C/SIDE "Keys"/"Indexes" are used for various purposes (Sorting, Data Retrieval, SIFT), SQL Server uses Indexes exclusively for Data Retrieval.

An example:

Imagine a book, e.g. the "Yellow Pages". If you want to look up some "plumber" companies, you would first look up the "index" in that book – probably an alphabetical listing telling you that the companies of the branches starting with "P" are on page 182. Once you've learned the page number you can quickly switch to that page to look up the relevant companies.
Technically spoken, you performed an "Index Seek" and then a "Lookup" to retrieve the data.
Well, SQL Server uses Indexes in quite the same way.

Seeking the right index and looking up the data was actually quite fast, as you had to read only a few pages – technically: your logical and physical I/O was minimal.

Now tear out the index from this book (DROP INDEX :-)). If you want to look up the "Plumbers" now, you don't have this helping structure any more. Hence, you have to read page by page until you finally reach the "Plumbers".
Technically: you had to "Scan" the table; you read many pages, and your logical & physical I/O was dramatically higher than before – your performance was severely degraded.

In the end you still found your desired result in – more or less – reasonable time. But this was only possible because the data – the branches, companies and addresses – is arranged in a specific sorting order: the records are sorted alphabetically.
Technically, the data is "clustered" – we have a "Clustered Index".
If there was no "Clustered Index", all data would be stored totally mixed up: tear out all pages from the book and throw them wildly into the room. To look up all "Plumbers" now, you would have to read quite a lot of pages (probably all of them), and as the pages are now spread all over, your workload – logical & physical I/O – is maximal, taking a long time.

This little story should just highlight the importance of having a "Clustered Index" (CI) and optimized "Non-Clustered Indexes" (NCI) to improve query performance, and how both index types work together.

Some more definitions:

Clustered Index (CI):

The CI is a B-Tree built from the indexed values. The sorting order of the CI defines the sorting order of the records within the table. The "Leaf Node" level of the CI is actually the table data: it contains all fields/columns. Each table can have only one CI.
If no CI is available, the table is called a "Heap".

Non-Clustered Index (NCI):

The NCI is a B-Tree of indexed values. If additional fields have to be retrieved – other columns than the indexed ones – then the data has to be looked up from the table, i.e. from the CI. To perform this "Key Lookup" the "Leaf Node" level of each NCI implicitly contains the CI fields. If the table is a "Heap", the NCI "Leaf Nodes" instead contain the binary addresses of the data pages/records.
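For example, whether a table has a CI or is a "Heap" can be checked quickly with the catalog view sys.indexes (just a sketch; an "index_id" of 0 means "Heap", 1 means CI, greater than 1 means NCI):

SELECT object_name("object_id") AS "table", "index_id", "name", "type_desc"
FROM sys.indexes
WHERE "object_id" = object_id('CRONUS 403$G_L Entry')
ORDER BY "index_id"
GO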

In NAV C/SIDE "Keys" are automatically transformed into SQL "Indexes":

C/SIDE                                        SQL Server
-------------------------------------------   ------------------------------
Primary Key                                   Primary Key / Clustered Index
Secondary Key                                 Non-Clustered Index (UNIQUE)
Table Relation ("Maintain Relations" TRUE)    Foreign Key


With this standard "translation" the NCI are flagged UNIQUE, which is granted by adding all PK fields to the NCI (if not already part of the Index).

For example: table 17 "G/L Entry"

PK:    "Entry No."                             →    CI:    "Entry No."
SK:    "G/L Account No.", "Posting Date"       →    NCI:   "G/L Account No.", "Posting Date", "Entry No." (UNIQUE)
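In T-SQL the generated NCI would look roughly like this (illustrative only; NAV names these indexes "$1", "$2" etc. according to the key number):

CREATE UNIQUE NONCLUSTERED INDEX "$2" ON "dbo"."CRONUS 403$G_L Entry"
("G_L Account No_", "Posting Date", "Entry No_")
GO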

If the UNIQUEness is only guaranteed by the appended PK fields, it is actually pointless: as the PK fields are included, every NCI value is unique by definition – how should an NCI value ever become non-unique? Granting UNIQUEness is the very purpose of a PK …
Thus, the UNIQUE flag just wastes CPU power, and on top of that the NCIs are (IMHO) unnecessarily expanded.

The usual way of creating the CI and NCIs can be overruled in NAV by using the "Key" properties "Clustered" (to cluster the table by another index instead of the PK) and "SQL Index". With "SQL Index" a different index structure can be defined.
But caution: using the "SQL Index" property in a wrong way will cause severe performance issues (as shown in NAV 5.0)!

Microsoft explicitly warns about this property, recommending not to use it at all (see various BLOGs from the "NAV Development Team" or the "NAV Sustained Engineering Team"). Do not rearrange the sorting of the index fields differently from the "Key" setting, and do not remove any fields – even though you might think about selectivity etc..
In this case, SQL Theory and NAV Reality are not the same …
The only 98% failsafe way to use this property is to copy the "Key" setting to the "SQL Index" 1:1. This is not a full index optimization; it will just remove the UNIQUE flag and the trailing PK fields (I call that "Basic Streamlining").
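For the "G/L Entry" example above, such a "streamlined" NCI would roughly correspond to this (illustrative; no UNIQUE flag, no appended PK field – remember that the NCI "Leaf Nodes" implicitly contain the CI fields anyway):

CREATE NONCLUSTERED INDEX "$2" ON "dbo"."CRONUS 403$G_L Entry"
("G_L Account No_", "Posting Date")
GO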

 

SQL Profiling (Basics)

Hence, index optimization just "out of the box" is somewhat risky, so before optimizing an index it is necessary to identify the problem first. The perfect tool for this is the SQL Server Profiler.
Recommended setup:

Events:
  • Stored Procedures: RPC:Completed, SP:Completed, SP:StmtCompleted
  • TSQL: SQL:BatchCompleted, SQL:StmtCompleted

Columns:
  SPID, TextData, Reads, Writes, CPU, Duration, Start Time, End Time, LoginName, HostName, Application Name, Database Name, EventClass

Filters:
  • Application Name: Not Like %SQL%
  • Reads: Greater than or equal 1000
  • Duration: Greater than or equal 30

(change filters on demand)

[you'll find some templates in the attachment]

The important part is the filter on "Reads" – see the "Yellow Pages" example above: huge "Reads" mostly indicate "Index Scans", which should be fixed. The trace should always be saved to a file (TRC); do not save to a table, as this would put even more load on the server. Also, a TRC file is portable, e.g. you could trace at the customer's site and investigate the TRC in the back office.

Once the TRC has been created it can be analyzed. Of course, it would be rather pointless to read this TRC line by line (maybe thousands of lines) to find out which queries might require an index fix. With a few lines of TSQL such a TRC can be "parsed", showing which queries consume how many "Reads" and how often (!) they were executed. This makes it possible to separate singular events (e.g. due to wrong user interaction/filtering) from persistent/periodic problems.
The script "TA2009_Example_TraceCheck.sql" (attachment) shows how this could be accomplished (loading the TRC into a table and investigating it).

If the "Bad Queries" could be identified they could be investigated a little bit more detailed:

With NAV we need to record two "Events":
a) "SP:StmtCompleted" – this will contain the query statement (SELECT etc.), but the WHERE clause will just include placeholders (@P1 etc.) instead of the real filter values.

Example:

SELECT * FROM "NAV_TA2009"."dbo"."CRONUS 403$Document Dimension" WITH (READUNCOMMITTED)

WHERE (("Table ID"=@P1 OR "Table ID"=@P2))

AND (("Document Type"=@P3)) AND (("Document No_"=@P4))

ORDER BY "Table ID","Document Type","Document No_","Line No_","Dimension Code"

 

b) "SP: RPC Completed" – this includes the procedure call where the real filter values are handed over.

Example:

exec sp_cursorexecute 1073743220,@p2 output,@p3 output,@p4 output,@p5 output,36,37,1,'100004859'

 

To rerun the query, both pieces of information can be combined into one ad-hoc query, replacing the placeholders with the actual filter values.

Example:

SELECT * FROM "NAV_TA2009"."dbo"."CRONUS 403$Document Dimension" WITH (READUNCOMMITTED)

WHERE (("Table ID"=36 OR "Table ID"=37))

AND (("Document Type"=1)) AND (("Document No_"='100004859'))

ORDER BY "Table ID","Document Type","Document No_","Line No_","Dimension Code"

 

This ad-hoc query can be executed to verify the "Logical & Physical I/O", the "CPU time" (optional) and the "Query Execution Plan" (QEP).
The I/O will be shown if SET STATISTICS IO ON was executed before, the CPU time with SET STATISTICS TIME ON, and the actual QEP is displayed in SSMS with "Ctrl+M".
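So a typical test in SSMS could look like this:

SET STATISTICS IO ON   -- show logical & physical reads per table
SET STATISTICS TIME ON -- show CPU & elapsed times
GO
-- now execute the ad-hoc query from above here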

Example QEP:

 

QEPs have to be read from right to left. Here the various operations can be seen, especially which index operations were performed. In this example it was a "Clustered Index Seek" (a "mouse over" shows further details).

OK, I have to admit, this was actually a bad example as there isn't much to fix – the idea was just to introduce a few basics.

 

Designing Indexes

Here is another – better – example:

SELECT SUM("Quantity"),SUM("Invoiced Quantity"),SUM("Remaining Quantity")

FROM "NAV_TA2009"."dbo"."CRONUS 403$Item Ledger Entry"

WITH (READUNCOMMITTED)

WHERE (("Item No_"='1906-S')) AND (("Variant Code"='')) AND (("Location Code"='YELLOW'))

 

This query takes 98 Reads (keep in mind that this is just a "CRONUS" fake, not a real problem) and causes a "Clustered Index Scan" (a sort of table scan, so to speak):

 

The query supposedly originates from some CALCSUMS command, and obviously a related SIFT or VSIFT structure is missing; hence, the sums have to be calculated directly from the "Item Ledger Entry" table.
Anyway, the standard NAV approach would be to create another "Key" with the required "SumIndexFields" etc., but actually this would be quite a fuss, causing additional workload (SIFT table or VSIFT view).

The smarter way would be to apply an optimized index directly with TSQL!
Some basic rules for creating indexes:

  • NEVER name customized SQL-side indexes "$[Number]" – use your own naming convention, distinct from NAV's
  • Equality Columns (=) first, then In-Equality (<, >, <>, LIKE) Columns
  • Highly selective fields first
  • Caution: Regard NAV "Key" context = ORDER BY clause
    • "Fast Forward Cursors" are optimizing for the WHERE CLAUSE
    • "Dynamic Cursors" are optimizing for the ORDER BY
  • Again: "SQL Theory" and "NAV Reality" are not always the same!

>> 

Excursion: Something about Selectivity

The more different values of a field are available, the higher is the “selectivity”. High “selectivity” means small “density”. A high “density” (= low “selectivity”) means that many records have the same value of a certain field.
Example 1: A Primary Key of “Entry No.” (Integer) has the smallest “density” possible – every record has a different “Entry No.” value, every value is unique. Thus, “Entry No.” has a high “selectivity”.
Example 2: A Key Field “Open” (Boolean) has a pretty high “density”: 50% TRUE, 50% FALSE; the field “Open” has a very low “selectivity”.
The higher the “selectivity” the better (I skip further details here). You could check the “density” with DBCC SHOW_STATISTICS (see also the attached “TA2009_StatisticsSelectivity.sql”).

More examples:

CREATE INDEX ssiTA_1 ON "dbo"."CRONUS 403$Sales Header"
("Document Type", "No_")
GO


This index has “Document Type” (Integer) at the beginning; let’s verify the “density”:

DBCC SHOW_STATISTICS('CRONUS 403$Sales Header', 'ssiTA_1')
GO

“Document Type” is supposed to be rather non-selective – there are not many different values, thus each value has a pretty high density. In this case the “density” of 0.33 means that there are actually just three different values in use, hence each value applies to 33.3% of all records.

Now let’s change the index (actually, I create a new one) according to the “selectivity rules” and check the “density”:

CREATE INDEX ssiTA_2 ON "dbo"."CRONUS 403$Sales Header"
("No_", "Document Type")
GO

DBCC SHOW_STATISTICS('CRONUS 403$Sales Header', 'ssiTA_2')
GO

Now that’s quite different! The “density” is much smaller than before! (Please regard that the total density of both fields combined is always the same, of course.)
This is because there are 5702 different values of “No.”. In total the table has 5703 records, so almost every record has a different “No.” value.

So why is this important, if the combined density is always the same?
SQL Server always tries to retrieve as few records as possible. Which index is optimal for an operation, it learns from these “Index Statistics”. If such a statistic tells the server right at the beginning that quite a lot of records might be retrieved – e.g. 33% is quite a lot – it could decide not to use the index at all. In the worst case this leads to “(Clustered) Index Scan” operations, taking a long time, causing high Reads etc..
Or in other terms: highly selective fields at the beginning of an index make this index “more attractive”.

<< 

Back to our original example ...

In our example we don't have a "Cursor Issue" – there's not even an ORDER BY clause. Thus, in this case I can strictly follow the SQL rules to create this index:

CREATE INDEX ssiTA_1 ON "CRONUS 403$Item Ledger Entry"
("Item No_", "Location Code", "Variant Code")
GO

Rerunning the query shows this QEP:

 

OK, the new index (ssiTA_1) is used (mouse over), and the "Reads" have been remarkably reduced: from 98 down to 2 (!!!).
The QEP shows that an "Index Seek" is performed on the new index, but also that a "Key Lookup" was performed: the new index just contains the fields from the WHERE clause, but not the sum fields ("Quantity", etc.). Hence, to calculate the sums, the values have to be taken from the "Clustered Index" (Leaf Nodes = table data).
Well, so far we are actually compatible with NAV, meaning we could have created this index in NAV as well … but with SQL we can also use additional features – here a second, even better index:

CREATE INDEX ssiTA_2 ON "CRONUS 403$Item Ledger Entry"
("Item No_", "Location Code", "Variant Code")
INCLUDE
("Quantity", "Invoiced Quantity", "Remaining Quantity")
GO
 

Here the "B-Tree" is still build with the same fields ("Item No_", "Location Code", "Variant Code"). But now the sum-fields ("Quantity", "Invoiced Quantity", "Remaining Quantity") are added as so called "Included Columns" – these fields are attached to the "Leaf Node" of the NCI.
Here the new QEP:

 

The new index (ssiTA_2) is used ("Index Seek"), and the "Key Lookup" operation has vanished: it is no longer necessary to read the sum fields from the CI, as the data is available "at the end" of the NCI! Hence, the operation is even faster than before!

Using "Included Columns" could have great advantages with NAV, especially with SIFT and VSIFT management.
But caution: excessive usage of INCLUDE will create very large index objects, expand the db size and probably degrading write performance. Handle with care (beware of the "Tuning Advisor" & Co.)!

So far we have learned that our new index works well in SSMS. But ad-hoc queries actually perform quite differently than the RPC queries coming from NAV (Cursors & Co.); thus, even though our tests were successful, this is no guarantee that the index also works in "real life" with NAV. We can find out by using SQL Server's "Index Usage" statistics (sys.dm_db_index_usage_stats):

SELECT object_name(IDX."object_id") AS "object", IDX."name",

IDXUSG."user_seeks", IDXUSG."user_scans", IDXUSG."user_lookups", IDXUSG."user_updates"

FROM sys.dm_db_index_usage_stats IDXUSG

JOIN sys.indexes IDX ON IDX."object_id" = IDXUSG."object_id" AND IDX."index_id" = IDXUSG."index_id"

WHERE IDXUSG."database_id" = db_id('NAV_TA2009') – change db name

AND IDX."name" LIKE 'ssiTA%' -- regard your naming convention

If we could see "User Seeks" (or somewhat worse "User Scans") we have confirmation our index was indeed needed. If not, well, then we have to continue our "Quest for Optimized Queries" …

Finally: on the one hand we have to create indexes to solve certain query issues – improving read performance – but on the other hand this also decreases write performance, as every index means additional load (aka "Cost per Record").
Technically this is true. But I still encourage you to add indexes whenever needed – don't be shy: for every write transaction in NAV there might be thousands of reads. For example, a "G/L Entry" is inserted just once, maybe updated a few times (if at all!), but probably read thousands of times. IMHO read performance is more important than write performance (yes, yes, I know … it depends …).

And after all, with the same DMV "sys.dm_db_index_usage_stats" it is possible to determine which indexes are unused; those could be disabled/deactivated in NAV. Hence, it is possible to give & take indexes, to finally have an optimized system, balanced between read & write performance.
(Have in mind that the DMV data is reset whenever the SQL Server service is restarted. On a freshly started SQL Server actually all indexes appear "unused" :-))
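For example, such an "unused index" check could look like this (a sketch; run it in the NAV database – the criterion "maintained but never read" is my own example choice, and indexes which do not appear in the DMV at all have not been touched either):

SELECT object_name(IDX."object_id") AS "object", IDX."name", IDXUSG."user_updates"
FROM sys.dm_db_index_usage_stats IDXUSG
JOIN sys.indexes IDX ON IDX."object_id" = IDXUSG."object_id" AND IDX."index_id" = IDXUSG."index_id"
WHERE IDXUSG."database_id" = db_id()
AND IDX."index_id" > 1 -- NCI only
AND IDXUSG."user_seeks" = 0 AND IDXUSG."user_scans" = 0 AND IDXUSG."user_lookups" = 0
AND IDXUSG."user_updates" > 0 -- maintained, but never read
ORDER BY IDXUSG."user_updates" DESC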

 

Maintaining Indexes

First of all it is crucial to maintain the "Index Statistics" – see my BLOG for details: http://dynamicsuser.net/blogs/stryk/archive/2008/08/01/index-statistics.aspx
The "bottom line" is to disable the "Auto. Statistics" options and to execute a maintenance job (once per day) instead:

EXEC sp_updatestats
GO
EXEC sp_createstats 'indexonly'
GO

Indexes will fragment over time – data is inserted, modified and deleted, and "Page Splits" are performed.

Page Splits:
By default SQL Server fills the data pages to almost 100%. This keeps the indexes small, providing best read performance. If data has to be inserted into a full page, SQL Server has to split this page: about 50% of the data is copied into a newly allocated page, releasing free space in the original page so the new data can be inserted.
These Page Splits degrade write performance and increase the index fragmentation (more about that below).

Indexes can be defragmented in many ways, usually via "Maintenance Plans". Here two tasks are available:

  • Index Rebuild:
    Complete defragmentation of the indexes (B-Tree incl. Leaf Nodes). Best results, but highest impact on the system regarding Log Usage, I/O, Duration and Locking.
  • Index Reorganization:
    Defragments only the Leaf Nodes. Moderate results – the B-Tree stays fragmented – but less impact on the system.

Well, if this is the choice, then I recommend the REBUILD method. But there's an even smarter option available: using the DMV "sys.dm_db_index_physical_stats" the actual "fragmentation degree" can be determined!
With this it is possible to perform a more advanced defragmentation, following the "best practice" scale:

  • Logical Fragmentation less than 10%           → acceptable, no action required (skip)
  • Logical Fragmentation between 10% and 30%     → medium fragmentation, perform REORGANIZE
  • Logical Fragmentation greater than 30%        → heavy fragmentation, perform REBUILD

The related entry in "Books Online" provide a great example script to accomplish this – the "TA2009_IndexDefragmentation.sql" (attachment).

To reduce the amount of "Page Splits" it is feasible to apply an "Index Fill-Factor" smaller than the standard 100%, leaving some free space for faster data insertion. With the corresponding MP task "Index Rebuild" this amount of free space could be defined; best practice recommendation is to apply 5% to 10% (= Fill-Factor of 95% or 90%).
The attached script does not take care about FF, if required the FILLFACTOR clause has to be added (e.g. WITH FILLFACTOR = 90). Have in mind that those available standard features cannot apply specific FF according to the NAV requirements – sequential indexes (e.g. "Entry No.") might need a different FF (100%) than other indexes (FF on basis of relative index growth).
This could be performed with specific tools – check out "The NAV/SQL Performance Toolbox".
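For a single index, such a manual rebuild with 10% free space could look like this (a sketch; SQL Server 2005+ syntax, using the index from the example above):

ALTER INDEX "ssiTA_2" ON "dbo"."CRONUS 403$Item Ledger Entry"
REBUILD WITH (FILLFACTOR = 90)
GO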

 

Sooooo, that was another long story … I'm aware that there might be other things to add, or some things to discuss … but I hope this gives you some deeper insight into the area of Indexes and QEPs.

Please find attached the presentation slides and the scripts & templates I used during the session – the "content" will follow ASAP (hey, the session was taped on video; once it's available I'll link to it too, of course :-)).
Additional resources: "The NAV/SQL Performance Field Guide"

(As always: use the stuff at your OWN RISK; NO GUARANTEE, NO WARRANTY, NO SUPPORT)

Your feedback is very much appreciated!

Technical Airlift 2009.zip