Optimizing SIFT and VSIFT

Well, I guess SIFT/VSIFT tuning is also some kind of standard task, and even though I previously – briefly - explained something about this I never got into the details. This is what I'd like to do now.

1. What's "SIFT"?

SIFT is for "Sum Index Flowfield Technology", hence it is a technology used to display "FlowFields" in NAV. The idea of "FlowFields" (FF) is to not save any sum-values within a table, but to calculate these sums "on-the-fly". With the old native C/SIDE Server (RIP) this was implemented with a smart algorithm to quickly calculate any sum – see this Wiki about details: http://de.wikipedia.org/wiki/Microsoft_Dynamics_NAV (German)

With SQL Server this is not possible (?) anymore, a different way to "calculate sums" was introduced: SIFT Tables. These tables were created to physically save the sum values on specific aggregation levels ("buckets") to be used with FF. Later on (since NAV 5.0 SP1) "SIFT Tables" were replaced by "SIFT Views" – the VSIFT. More about this later …

2. Why SIFT/VSIFT anyway?

Actually the best case for a table – e.g. a Ledger Entry table – would be to have no SIFT/VSIFT at all. But only from a "write-performance" perspective:
If only a record is inserted (or updated etc.) into the Ledger the writing is as quick as can be – no more additional costs.

Example (really simplified):

Entry No. 

Item No. 

Posting Date 

Quantity 

29.05.2010 

+1 

29.05.2010 

+1 

29.05.2010 

+1 

A

29.05.2010 

+1 

29.05.2010 

+1 

… 

… 

29.05.2010 

+1 

10.000 

29.05.2010 

+1 


Here about 10.000 entries are created for Item "A" on one day (29.05.2010), each time with a "Positive Adjustment" of +1. So again, from a "write" perspective is all OK. But now let's assume we need to query the total "Inventory" (= the sum of all quantities) for item "A" on 29.05.2010. In this case NAV (actually SQL Server) has to read all 10.000 records to calculate the sum. Hence, the "read-performance" could be degraded, if a huge amount of single records have to be read.

The idea of SIFT now is to avoid this degraded read performance by providing pre-calculated sums, thus, to aggregate the data. With SIFT, with every insertion into the Ledger table a sum record was updated in a related SIFT table:

Example (again, really simplified and using understandable names):

Bucket 

Item No. 

Posting Date 

Sum Quantity 

4

29.05.2010 

+1 +2 +3 +4 +5 +… +10.000


The important thing to understand is, that each Ledger record creates not a new record in the SIFT table, but only updates the Sum!
If NAV now needs to calculate the "Inventory" it does not have to read 10.000 Ledger records but only 1 SIFT record – hence we have some kind of "compression" about 10.000 : 1 .
So on the one hand we had to spend some additional "costs" when writing the Ledger (as we now additionally maintain the SIFT), but we have performance gain when reading the sums!

But then the pain starts …

3. What's the problem with SIFT?

In the example above we were aggregating the data on basis of the "Item No." and the "Posting Date" on a daily level, having a compression of 10.000 : 1 which is good. Now if we need to calculate the "Inventory" on a monthly level, we could think in a similar way (the NAV "thinks" out of the box):

Why not create another "bucket" on a "Month" aggregation; something like this?:

Example (I guess I don't have to mention this is simplified):

Bucket 

Item No. 

Posting Date 

Sum Quantity 

3

May 2010 

+1 +2 +3 +4 +5 +… +10.000 +… +300.000


(assuming 30 days, each with 10.000 Ledger entries = 30 x 10.000 = 300.000)
Well, yes, this actually would give us a "compression" of 300.000 : 1: So we spent even more costs when writing the data, but having a benefit when reading the data … Right? No, not at all …

If no "Month" bucket is available, the NAV is not that stupid to read 300.000 single Ledger records instead. If the "Day" bucket exists, it will simply read 30 "Day-SIFT" records to sum up.
So the difference actually is 1 "Month" record versus 30 "Day" records. And for the SQL Server this is not a difference at all – reading 30 records instead of 1 maybe makes a difference of several microseconds and few bytes more to read.

Hence, with enabling the "Month" aggregation we have higher costs than benefit – in total performance gets degraded!

All higher aggregation levels could be calculated from the next lower bucket available. If no SIFT bucket is available, then the lowest "bucket" is actually the table itself!

Out of the box NAV creates way too many SIFT buckets, which are actually useless – just causing high "costs" when writing data, expanding the SIFT tables dramatically, which finally ends up in degraded read- & write performance and could cause a hell of blocking and deadlocking issues ….

In our example we would have a Sum Index Key like this …

Key 

SumIndexFields 

Item No., Posting Date 

Quantity 


… which would result in these SIFT buckets (SIFTLevels):

Bucket 

SIFT Level 

Maintain 

GRAND TOTAL 

 

Item No. 

Item No., Posting Date: Year 

Item No., Posting Date: Month

Item No., Posting Date: Day 

Item No., Posting Date: Day, Entry No. 

 


Caution: the last bucket – here number 5 – always includes the whole "Primary Key" (PK). As the PK is unique an aggregation on PK level is not an aggregation. Hence, this bucket would create as many SIFT records as there are records in the Ledger table! It is disabled by default, and this must remain that way! (GRAND TOTAL is also pointless, never enable this one).

Thus, above is what we have out-of-the-box and this is what we actually need:

would result in these SIFT buckets (SIFTLevels):

Bucket 

SIFT Level 

Maintain 

GRAND TOTAL 

 

Item No. 

 

Item No., Posting Date: Year 

 

Item No., Posting Date: Month 

 

Item No., Posting Date: Day 

Item No., Posting Date: Day, Entry No.

 


Only in very few cases and only on very large tables it might be necessary to enable another bucket. But I recommend: first get rid of all those high SIFT buckets, leaving only one. Then see what happens, and if some problems arise, these could be detected and fixed very quick.

Once a SIFT System (NAV 2.65 to 4.0SP3/5.0) has been optimized that way you will experience a dramatic performance boost – disabling those SIFT buckets removes thousands or millions of records from the SIFT tables which finally increases read- and write-performance and gets rid of many blocking and deadlocking issues!

4. SIFT is dead – long live VSIFT

Since NAV 5.0 SP1 VSIFT replaces SIFT. Means: the old SIFT tables do not exist anymore, now there are "Indexed Views" to aggregate the sums on-the-fly (more or less). So what's the difference?

A "View" in SQL Server could be seen as a pre-defined SELECT query – and indeed it is some kind of SELECT SUM() from the Ledger table. Thus, a View does not contain any data, it just displays it from another source (e.g. the Ledger table) while doing the aggregation instantly. To improve the performance of these VSIFT Views a "Clustered Index" is created which finally "materializes" the sum-data (I don't want to explain this here further on, please refer to my other articles about "Indexes").

When this VSIFT View is summing the data it groups it by the defined "Key". In our example it could look somewhat like this:

CREATE VIEW "Ledger$VSIFT"
WITH SCHEMABINDING
AS
SELECT "Item No_" AS "Item No_",
       "Posting Date" AS "Posting Date",
       SUM("Quantity") AS "SUM$Quantity"
FROM "Ledger"
GROUP BY "Item No_", "Posting Date"

Now take a look at this: the data is grouped on exactly the same level as our previous SIFT bucket number 4!

VSIFT is maintaining the sum-data on the same level like an optimized SIFT structure!

This finally proofs that our SIFT optimization was correct. And this is actually the reason for experiencing a remarkable performance boost when upgrading from SIFT (non-optimized) to VSIFT – for the same reasons as described above. On the other hand this also means, that replacing optimized SIFT with VSIFT does not necessarily give any performance improvement, in few cases even the opposite could be the case …

The major difference is actually within the "writing" of the sum data: with SIFT this was accomplished by some SQL-site Trigger (nothing to do with NAV triggers) which was actually a pain, taking too long time, causing blocking etc.. With VSIFT only the related "Clustered Index" has to be updated which is tremendously faster and causes less blocks etc..

5. Zero-Sum Records

Example: Imagine these postings – a "Positive Adjustment" and a "Negative Adjustment" in the Ledger table:

Example (really simplified):

Entry No. 

Item No. 

Posting Date 

Quantity 

29.05.2010 

+1 

29.05.2010 

-1


With SIFT this would give such a result …

Bucket 

Item No. 

Posting Date 

Sum Quantity 

29.05.2010 


… and also the VSIFT would query a total of "zero" (0). But a "zero" is void when creating a sum: 1 + 0 + 2 is exactly the same as 1 + 2. Hence, these "Zero-Sum" records – physical with SIFT, virtual with VSIFT – have no logical sense, thus they are just expanding the SIFT tables or VSIFT views unnecessarily, causing degraded performance.
As with SIFT these records are real physical ones we could delete them with some maintenance tasks; e.g. some stored procedure provided here: http://www.mibuso.com/dlinfo.asp?FileID=812
(caution: some database versions have a bug which prohibits any SIFT maintenance – corrupt versions are 60, 61, 62, 80, 81)

With VSIFT there are no more physical records to delete. Hence, in some cases it could happen that VSIFT have many more records than optimized & maintained SIFT, which could mean that reading from such "ineffective" VSIFT could be worse than with the old SIFT …

With VSIFT it is recommended to find out these "ineffective" VSIFT, means those which mostly contain "Zero-Sums". This could be done e.g. with a little SQL query.

Example:

SELECT COUNT(*) AS "TotalRecs" FROM "Ledger$VSIFT"
SELECT COUNT(*) AS "ZeroSumsRecs" FROM "Ledger$VSIFT"
WHERE "SUM$Quantity" = 0

(Have in mind that you have to check for those records where all SUM fields equal zero! To thoroughly analyze a whole database you might need some tools: http://www.stryk.info/english/toolbox.html )

Once you've identified the VSIFTs which e.g. contain more than 80% "Zero-Sum-Records" you could get rid of them by setting the "MaintainSIFTIndex" to FALSE on the corresponding NAV key.

6. VSIFT on Primary Keys

As explained with some SIFT examples above, it is rather pointless to aggregate sums on a PK level as this actually is no aggregation – this will always create as many records in the SIFT/VSIFT as the source tables contains itself, hence it doesn't make a difference whether the data is queried from the SIFT/VSIFT or the table itself; in this case the SIFT/VSIFT is just additional "costs".

With SIFT the PK bucket is disabled by default anyway; but as VSIFT group on "Key" level there exist such unnecessary VSIFTs.
To make a long story short: on PK the "MaintainSIFTIndex" should be set to FALSE.

7. Improving SIFT/VSIFT Performance

Still there is a chance that you encounter "issues" with SIFT or VSIFT; actually you might see queries on SIFT/VSIFT which take too long and cause a huge number of "Reads" (use SQL Profiler to find out, see my other articles about "Indexes"). Reasons could be missing SIFT "buckets" or VSIFT at all. If a "SIFT bucket" (some higher aggregation level) is just missing, then the NAV "standard" way would be just to (re-)enable the designated "bucket"; which is not necessarily the smartest thing to do. Also, with VSIFT there are no more "buckets" to adjust.

A smart way to fix problematic SIFT/VSIFT queries could be to add optimized SQL Server site Indexes – indexes we unfortunately cannot create from NAV site.

Some example:

(Primary) Key

SumIndexFields

MaintainSIFTIndex

Item No.

Quantity

TRUE FALSE


Out-of-the-box this would create a stupid PK VSIFT so "MaintainSIFTIndex" was disabled. Still it might be necessary to support SUM queries (now on the table itself):

CREATE INDEX "MyIndex01" ON "Table"
("Item No_")
INCLUDE
("Quantity")

Also, sometimes not all "Key" fields of a VSIFT are filtered which could result in "bad queries"; something like …

SELECT SUM("SUM$Quantity")
FROM "Ledger$VSIFT"
WHERE "Item No_" = 'A'

(refers to the examples above; no filter on "Posting Date")

… might be improved by this index:

CREATE INDEX "MyIndex01" ON "Ledger$VSIFT"
("Item No_")
INCLUDE
("SUM$Quantity")

And the same actually applies to the SIFT table queries, too. But as you see, now we're leaving the "SIFT/VSIFT" matters, slipping into "Index Optimization" – please refer to these about further information: http://dynamicsuser.net/blogs/stryk/archive/tags/Indexes/default.aspx

In the "worst case" you need to create another "Key" to create a VSIFT aggregating on a higher level. For example:

This was our original Sum-Index-Key, aggregating on "Item No." and "Posting Date":

Key 

SumIndexFields 

MaintainSQLIndex

MaintainSIFTIndex

Item No., Posting Date 

Quantity 

TRUE

TRUE


To install a VSIFT only on "Item No." level a new "Key" is created:

Key 

SumIndexFields 

MaintainSQLIndex

MaintainSIFTIndex

Item No.

Quantity 

FALSE

TRUE


Hence, the new VSIFT will "GROUP BY [Item No_]". As the SQL Index (nothing to do with the SIFT/VSIFT) is already covered by the original Index "MaintainSQLIndex" is set to FALSE.

 

Conclusion:

The old SIFT are a real pain – out-of-the-box. Standard SIFT is a major reason for overall poor performance. So optimizing SIFT will give a tremendous performance boost – the same boost you might experience when upgrading from Standard-SIFT to VSIFT (for the same reasons). If you have already an optimized SIFT system, then do not expect any dramatic improvements just by changing to VSIFT! There should be more reasons for an upgrade than just SIFT vs. VSIFT.
After all, SIFT is "dead" – NAV 4.0 is not supported anymore (even though "old" does not mean it's bad), all newer versions are running VSIFT, and meanwhile this works stabile and without major problems – and potential issues could be fixed easily (more or less).

As always, your feedback is very much appreciated!

Posted: May 29, 2010 11:20 by Jörg A. Stryk | with 4 comment(s) |
Bookmark and Share

Comments

# re: Optimizing SIFT and VSIFT

Hi,

Where can we find details of these bugs mentioned...

(caution: some database versions  have a bug which prohibits any SIFT maintenance – corrupt versions are 60, 61, 62, 80, 81)

Thanks,

Steve

Saturday, May 29, 2010 12:36 PM by Steve

# re: Optimizing SIFT and VSIFT

Nice article, but I often experience that I have to create a bunch of dummy keys in NAV (with MaintainSQLIndex=No) to create the VSIFTS that fits the data and its usage. I really don't see why Microsoft didn't keep the VSIFTLevelsToMaintain. :-(

Saturday, May 29, 2010 2:53 PM by Peter

# re: Optimizing SIFT and VSIFT

@Steve:

The affected versions are all (?) NAV 4.0 Versions before SP3 U6 Build 25143 and all (?) NAV 5.0 Versions before U1 Build 25359.

You can see your database-version with this little SQL query:

SELECT "databaseversionno" FROM "$ndo$dbproperty"

Check out Waldo's Build No. Lists:

dynamicsuser.net/.../default.aspx

The details about the problem and the patches you'll find in "Partner Source". The bug is actually, that if you delete "zero-sum-records" (no matter if with some procedure or NAV's "Table Optimizer") the FlowFields might display corrupt data - the SUMs taken from the SIFT are wrong.

Saturday, May 29, 2010 3:52 PM by Jörg A. Stryk

# re: Optimizing SIFT and VSIFT

@Peter:

Yep, that's a real pity. I try to avoid adding keys by primarily adding SQL site indexes using INCLUDEd columns, which mostly helps ... but not always ...

Saturday, May 29, 2010 3:55 PM by Jörg A. Stryk

Leave a Comment

(required) 
(required) 
(optional)
(required)