This session is driven by Chad Aberlen, one of the Microsoft Escalation Engineers. Most of his expertise comes from the GP side, but he assured this was not a GP session, but a general Dynamics session. Let's see .
He wanted to give an overview on how to approach performance tuning, key microsoft sql performance counters and detecting three common performance issues. I have seen similar session quite a few times, but this is another agenda for sure.
Why tune? Mostly because of the fact: time is money, so you can't lose user productivity. Also the cost of hardware is a reason for tuning. Furthermore, hardware may only temporarily resolve issues and higher return on investment as reasons to tune.
But where should we start? You should monitor the system. Try to isolate the problems. Is it only one bad query, or is it only one peace of functionality ... or may be a general problem that is going on? Hardware issue or software issue? So, what is causing the high pressure?
What are the hardware components that are restricting performance. We all know that:
How do we measure the disk subsystem:
Then, he went over some RAID configurations. Putting log files on RAID5, we all know that as well, he doesn't like that (and again, there were quite some people). Writing is slow in RAID5. Off course, it depends on the number of disks, but in general, it's too slow. RAID10 is best. Good speed, and very good write speed. Therefor, this is what he recommends:
avoid as generally not enough drives
acceptiable if low percentage of writes (which is seldom the case in NAV, so let's try to avoid!)
avoid because of high cost of write I/O
best performance and use if RAID 1 shows pressure
Master / MSDB
Not really new things here, but it's always nice that it's confirmed by someone that hasn't been working on NAV at all .
Now, what could be causing the I/O cost on the disks? This could be caused by memory pressure, or excessive paging (also due to too low memory) or poorly designed queries (scans on large tables, missing key indexes, ...), high write I/O's to a RAID 5 volume, high usage during peak times, ... . So it could be hardware, or software.
He continued with a demo. He showed the performance on his laptop, performing a query that took some time, and generated a scan on a very large table, which caused a huge disk activity.
Next, how do we measure the CPU?
So, what causes CPU bottelnecks? Compiles/recompiles of execution plans, hash joins, aggregate functions, data sorting, disk I/O activity (paging), other applications/services, screen savers, ... .
Time for another demo. He showed the above counters in the performance monitor. Executes a script with a mix of creating a table and inserting data into it. It resulted in a pretty busy CPU. The second time, the script was somewhat changed (with the same result at the end): first creating all tables, then inserting the data into the tables. The performance was remarkably better!
What about the memory?
He somewhat skipped the network counters, because it was nog really important to him. Too bad, because I was interested in it.
SQL Server specific counters:
He really doesn't like deadlocks. When he sees a deadlock, he wants to know what caused, what peace of functionality, when, who, ... . It can result in bad data, so let's try to avoid it.
Next, a demo about SQL blocking. He just simulated someone changing data and therefor locking a set of records, while someone else also tried to change the same record. The "Lock wait time" showed how long someone was bloced as one pike of (in this case) 60 seconds.
These should be the top three performance issues
To conclude, he talked about an unsupported inhouse tool that multiple teams within microsoft were using to log some of the performance issues ... Which is nice to see as a "benchmark" or tool that can point you in a right direction. Take a look at http://www.codeplex.com/PAL for more information. Use and abuse. Again, this tool is not supported ... It's just a tip.
That's all, folks .