Stephen Smith's Blog

Musings on Machine Learning…

SQL Server, Performance and Clustered Indexes

with 2 comments

One of our engineers Ben Lu, was doing some analysis of why SQL Server was taking along time to find some dashboard type totals for statistical display. There was an index on the data, and their weren’t that many records that met the search criteria that needed adding up. It turned out that even though it was following an index to add these up, that the records were fairly evenly distributed through the whole table. This meant that to read the 1% or so of records that were required for the sum, SQL Server was actually having to read pretty much every data table in the table usually to get one record. This problem is a bit unique to getting total statistics or sums; if the data was fed to a report writer, the index would return enough records quickly to start the report going and then SQL Server can feed the records faster than a printer or report preview windows can handle. However in this case nothing displays till all the records are processed.

It turns out that SQL Server physically will store records in the order on disk or the index that is specified as a clustered index (and only 1 is required). In Accpac we designate the primary index as a clustered index as we found long ago that this speeds up most processing (since most processing is based off the primary index). However in this case the primary index was customer number, document number; and for the query were interested in processing all unpaid documents which are spread fairly evenly over the customers (at least in the customer database we were testing with).

We found that we could speed up this calculation quite a bit by added an additional segment to the beginning of the primary key. For instance without any changes, the query was taking 24 seconds on a large customer database. Putting the document date as the first segment reduced the query to 17 seconds (since hopefully unpaid invoices are near the end). However it turned out there were quite a few old unpaind invoices. Adding an “archive” segment to the beginning where we mark older paid invoices as archived reduced the query time to 4 seconds. We could also add the unpaid flag as the first index to get the same speed up. But were are thinking that perhaps the concept of a virtual archiving would have other applications other than this one query.

Anyway now that we understand what is going on, we can decide how to best address the problem. Whether adding the paid flag to the record, adding the date or adding an archive flag. Its interesting to see how SQL Server’s performance characteristics actually affect the schema design of your database, if you want to get best perforamance. And that often these choices go beyond just adding another index.

Written by smist08

June 7, 2009 at 9:05 pm

2 Responses

Subscribe to comments with RSS.

  1. Hi Smith,

    Myself Mohan Malode from India, in my organization we are using the SAGE ACCPAC 500ERP along with the HOSPITAL MANAGEMENT SYSTEM developed by the MIMSYS, we have problem with respect to the performance every now and then, DATA size is grown to 900GB for the past 7years. We have been asking MIMSYS for archiving data solution which they are not able to provide, probably they are dependent on the SAGE ACCPAC SOURCE CODE which is there principal code.

    Is there a way to archive the data, it would be of great help to us if you can provide the link.

    Awaiting your response on this.

    Mohan Malode

    Mohan Malode

    June 26, 2017 at 12:02 pm

    • I guess it depends whether the data is in their module or one of the Sage modules. If it’s Sage Order Entry then I think there are third party solutions to archive orders. If its in MIMSYS then they would need to provide a solution. Another option might be to clear some history if it’s older than you need. Generally though old data doesn’t slow things down because there should be good indexes on everything, so Sage 300 doesn’t need to scan the whole file.


      June 26, 2017 at 11:07 pm

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: