Stephen Smith's Blog

Musings on Machine Learning…

Archive for June 2009

Porting Versus Integrating

with one comment

Accpac is a large ERP package containing all the core accounting modules, plus many third party ISV products, all written in our SDK. With this approach all the applications have the same look and feel. They work consistently and share many system services: VBA macros, import/export, data integrity checking, integrated security, shared signon, the same database support, etc. Accpac also offers a number of powerful interfaces to integrate with applications not written in our SDK. With this approach the other application and Accpac can exchange data, create source documents in each other’s system, and do all sorts of lookups and inquiries. We have lots of such integrations with ISV products as well as other Sage products like Sage CRM, Abra and FAS.

If you are writing an application for Accpac from scratch, it makes sense to use the SDK for all the above advantages. If you want to integrate with an existing product that runs standalone and has been developed for years, then it makes sense to go the integration route.

Then there are a number of murkier cases. For instance, if we want an accounting module from another Sage accounting product, does it make sense to integrate it, or to re-implement it in our SDK using their design as our implementation guide? If we wanted a module from Sage Timberline (like estimating, or a more advanced PJC), does it make sense to install both Accpac and Timberline, run the existing Accpac accounting modules, and then run Timberline for the one module we want added to Accpac? Or does it make more sense to start with their developed database schema and re-implement it in the Accpac SDK?

Besides the straight implementation of the database, business logic, UIs and reports, there is the further problem of possibly tight integrations with other accounting modules like Inventory Control, Order Entry, General Ledger, etc. Will we need to add functionality to these to support the new module? Will we be able to untangle the module we want from all its integrations to other modules in Timberline?

As we move to Accpac 6 and our new web-based user interface, we would want all our modules to share the same web-based interface. Perhaps if the other product can present its data using SData (our REST web services interface), then we can put Orion UIs on top of that? But that still leaves many problems with the back-end integration. Can SData be used to solve this as well, offering standard web-based interfaces between the accounting modules as well as between the web server and the browser-based UIs? Or would it be easier to just re-implement the whole thing in our SDK?

Anyway, just a few interesting questions to consider as we move forward.

Written by smist08

June 13, 2009 at 3:32 am

SQL Server, Performance and Clustered Indexes

with 2 comments

One of our engineers, Ben Lu, was doing some analysis of why SQL Server was taking a long time to compute some dashboard-type totals for statistical display. There was an index on the data, and there weren’t that many records meeting the search criteria that needed adding up. It turned out that even though SQL Server was following an index, the matching records were fairly evenly distributed through the whole table. This meant that to read the 1% or so of records required for the sum, SQL Server was actually having to read pretty much every data page in the table, usually to get just one record. This problem is somewhat unique to computing statistics or sums; if the data were fed to a report writer, the index would return enough records quickly to get the report going, and then SQL Server could feed records faster than a printer or report preview window can handle them. However, in this case nothing displays until all the records are processed.

It turns out that SQL Server physically stores records on disk in the order of whichever index is designated the clustered index (and only one per table is allowed). In Accpac we designate the primary index as the clustered index, since we found long ago that this speeds up most processing (most processing being based off the primary index). However, in this case the primary index was customer number, document number; and the query we were interested in processes all unpaid documents, which are spread fairly evenly over the customers (at least in the customer database we were testing with).
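A back-of-envelope sketch shows why reading 1% of the rows can mean reading nearly 100% of the pages. This is an illustration, not Accpac code; the page and row counts are made-up round numbers, not SQL Server's real 8 KB pages:

```python
PAGE_SIZE = 100      # rows per data page (illustrative round number)
TOTAL_ROWS = 100_000
MATCH_EVERY = 100    # roughly 1% of documents are unpaid, about one per customer

# Rows sit on disk in clustered-key order (customer number, document number),
# so an unpaid document turns up about once per page. Count the distinct
# pages the sum has to read to reach every matching row.
unpaid_rows = range(0, TOTAL_ROWS, MATCH_EVERY)
pages = len({row // PAGE_SIZE for row in unpaid_rows})
print(pages, "of", TOTAL_ROWS // PAGE_SIZE, "pages")  # 1000 of 1000 pages
```

Even though only 1,000 of the 100,000 rows match, every one of the 1,000 pages holds a match, so the whole table gets read.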

We found that we could speed up this calculation quite a bit by adding an additional segment to the beginning of the primary key. Without any changes, the query was taking 24 seconds on a large customer database. Putting the document date as the first segment reduced the query to 17 seconds (since hopefully unpaid invoices are near the end). However, it turned out there were quite a few old unpaid invoices. Adding an “archive” segment to the beginning, where we mark older paid invoices as archived, reduced the query time to 4 seconds. We could also add the unpaid flag as the first segment to get the same speedup. But we are thinking that perhaps the concept of virtual archiving would have other applications beyond this one query.
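The same back-of-envelope arithmetic (again an illustration with made-up round numbers, not Accpac code) shows what the archive segment buys: with the archive flag leading the clustered key, the live documents all sort together at the front of the table, so the sum only touches their pages:

```python
PAGE_SIZE = 100      # rows per data page (illustrative round number)
TOTAL_ROWS = 100_000
LIVE_ROWS = 1_000    # documents left live once old paid ones are flagged archived

# With (archived, customer, document number) as the clustered key, the live
# documents occupy a contiguous run of pages at the front of the table.
live_rows = range(LIVE_ROWS)
pages = len({row // PAGE_SIZE for row in live_rows})
print(pages, "of", TOTAL_ROWS // PAGE_SIZE, "pages")  # 10 of 1000 pages
```

The same 1,000 matching rows now occupy 10 contiguous pages instead of being spread over all 1,000, which is the kind of difference behind the 24-second-to-4-second improvement.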

Anyway, now that we understand what is going on, we can decide how best to address the problem: whether to add the paid flag to the key, add the date, or add an archive flag. It’s interesting to see how SQL Server’s performance characteristics actually affect the schema design of your database if you want the best performance. And often these choices go beyond just adding another index.

Written by smist08

June 7, 2009 at 9:05 pm