Stephen Smith's Blog

Musings on Machine Learning…


Performance and the Sage 300 Views Part 2


Introduction

Last week we discussed avoiding table scans when using the Sage 300 ERP APIs. This week we are going to look at some other issues to do with updating data and with processing meta-data.

Last week I showed a cheetah (the fastest land animal) running as an example of performance and speed, but this week here she is resting and getting some attention.

[Photo: cheetah resting]

AOM/UI Info/ViewDoc

First, if you are wondering where to find out which indexes a View supports, there are quite a few tools that will tell you. You can always look in SQL Server Management Studio, but then you won't know which index it is by our numbering scheme. ViewDoc is a good tool that comes with the SDK and gives this information. UI Info comes with System Manager and lets you drill down through the UI information to get detailed View info. Then there is the Sage 300 Application Object Model (AOM) located here. Just note that to use the AOM, you must use Internet Explorer for some obscure reason.

Updating Data

Often if you are manipulating lots of records it's in a header/detail situation. In this case all the database operations are done when you insert or update the header. The nice thing about this is that the Views know a lot about our database API and will do this in an optimal manner, so you don't need to worry about it. Similarly, if you delete a header, the View will delete all attendant details for you in an efficient manner.

But suppose you want to update a bunch of records using our .Net API and want to know the most efficient way to do this. Say we want to add something to the end of every A/R Customer name. An easy brute force way to do this would be:

arCUS.RecordClear();
while (arCUS.Fetch(false))
{
    // Append "A" to the customer name and write the record back.
    arCUS.Fields.FieldByName("NAMECUST").SetValue(
        arCUS.Fields.FieldByName("NAMECUST").Value + "A", false);
    arCUS.Update();
}


This works but you might find it a bit slow. We can speed it up quite a bit by bracketing the whole thing in a database transaction:

mDBLinkCmpRW.TransactionBegin();
arCUS.RecordClear();
while (arCUS.Fetch(true))
{
    // Same update as before, but now every Update happens inside
    // one database transaction that is committed at the end.
    arCUS.Fields.FieldByName("NAMECUST").SetValue(
        arCUS.Fields.FieldByName("NAMECUST").Value + "A", false);
    arCUS.Update();
}
mDBLinkCmpRW.TransactionCommit();


The times from the sample program (the same one as last week but with a bit added) are:

Time to update all customers: 00:00:00.087
Time to update all customers in a transaction: 00:00:00.038

So putting things in a database transaction helped. This is for Sample Data so there are only a few customers. The updated sample program is the PerformanceSamples project located here (both folder and zip file).

Database Transactions

Generally when using our API you don't need to worry about database transactions, but occasionally, as in the above example, they are necessary. In the first example each update is done in a separate transaction, so you have the overhead of starting and committing a transaction with every record update. In the second example we start a transaction explicitly, so all the records are committed as a single transaction.

Strictly speaking, the two examples don't do the same thing: if the first example throws an exception part way through, then all the updates done up to that point will already be in the database, whereas in the second example they will be discarded because the transaction will be rolled back. This difference can be quite important if there are database integrity issues to consider. Generally Sage 300 ERP uses transactions to go from one state where the database has full integrity to another, so we can rely on database transactioning to always maintain full database integrity.
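
To make the rollback behaviour explicit, you can wrap the second example in a try/catch. This is just a minimal sketch; it assumes the database link exposes a TransactionRollback call alongside TransactionBegin and TransactionCommit (check the SDK documentation for the exact method name in your version):

mDBLinkCmpRW.TransactionBegin();
try
{
    arCUS.RecordClear();
    while (arCUS.Fetch(true))
    {
        arCUS.Fields.FieldByName("NAMECUST").SetValue(
            arCUS.Fields.FieldByName("NAMECUST").Value + "A", false);
        arCUS.Update();
    }
    // Commit only if every update succeeded.
    mDBLinkCmpRW.TransactionCommit();
}
catch
{
    // Discard everything written since TransactionBegin.
    // (Assumes a TransactionRollback method exists on the DBLink.)
    mDBLinkCmpRW.TransactionRollback();
    throw;
}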

There is overhead to setting up and committing a transaction, but there are also resources used for every operation done inside a transaction. At some point the above example will start to slow down if you have too many A/R customers. Generally you might want to commit the transaction every thousand customers or so for optimal performance (but make sure you maintain database integrity along the way).
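
As a rough sketch of that batching pattern, using the same arCUS view and mDBLinkCmpRW link as above (the batch size of 1000 is just illustrative, and here each customer update is independent, so committing part way through doesn't break integrity):

int count = 0;
mDBLinkCmpRW.TransactionBegin();
arCUS.RecordClear();
while (arCUS.Fetch(true))
{
    arCUS.Fields.FieldByName("NAMECUST").SetValue(
        arCUS.Fields.FieldByName("NAMECUST").Value + "A", false);
    arCUS.Update();
    // Commit every 1000 records to keep transactions short,
    // then start a new transaction for the next batch.
    if (++count % 1000 == 0)
    {
        mDBLinkCmpRW.TransactionCommit();
        mDBLinkCmpRW.TransactionBegin();
    }
}
// Commit whatever is left in the final partial batch.
mDBLinkCmpRW.TransactionCommit();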

Also keep in mind that records updated in a transaction are locked from the point of update through to the end of the transaction, so updating a lot of records in one transaction will lock a lot of records and force anyone else trying to read those records to wait until your transaction completes. So try to keep transactions quick. Definitely don't do any UI type operations in the middle of a transaction (like asking the user a question).

Revisioned Views

Revision List type views will store all inserts/updates/deletes in memory until you call Post. Generally these are detail views and you don't see this functionality because it's handled by the header. But occasionally you may need to deal with one of these directly (like perhaps GLAFS). In this case, since each Post is a transaction, you just need to be aware of how often you call it, as this has the same effect on performance as mentioned above.
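
As a rough sketch of batching Post calls, assuming a revision list view opened as glAFS, with Insert and Post behaving as described above (the field assignments, the hypothetical recordsToWrite count, and the batch size of 100 are all placeholders):

for (int i = 0; i < recordsToWrite; i++)
{
    glAFS.RecordClear();
    // ... set the fields for this record here ...
    glAFS.Insert();           // held in memory, not yet written
    // Each Post is one database transaction, so batching several
    // inserts per Post reduces transaction overhead.
    if ((i + 1) % 100 == 0)
        glAFS.Post();
}
glAFS.Post();                 // flush any remaining records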

Deleting

Although you can delete records as above, just replacing the Update call with a Delete call, there is a better way. The Views have a FilterDelete method where you pass in a browse filter and all the records that match will be deleted. This will prove to be quite a bit faster than the loop above.
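
A minimal sketch of FilterDelete, assuming it accepts a browse filter string (the filter value and its exact syntax below are illustrative, and the precise parameters may differ; check the View documentation in the SDK):

// Delete all matching customers in one call, letting the View
// do the work instead of fetching and deleting record by record.
arCUS.FilterDelete("NAMECUST = \"Test Customer\"");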

Meta-Data

If you run RVSpy with all the View calls selected you will see a lot of meta-data calls, getting information on fields and such. Generally meta-data calls are quite fast and don't involve going to the database. However if you really go crazy you can slow things down quite a bit. If you make everything dynamic then you could end up making lots of meta-data calls, and cumulatively these slow you down a bit. Similarly, using constants for things like getting fields is slightly faster than passing field names, because you avoid a dictionary lookup (admittedly quite fast, but not as fast as direct access). Mostly people exercise good judgement and don't go too wild driving everything from meta-data, but we have seen some crazy cases.
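
One related way to cut down on repeated name lookups, using only the calls shown earlier, is to fetch the field object once and reuse it inside the loop. This sketch assumes the object returned by FieldByName stays attached to the view as new records are fetched, which is worth verifying against the SDK documentation:

// Do the name-to-field lookup once, outside the loop.
var nameField = arCUS.Fields.FieldByName("NAMECUST");

arCUS.RecordClear();
while (arCUS.Fetch(false))
{
    // Reuse the cached field object for both the read and the write.
    nameField.SetValue(nameField.Value + "A", false);
    arCUS.Update();
}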

Summary

Just a quick overview of some performance tips. Hopefully these all help to make your use of the Sage 300 API more efficient.


Sage 300 ERP – Data Integrity


Introduction

Modern ERP systems maintain a company's full financial history for many years. People want to be confident that all that data is correct and makes sense. So how can you be confident that your database has full referential integrity, especially after years and years of operation? The Sage 300 ERP Data Integrity function is a way to validate the integrity of a database. Modern computers are much more reliable than when our Data Integrity function was originally written, but it still serves a good purpose. In this article we will explore some of the protections for data integrity in Sage 300, along with some of the possible causes of corruption.

Database Transactioning

The number one protection of data integrity in Sage 300 is database transactioning. Data is always written to the database within a database transaction, and database transactions always take the database from one state of full data integrity to the next. The database server guarantees that a transaction is either written to the physical database in its entirety or not at all; you will never see part of a transaction.

For instance, as we post a G/L batch, we post each entry as a database transaction. Since each entry in a G/L batch must be balanced, database transactioning guarantees that the G/L is always in balance and hence data integrity is maintained.

Where Do Integrity Errors Come From?

Database transactioning sounds great, and in fact with database transactioning we see very few database problems in Sage 300 ERP. But when we do get integrity problems, where do they come from?

Below is a list of some of the main causes of data integrity problems. I’m sure there are more. I’m not looking to blame anyone (including myself), just to point out the main causes I’ve seen:

  • Bugs in the program. If Sage 300 asks SQL Server to store incorrect data, it will do so in a completely reliable transactional manner. Hopefully our QA processes catch most of these and this doesn’t happen often; but, Sage 300 is a large complicated program and mistakes happen.
  • People editing database tables directly in SQL Server Enterprise Manager. For various reasons people might try to put something in the database that the program doesn’t allow, and often this leads to database corruption.
  • Third party programs that write to the Sage 300 database directly. We do a lot of data validation checking in our business logic before allowing data to be written to the database, but if this is bypassed then corruption occurs. A common one in this case is not handling currency decimal places correctly.
  • Data validation needs to be tightened. Now and again, someone has written data that we accepted as valid that wasn’t. Then we had to tighten our data validation routines. The good news here is that we’ve been doing this for a long time now.
  • Bugs in the database server. We've seen database indexes get corrupted, which can lead to further problems even after the indexes are fixed (because of other data written as a result).
  • Partial backups or restores. We’ve seen people back up the tables for each application independently and then restore them. Perhaps to try to put A/R back to yesterday. But this corrupts the database since there is often matching data that needs to be in sync in Bank, Taxes or perhaps Order Entry. Make sure you always backup and restore the database as a whole.
  • Hardware glitches. Even with CRC checking and such, strange errors can start to appear from hard disk or memory hardware failures in computers.

The Data Integrity Checker

To find these sort of problems Sage 300 ERP has a data integrity checker in its Administrative Services. The main screen looks like:

[Screenshot: the Data Integrity main screen]

You select the applications you want to check and whether you want to fix any minor errors. Since this can be a long process, for several applications you can also configure which parts of the application to check, by selecting the application and choosing its application options in a screen like:

[Screenshot: application-specific integrity check options]

The end result is a report that lists all the errors found.

What Does the Integrity Checker Look For?

So what does the Integrity Checker do? Below is a list of some of the checks that are typically made:

  • Check the integrity of each file by reading each record and calling the View Verify API, which calls the business logic to validate the record. This includes things like checking that the decimals of a money amount are correct, that the data is valid for its data type, and that foreign keys are valid.
  • For Header/Detail type relationships there are often total or summary fields in the header like the total amount of an order or the number of detail lines. The integrity checker will read through the details and add up any of these numbers to ensure they match the header.
  • Check the database for any detail records that don’t have a matching header record (orphans).
  • Each application then knows about all sorts of cross file relationships that must be maintained and the Integrity Checker for that application will validate all of these relationships.

What Does Fix Minor Errors Do?

There is the check box to fix minor errors, but what does it do? Mostly it repairs header/detail relationships by correcting any total or summary fields in header records. It can also delete orphaned detail records. But generally it doesn't attempt much, because we don't want to risk making things worse.

But it’s Slow

The big complaint about the Data Integrity checker is that it's slow. This is because it goes through every record in the database as well as checking all the cross dependencies. These days we see company databases that are hundreds of gigabytes in size. Generally the complaint is that you can't just run it as routine maintenance overnight; you tend to have to configure what you want to check and run it selectively. It's also best to run it when people aren't in the system, since it does put a fair bit of load on the system.

But this does open up an opportunity for third party developers. Companies like Tairox and Orchid offer solutions to automate the data integrity check or to run it as a service.

Summary

Even with super reliable modern databases and hardware, data integrity errors can still creep in and need to be dealt with. Just being aware they exist is half the battle. Also remember that it is extremely important to have regular full backups of your data in case of a really catastrophic failure.