Stephen Smith's Blog

Musings on Machine Learning…

Archive for March 2015

Remembering Leonard Nimoy

with 2 comments

Introduction

The original Star Trek came out in 1966 when I was only six years old. It only ran for three seasons but had a great influence on so many people. Leonard Nimoy lived a long life and did many things; but, to many of us he is still and will always be Mr. Spock. I think I spent my entire childhood watching the original Star Trek, first the original series and then over and over again in repeats.

I’m amazed that for a series that only ran for three seasons for a total of 79 episodes (I guess television series had more episodes back in the sixties). The series covered a lot of topics and became firmly embedded in pop culture. Leonard Nimoy as an actor, even with very little make up certainly came across as rather alien (maybe more so than some of the modern CGI aliens). His character Spock introduced us to a rather rich character with logic, telepathy, the Vulcan nerve pinch and now and then getting quite emotional. After all Mr. Spock is half Vulcan and half human and the battle between his human and Vulcan sides is one of the things that makes his character quite interesting.

leonard-nimoy-flare-640

Logic

Mr. Spock is best known for his pursuit of logic, suppressing all emotion in order to become a purely logical being. This was probably greatly annoying to a lot of parents who now had to put up with kids always pointing out that most things are not logical. On the other hand I think it was a great way to promote critical thinking which is a skill that often seems quite lacking in society in general.

For people looking to fit into a more and more complicated world that is changing faster and faster, I think you could do far worse than adopting a more Vulcan logical approach to life. If you are worried about being conned or deceived often stepping back and applying some logic can be a great way to analyze things without getting caught up in the moment.

Of course one thing to remember is that logic is a way of making deductions based on a set of assumptions. If you don’t start with good assumptions then purely logical reasoning will lead to rather crazy conclusions. This is where critical thinking and science come in. If you start from good solid assumptions then logic will take you far. If you start with ridiculous assumptions then you get the logic often put forward by various politicians and others looking to manipulate you for their own purposes. A key takeaway for useful logic is to always question and refine your assumptions. Mathematics is based on this. Most Mathematics starts with a small set of assumptions or axioms and then builds a theory using logic and mathematical proofs to determine where they lead. For instance Euclid created his Geometry on a few axioms like “It is possible to draw a straight line from any point to any other point.” And from these proved many useful Geometric theorems that we all learn in school today.

Mathematics tends to be a mental exercise where a practical use isn’t necessarily the goal. Science uses mathematics, but everything has to be tested against the physical world. So if you start with a set of axioms and get a theory, but experiment shows some part is wrong, then one of your axioms is wrong and needs to be fixed. Science tends to be very demanding in this regard as Spock would often point out to Kirk.

MrSpockSays

Science Officer

Mr. Spock was both the second in command and the science officer for the Enterprise. As science officer Spock could apply his logic in a scientific context that was quite inspirational to a generation of budding scientists. He then had use of the ship’s scanners and his trusty tricorder when on landing parties.

One of the Enterprise’s primary missions was scientific discovery. The series was created around the same time as the Apollo moon program was in full gear. The assumption here was that the moon would be the first step and we would continue on to the planets of the solar system and eventually the nearby star systems. As it turns out we gave up on pursuing space and it only seems like we are starting to get interested in it again now nearly 50 years later.

star trek Tricorder 08

Computers

Spock was also the computer expert on the Enterprise and the common use of computers in the Star Trek series also planted the seed for many budding computer scientists. It’s interesting that the people creating the original series were most concerned that their computer technology was the most far out there thing on the series. They were convinced that we would have things like faster than light star ships and transporters far sooner than we would have talking computers that you could ask anything and get an instant intelligent answer. Of course now we have exceeded the computer technology in the original series, but warp drive still isn’t on the horizon.

There were quite a few episodes concerned with completely logical machines who often wanted to wipe out us illogical biological beings in some sort of pursuit of perfection. These episodes would often showcase the relationship between logical Spock and emotional Kirk. Asking the question of what makes us human, our ability to think, reason and be scientific and logical versus our emotional intuitive side? Usually coming to the conclusion that you needed both and that balance is required.

Star-Trek-The-Original-Series-Desktop-Computer-3

Summary

I’m sure we’ll see many more Vulcans and new Mr. Spocks as Hollywood reboots this series every now and then. But to me the real Mr. Spock is Leonard Nimoy and he will be missed.

Leonard_Nimoy_William_Shatner_Star_Trek_1968

Advertisements

Written by smist08

March 22, 2015 at 2:51 am

Performance and the Sage 300 Views Part 2

with 2 comments

Introduction

Last week we discussed avoiding table scans when using the Sage 300 ERP APIs. This week we are going to look at some other issues to do with updating data and with processing meta-data.

Last week I showed a cheetah running as an example of performance and speed (the fastest land animal), but this week here she is resting and getting some attention.

cheetah3

AOM/UI Info/ViewDoc

First, just if you are wondering where to find out what indexes a View supports, there are quite a few tools to determine this. Plus you can always look in SQL Management studio, but then you won’t know which index it is by our numbering scheme. Anyway ViewDoc is a good tool that comes with the SDK that gives this information. UI Info comes with System Manager and can drill down through the UI Info to get detailed View Info. Then there is the Sage 300 Application Object Model (AOM) located here. Just note that to use the AOM, you must use Internet Explorer for some obscure reason.

Updating Data

Often if you are manipulating lots of records it’s in a header/detail situation. In this case all the database operations are done when you insert or update the header. The nice things about this is that the Views know a lot about our database API and will do this in an optimal manner so you don’t need to worry about it. Similarly if you delete a header, the View will delete all attendant details for you in an efficient manner.

But suppose you want to update a bunch of records using our .Net API and want to know the most efficient way to do this. Say we want to add something to the end of every A/R Customer Name. Our easy brute force way to do this would be:

arCUS.RecordClear();
while (arCUS.Fetch(false))
{
arCUS.Fields.FieldByName(“NAMECUST”).SetValue(
arCUS.Fields.FieldByName(“NAMECUST”).Value + “A”, false);
arCUS.Update();
}

 

This works but you might find it a bit slow. We can speed it up quite a bit by bracketing the whole thing in a database transaction:

mDBLinkCmpRW.TransactionBegin();
arCUS.RecordClear();
while (arCUS.Fetch(true))
{
arCUS.Fields.FieldByName(“NAMECUST”).SetValue(
arCUS.Fields.FieldByName(“NAMECUST”).Value + “A”, false);
arCUS.Update();
}
mDBLinkCmpRW.TransactionCommit();

 

The times from the sample program (the same one as last week but with a bit added) is:

Time to update all customers: 00:00:00.087
Time to update all customers in a transaction: 00:00:00.038

So putting things in a database transaction helped. This is for Sample Data so there are only a few customers. The updated sample program is the PerformanceSamples project located here (both folder and zip file).

Database Transactions

Generally when using our API you don’t need to worry about database transactions, but occasionally like in the above example they are necessary. In the above example the first method has the side effect that each update is done in a separate transaction. That means you have the overhead of starting and committing a transaction with every record update. In the second example we start a transaction so all the records are committed as a single transaction. Strictly speaking the two examples don’t do the same things, if the first example throws an exception part way through then all the updates done up to that point will be in the database, whereas in the second example they will be discarded since the transaction will be rolled back. This difference can be quite important if there are database integrity issues to consider. Generally Sage 300 ERP uses transactions to go from one state where the database has full integrity to another. This way we can rely on database transactioning to always maintain full database integrity.

There is overhead to setting up and committing a transaction, but there are also resources used for every operation done inside a transaction. At some point the above example will start to slow down if you have too many A/R customers. Generally you might want to commit the transaction every thousand customers or so for optimal performance (but make sure you maintain database integrity along the way).

Also keep in mind that while records are updated in a transaction they will become locked from the point of update through to the end of the transaction, so updating a lot of records in a transaction will lock a lot of records and cause anyone else going to read that record to have to wait until your transaction completes. So try to keep transactions quick. Definitely don’t do any UI type operations in the middle of a transaction (like asking the user a question).

Revisioned Views

Revision List type views will store all insert/updates/deletes in memory until you call Post. Generally these are detail views and you don’t see this functionality because it’s handled by the header. But occasionally you may need to deal with one of these (like perhaps GLAFS). In this case since each Post is a transaction, you just need to be aware of how often you call it as this will have the same effect on performance as mentioned above.

Deleting

Although you can delete records as above just replacing the Update with a Delete call, there is a better way. The Views have a FilterDelete method where you pass in a browse filter and all the records that match will be deleted. This will prove to be quite a bit faster than the above.

Meta-Data

If you run RVSpy with all the View calls selected you will see a lot of meta-data calls, getting information on fields and such. Generally meta-data calls are quite fast and don’t involve going to the database. However if you really go crazy you can slow things down quite a bit. If you make everything dynamic then you could end up making lots of meta-data calls and cumulatively these slow you down a bit. Similarly using constants in things like getting fields are slightly faster than passing field names because you avoid a dictionary lookup (admittedly quite fast but not as fast as direct access). Mostly people exercise good judgement and don’t go too wild driving everything from meta-data, but we have seen some crazy cases.

Summary

Just a quick overview of some performance tips. Hopefully these all help to make your use of the Sage 300 API more efficient.

 

Performance and the Sage 300 Views Part 1

with 5 comments

Introduction

The Sage 300 ERP Views (Business Logic) give you a great deal of power to perform Accounting operations through our various APIs. However as in any programming, performance must always be taken into account. The Sage 300 ERP Views have a lot of features to help you perform operations with good performance, but like anything if they are used incorrectly, performance can be miserable.

This article is going to talk about various features and options that you can take advantage of to improve your application’s performance. As I am writing the article, it’s getting quite long, so I think I’m going to break it into two parts.

cheetah2

Measure and Test

One of the big mistakes people make when performance tuning, is to just make assumptions and changes without doing real measurements. If you have your code in a source control system, first establish a base line for how long something takes, then make you changes and re-measure the time. Only check in your changes if the time is faster, if it isn’t then you are just churning your code and potentially adding bugs. Performance is subtle and often the best ideas and intentions just make a process slower.

Multi-User versus Single-User Performance

This article is about optimizing processes for single users. Often if you want to optimize better multi-user throughput then it’s all about reducing locks and keeping resource usage down. Sometimes these goals align, i.e. 1 person doing something quicker translates to 100 people doing things quicker, sometime they are opposing, i.e. one person can do something way quicker if he takes over all available resources at the detriment to everyone else.

Read-Only versus Read-Write

You can open our database links and views either in read-write mode or read-only mode. Generally if you aren’t updating the data then you want to open in read-only mode as this makes things quite a bit faster. If you might update the data then we have to use more expensive SQL operations so that if you do update the data, the update is fast and multi-user considerations are handled. If you open a table or link read-only then we use much lighter weight SQL operations and the data is returned much quicker. Finders use this to display their data quicker.

FilterSelect/FilterFetch versus Browse/Fetch

When you Browse/Fetch you can always update or delete the record fetched. As mentioned above that can introduce extra overhead and slow things down. Making the table or link read-only will help Browse/Fetch, but perhaps a better method is to use the FilterSelect/FilterFetch methods which are better optimized for SQL Server than Browse/Fetch. The results from these can’t be updated or deleted but at the same time the access method is always light weight whether the link is open read-only or read-write.

Indexes

Sage 300 will always use an index to read data. We have a lot of code to optimize access based on available indexes. If you use the indexes provided your code will be much faster.

For example, suppose you want to know if there are any open G/L Batches. A quick bit of code to do this is:

glBCTL.Browse(“BATCHSTAT=1”, true);
bool isOpenBatch = glBCTL.GoTop();

This works pretty good on sample data, but then you go to a client, suddenly this becomes quite slow. The reason is that since BATCHSTAT isn’t part of the primary index, the GoTop basically goes looking through the Batch table until it reaches the end or finds an open batch. Since open batches are usually at the end, this tends to be sub-optimal. Practically you could speed this up by searching through the table backwards since then you would probably find one quicker, but if there are no open batches you still search the whole table. Fortunately there is a better way. The GLBCTL table has two indexes, one is its primary default index of BATCHID and the other secondary index is on BATCHSTAT and BATCHID (to make it an index without duplicates). So it makes sense to use this index:

glBCTL.Order = 1;
glBCTL.Browse(“BATCHSTAT=1”, true);
isOpenBatch = glBCTL.GoTop();

Simple adding the Order property makes this search much quicker. I included a sample program with timers and the full code. The results on sample data show the speed difference (not that it was all that slow to start with):

Time to determine if there are open batches: 00:00:00.034
Time to determine if there are open batches take 2: 00:00:00.007

The sample program is located here. Its PerformanceSamples one (folder and zip).

So generally you want to use an index that matches the fields that you are searching on as much as possible. Usually having clauses in your browse filter that uses the index segments from left to right will result in the fastest queries.

This example may look a little artificial, but once you get into the operational modules like O/E and P/O this becomes crucial. That is because the main tables like the Order Header have a uniquifier as the primary index. When you want to look something up it’s usually by something like order number and to do this efficiently you have to use an alternate index. So once you are using these modules you will be using alternate indexes a lot. In these modules also be careful that quite a few alternate indexes allow duplicates, so you might get back quite few records unexpectedly.

RVSpy/DBSpy

RVSpy and DBSpy are good tools for identifying bad behavior. The logs contain time information so you can see where the time is being used, but more often than not doing something bad for performance results in a series of operations appearing over and over in these logs. Usually scrolling to the middle of the output file is a good way to see something going awry. You can also use SQLTrace or ODBCTrace, but I find these slightly less useful.

When using RVSpy for this purpose, it helps to turn off logging to a Window (slow) and only log to a file (make sure you specify one). Further choose the View calls you want to log, usually disabling anything to do with meta-data and anything that is field level.

So if you see output like:

[5b8.7ff.37b0] CS0003: CSCCD    [01:12:06.58].Fetch(view=0x2F1047AC)
[5b8.7ff.37b0] 0 <==[01:12:06.58;t=0;ovh=0] {}
[5b8.7ff.37b0] CS0003: CSCCD    [01:12:06.58].Fetch(view=0x2F1047AC)
[5b8.7ff.37b0] 0 <==[01:12:06.59;t=0;ovh=0] {}
[5b8.7ff.37b0] CS0003: CSCCD    [01:12:06.59].Fetch(view=0x2F1047AC)
[5b8.7ff.37b0] 0 <==[01:12:06.59;t=0;ovh=0] {}
[5b8.7ff.37b0] CS0003: CSCCD    [01:12:06.59].Fetch(view=0x2F1047AC)
[5b8.7ff.37b0] 0 <==[01:12:06.60;t=0;ovh=0] {}
[5b8.7ff.37b0] CS0003: CSCCD    [01:12:06.60].Fetch(view=0x2F1047AC)
[5b8.7ff.37b0] 0 <==[01:12:06.60;t=0;ovh=0] {}

Going on for pages and pages then you have something wrong.

Avoid Table Scans

Most of this article is about avoiding table scans, but just to re-iterate table scans are bad. People are often fooled by testing on sample data. Many of the tables in sample data are quite small and it doesn’t really matter what you do. However in the real world with real customer databases things will usually be quite different. For instance sample data has 9 tax authorities, which you might think is reasonable. But in the USA where any municipal government agency can charge a sales tax, there are over 35,000 tax authorities. If you read all these (like to populate a combo-box to pick one from), then you will run very slowly and your customers will be unhappy.

Summary

Sage 300 ERP has many mechanisms to access and manipulate data efficiently. But as with anything in programming, if you use APIs without due care and attention then performance (and quality in general) will suffer.

Written by smist08

March 10, 2015 at 9:44 pm