Archive for the ‘Software Architecture’ Category
At Sage Summit 2015 we introduced our new Web UIs for Sage 300. I’ve blogged a bit on the various user facing parts and a bit on the technologies used, but I haven’t had a chance yet to get into the internals of how they work. We’ve released the Web UIs for the Financial Modules and will be releasing the Operations Modules early in 2016. With these we will be releasing our SDK as well. Over time I’ll blog quite a bit about the details of all the components, but first we need to lay out the major building blocks. Below is the architecture diagram I’ve shared before, which covers the main Web UIs and includes a block for RESTful Web APIs; there are some other components that need mentioning as well.
I’m going to start at the bottom of this diagram and work up. However, there are also some components that aren’t shown here that we will bring in along the way.
Within our framework we provide a lot of base classes that you can inherit from, so generally the only code you need to write is where something differs from the standard protocols or behaviors. When looking for framework components to help you, besides looking for services to call, look for classes that do 90% of what you need and that you can extend (inherit from). ASP.Net MVC also makes extensive use of “convention over configuration”: for a lot of things, if you follow the standards it saves you a lot of work and makes a number of things just work magically. We follow the same approach, so we can find and use your components as long as you follow the standards as outlined.
The Sage 300 Views are still the same. We didn’t even update the C View template for supporting this new framework. All the regular Sage 300 Business Logic (Views) is accessed through our .Net API.
The Business Repositories convert our View API into something more like an ASP.Net MVC API. The Business Repositories are similar to the definitions that we generate from ViewDoc for different systems to make accessing all the various View fields more natural in the given framework. They provide support for all the usual View UI related items like presentation lists and field masks. Generally this is where all the View programming needed by the UI is placed, which makes it available to everyone, including UIs, standard components and Web APIs.
The Models are true ASP.Net MVC Models. They are typically built on a given Business Repository. The reason for the separation between the Business Repositories and the Models is that the Business Repositories are more general purpose and can easily be consumed by UI elements with more generic Models, such as Finders or Import/Export.
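As a rough illustration of this layering, here is a minimal sketch of a repository with a model built on top of it. All class and member names here are invented for the example; they are not the actual Sage 300 SDK classes.

```csharp
// Hypothetical names only -- not the actual Sage 300 SDK classes.
// Stand-in for the framework base class that supplies common behaviour.
public abstract class BusinessRepositoryBase { }

// The repository wraps the View calls (presentation lists, masks, etc.)
// so UIs, Finders, Import/Export and Web APIs can all share them.
public class ArCustomerRepository : BusinessRepositoryBase
{
    public ArCustomerModel GetByKey(string customerNumber)
    {
        // In the real framework this would drive the Sage 300 Views
        // through the .Net API; here we just return a stub record.
        return new ArCustomerModel { CustomerNumber = customerNumber, Name = "Sample" };
    }
}

// A plain ASP.Net MVC model built on top of the repository.
public class ArCustomerModel
{
    public string CustomerNumber { get; set; }
    public string Name { get; set; }
}
```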
State-full Versus Stateless
There are two basic ways that our UIs operate: stateless and state-full. Ideally everything should be stateless, but this becomes somewhat impractical when dealing with large accounting documents. In stateless operation, each RPC call is completely independent and can be processed with no knowledge of what’s gone before. For state-full operation, a document is built up in the Sage 300 Business Logic as the user enters it, which is more similar to how the VB UIs work. Generally only the bigger document entry UIs are state-full now. Use of the two is similar and most of the details are hidden by which base class you inherit from (a state-full or a stateless one), but you must be aware of the context as you do your programming.
Often in VB programming, you handle events from the user and as a result of a UI event you do a certain amount of View calls (perhaps say 20 of these). This sort of thing also happens in the Web UI, but we certainly can’t do 20 or so RPC calls to the server as a result of a single UI interaction. This is where services come in. When such a UI event is triggered, a single RPC call is made to the server; the controller dispatches this to the service, which calls the correct method in the Business Repository, which then makes the 20 or so View calls to do the real work. The service then marshals any returned data in the response. Of course this all happens asynchronously. Generally, if you are wondering where to move processing code from a VB UI, it goes into one of the Business Repositories, with a service wired up to allow it to be called from the browser. Services are also used to initiate long running processes, which we’ll talk about later.
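To make the flow concrete, here is a heavily simplified sketch of a single AJAX request travelling from controller to service to repository. The names are hypothetical, not the shipped framework classes, and the repository body just stands in for the many View calls.

```csharp
using System.Web.Mvc;

// One AJAX call from the browser lands here...
public class OrderController : Controller
{
    private readonly IOrderService _service = new OrderService();

    [HttpPost]
    public JsonResult RecalculateTotals(string orderNumber)
    {
        // ...and the controller just dispatches to the service.
        var total = _service.RecalculateTotals(orderNumber);
        return Json(total);
    }
}

public interface IOrderService
{
    decimal RecalculateTotals(string orderNumber);
}

public class OrderService : IOrderService
{
    public decimal RecalculateTotals(string orderNumber)
    {
        // The service calls the business repository, which is where the
        // "20 or so" View calls actually happen, all within one round trip.
        var repository = new OrderRepository();
        return repository.RecalculateTotals(orderNumber);
    }
}

public class OrderRepository
{
    public decimal RecalculateTotals(string orderNumber)
    {
        return 0m;   // placeholder for the many Sage 300 View calls
    }
}
```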
We don’t want to run long running processes like I/C Day End or Posting Batches directly from IIS, since even with full multi-threading support this can adversely affect the responsiveness of the UIs for other users. So we run a Windows Service, and when a long running process is initiated we ask the Windows Service to run it, and just return from IIS. The UI can then poll periodically for meter status updates and notify the user when it’s done. There is a lot of standard framework support for doing this; you just need to set up a service to initiate and monitor the process. Generally we do this for any Sage 300 process that pops up a progress meter.
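A minimal sketch of what that looks like from the UI side, again with made-up names standing in for the framework’s own plumbing: one call starts the process on the Windows Service, and a second is polled for progress.

```csharp
using System.Web.Mvc;

// Hypothetical sketch: kick off a long-running process and poll its progress.
public class DayEndController : Controller
{
    [HttpPost]
    public JsonResult Start()
    {
        // Ask the Windows Service to run the process and return immediately,
        // so the IIS worker thread is not tied up for the duration.
        string processId = LongRunningProcessClient.Start("IC Day End");
        return Json(new { processId });
    }

    [HttpGet]
    public JsonResult Progress(string processId)
    {
        // The browser calls this periodically to update the progress meter.
        int percent = LongRunningProcessClient.GetPercentComplete(processId);
        return Json(new { percent, done = percent >= 100 }, JsonRequestBehavior.AllowGet);
    }
}

// Stand-in for the framework component that talks to the Windows Service.
public static class LongRunningProcessClient
{
    public static string Start(string name) { return System.Guid.NewGuid().ToString(); }
    public static int GetPercentComplete(string id) { return 100; }
}
```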
KPIs are really just like any other UI. They just have a different size and are run in a different place on the Home Page. There are a few standards to follow to match the other KPIs and a few UI controls that aren’t used anywhere else, but are otherwise just standard UIs.
Although this component isn’t included in the original Sage 300 2016 release, there is full support for exposing RESTful Web Service APIs. These leverage your Model code to expose the model as an API. Then there is some support for hiding inappropriate methods and adding a bit more attribute information to help users.
Like previous versions of Sage 300, we use Crystal Reports to print our reports. We basically use the same report API as we used in VB. We then have support to call this API from our UI framework and to render the report in the Crystal HTML Viewer. The rest of the report UI is just the same as any other UI, gathering up data from input fields. Printing might also have to initiate a process on the Windows Service first, like generating the data for an aging report, before running the report itself.
Other System Components
There are system components for things like Finders and Import/Export. These just need to be set up and wired in correctly. Then there are components like the editable grid, which requires a bit more support in your UI, and higher level components like the Optional Fields control that is built on the grid. There are a few other controls and wrappers for things like dates, fiscal year/periods, masked edits, drop-down lists based on presentation strings, etc.
To be useful, these KPIs can involve quite sophisticated calculations to display relevant information. However, users need their home page to start extremely quickly so they can get on with their work. This article describes various techniques to calculate and present this information quickly, starting with easy, straightforward approaches and progressing to more sophisticated methods that utilize the power of the cloud.
The simplest way to program such a KPI is to leverage any existing calculations (or business logic) in the application and use that to retrieve the data. In the case of Sage 300 ERP this involves using the business logic Views which we’ve discussed in quite a few blog posts.
This usually gives a quick way to get something working, but often doesn’t exactly match what is required or is a bit slow to display.
Last week, we looked a bit at using the Sage 300 ERP .Net API to do a general SQL Query which could be used to optimize calculating a KPI. In this case you could construct a SQL statement to do exactly what you need and optimize it nicely in SQL Management Studio. In some cases this will be much faster than going through the Sage 300 Views; in other cases it won’t be, if the business logic already does the same thing.
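For instance, a KPI that boils down to a single aggregate can often be computed with one tuned SQL statement. The sketch below uses plain ADO.NET rather than the Sage 300 .Net API, and the table and column names are illustrative examples only.

```csharp
using System;
using System.Data.SqlClient;

public static class KpiQueries
{
    // Illustrative only: total outstanding receivables as one tuned SQL statement.
    // Table and column names are examples, not necessarily the exact Sage 300 schema.
    public static decimal TotalOutstandingReceivables(string connectionString)
    {
        const string sql = "SELECT SUM(AMTDUEHC) FROM AROBL WHERE SWPAID = 0";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            connection.Open();
            object result = command.ExecuteScalar();
            return result == null || result == DBNull.Value ? 0m : Convert.ToDecimal(result);
        }
    }
}
```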
Often KPIs are just sums or consolidations of lots of data. You could maintain the KPIs as you generate the data; so for each new batch posted, the KPI values are stored in another table and incrementally updated. Often KPIs are generated from statistics that are maintained as other operations are run. This is a good optimization approach, but it lacks flexibility since customizing it means changing the business logic. Plus, the more data that needs to be updated during posting, the slower posting becomes, annoying the person doing the posting.
As a next step you could cache the calculated values: once the user has executed a KPI once today, its value is cached, so if they exit the program and re-enter it the KPIs can be drawn quickly by retrieving the values from the cache, with no SQL or other calculations required.
For a web application like the Sage 300 Portal, rather than caching the data retrieved from the database or calculated, usually you would cache the JSON or XML returned from the web service call that asked for the data. Then when the web page for the KPI makes a request to the server, the cache just supplies the data to return to the browser; no formatting, calculation or anything else is required.
Often a cache that lasts one day is good enough. There can be a manual refresh button to force a recalculation, but mostly the user only waits for the calculation once a day and after that things are instant.
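A minimal sketch of that kind of once-a-day cache using .NET’s MemoryCache; the key scheme and method names are invented for illustration.

```csharp
using System;
using System.Runtime.Caching;

public static class KpiCache
{
    private static readonly MemoryCache Cache = MemoryCache.Default;

    // Returns the cached JSON for a KPI, recalculating at most once per day
    // (or when the user presses the manual refresh button).
    public static string GetKpiJson(string kpiKey, Func<string> calculate, bool forceRefresh = false)
    {
        if (!forceRefresh && Cache.Get(kpiKey) is string cached)
            return cached;

        string json = calculate();                        // the slow SQL/View work
        Cache.Set(kpiKey, json, DateTimeOffset.Now.AddDays(1));
        return json;
    }
}
```

The first request of the day pays for the calculation; every request after that is an in-memory read until the entry expires or the user forces a refresh.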
In the cloud, it’s quite easy to create virtual machines to run computations for you. It’s also quite easy to take advantage of various Big Data databases for storing large amounts of data (these are often referred to as NoSQL databases).
Cloud applications usually don’t calculate things when you ask for them. For instance when you do a Google search, it doesn’t really search anything, it looks up your search in a Big Data database, basically doing a database read that returns the HTML to display. The searching is actually done in the background by agents (or spiders) that are always running, searching the web and adding the data to the Big Data database.
In the cloud it’s pretty common to have lots of running processes that are just calculating things on the off chance someone will ask for them.
So in the above example there could be a process running at night that checks each user’s KPI settings, performs the calculations and puts the data in the cache, so that the user gets the data instantly first thing in the morning and, unless they hit the manual refresh button, never waits for any calculations to be performed.
That helps things quite a bit, but the user still needs to wait for a SQL query or calculation if they change the settings for their KPI or hit the manual refresh button. A sample KPI configuration screen from Sage 300 looks like this:
As you can see from this example there are quite a few different configuration options, but in some sense not a truly ridiculous number.
I’ve mentioned “Big Data” a few times in this article, but so far all we’ve talked about is caching a few bits of data, and the number of items being cached won’t be very large. Now suppose we calculate all possible values for this setup screen, use the distributed computing power of the cloud to do the calculations and then store all the possibilities in a “Big Data” database. This is much more data than we talked about previously, but we are barely scratching the surface of what these databases are meant to handle.
We are now using the core functionality of the Big Data database: doing reads based on the inputs and returning the JSON (or XML or HTML) to display in the widget. As our cloud grows and we add more and more customers, the number of these reads will increase greatly, but the Big Data database will just scale out, using more and more servers to perform the work based on the current workload.
Then you can let these calculations run all the time, so the values keep getting updated, and even the refresh button (if you bother to keep it) will just get a new value from the Big Data cache. A SQL query or other calculation is never triggered by a user action.
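A sketch of the precompute-everything idea: enumerate every combination of the KPI’s settings, do the expensive calculation for each with the cloud’s spare compute, and write the ready-to-serve JSON into a key-value style store. The store interface and the configuration options shown are stand-ins, not any particular product or the actual Sage 300 settings.

```csharp
using System.Collections.Generic;

// Stand-in for whatever NoSQL / Big Data store is used.
public interface IKpiResultStore
{
    void Put(string key, string json);
}

public static class KpiPrecalculator
{
    // Enumerate every combination of the configuration options and store the
    // rendered JSON, so a user request becomes a single key lookup.
    public static void PrecalculateAll(IKpiResultStore store,
                                       IEnumerable<string> agingPeriods,
                                       IEnumerable<string> currencies)
    {
        foreach (var period in agingPeriods)
        foreach (var currency in currencies)
        {
            string key = $"ar-aging|{period}|{currency}";
            string json = CalculateKpiJson(period, currency);   // the expensive part
            store.Put(key, json);
        }
    }

    private static string CalculateKpiJson(string period, string currency)
    {
        return "{}";   // placeholder for the real calculation
    }
}
```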
This is the spider/read model. Another approach would be to sync the application’s SQL database to a NoSQL database that then calculates the KPIs using MapReduce jobs. This approach tends to be quite inflexible, but it can work if the syncing and transformation of the database solves a large number of queries at once. Creating such a database in a manner that lets the MapReduce queries all run fast is a rather nontrivial undertaking and runs the risk that in the end the MapReduce jobs take too long to calculate. The two methods could also be combined: phase one would be to sync into the NoSQL database, then the spider processes would calculate the caches by running the KPI calculations as MapReduce jobs.
This is all a lot of work and a lot of setup, but in the cloud the customer doesn’t need to worry about any of it, only the vendor does, and with modern PaaS deployments this can all be automated and scaled easily once it’s set up correctly (which is a fair amount of work).
There are lots of techniques to produce and calculate business KPIs quickly. All these techniques are great, but if you have a cloud solution and you want its opening page to display in less than a second, you need more. This is where the power of the cloud can come in to pre-calculate everything so you never need to wait.
We’ve been working on an interesting POC recently that involved Google-like search. After evaluating a few alternatives we chose Elastic Search. Search is an interesting topic, often associated with Big Data, NoSQL and all sorts of other interesting technologies. In this article I’m going to look at a few interesting aspects of Elastic Search and how you might use it.
Elastic Search is an open source search product based on Apache Lucene. It’s all written in Java and installing it is just a matter of copying it to a directory and then as long as you already have Java installed, it’s ready to go. An alternative to Elastic Search is Apache Solr which is also based on Lucene. Both are quite good, but we preferred Elastic Search since it seemed to have added quite a bit of functionality beyond what Solr offers.
Elastic Search is basically a way of searching JSON documents. It has a RESTful API and you use this to add JSON documents to be indexed and searched. Sounds simple, but how is this helpful for business applications? There is a plugin that allows you to connect to databases via JDBC and set these up to be imported and indexed on a regular schedule. This allows you to perform Google-like searches on your database, only it isn’t really searching your database, it’s searching the data it extracted from your database.
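To give a flavour of the RESTful API, here is a hedged sketch that indexes one JSON document and then searches for it over HTTP. The index, type and field names are made up, and the exact URL shape varies a little between Elastic Search versions.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public static class ElasticSearchExample
{
    public static async Task RunAsync()
    {
        using (var http = new HttpClient { BaseAddress = new Uri("http://localhost:9200/") })
        {
            // Add (index) one JSON document; Elastic Search analyzes and stores it.
            var document = new StringContent(
                "{ \"name\": \"Acme Widgets\", \"city\": \"Vancouver\" }",
                Encoding.UTF8, "application/json");
            await http.PutAsync("customers/customer/1", document);

            // Search the indexed data with a simple match query.
            var query = new StringContent(
                "{ \"query\": { \"match\": { \"name\": \"acme\" } } }",
                Encoding.UTF8, "application/json");
            HttpResponseMessage response = await http.PostAsync("customers/_search", query);
            Console.WriteLine(await response.Content.ReadAsStringAsync());
        }
    }
}
```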
This is similar to how web search services like Google work. Google dispatches thousands of web crawlers that spend their time crawling around the Internet and adding anything they find to Google’s search database. This is the real search process. When you do a search from Google’s web site it really just does a read on its database (it actually breaks up what you are searching for and does a bunch of reads). The bottom line though is that when you search, it is really just doing a direct read on its database and this is why it’s so fast. You can see why this is Big Data since this database contains the results for every possible search query.
This is quite different from a relational database, where you search with SQL and the search goes out and rifles through the database to get the results. In a SQL database putting the data into the database is quite fast (sometimes) and then reading or fetching it can be quite slow. In NoSQL or Big Data type databases much more time goes into adding the data, so that certain predefined queries can retrieve what they need instantly. This means the database has to be designed ahead of time to optimize these queries, and then often inserting the data takes much longer (often because this is where the searching really happens).
Elastic Search is designed to scale out from the beginning, it automatically does most of the work for starting and creating clusters. It makes adding nodes with processing and databases really easy so you can easily expand Elastic Search to handle your growing needs. This is why you find Elastic Search as the engine behind so many large Internet sites like GitHub, StumbleUpon and Stack Overflow. Certainly a big part of Elastic Search’s popularity is how easy it is to deploy, scale out, monitor and maintain. Certainly much easier than deploying something based on Hadoop.
When you index your data it’s fed through a set of analyzers which do things like convert everything to lower case, split sentences into individual words, reduce words to their roots (walking -> walk, Steve’s -> Steve), deal with special characters and other language peculiarities, etc. Elastic Search has a large set of configurable analyzers so you can tune your search results based on knowledge of what you are searching.
One of the coolest features is fuzzy search: you might not know exactly what you are searching for, or you might spell it wrong, and Elastic Search magically finds the correct values. When ranking the values, Elastic Search uses something called Levenshtein distance to rank which values give the best results. The real trick is how Elastic Search does this without going through the entire database computing and ranking this distance for everything. The answer is some sophisticated transformations of what you entered that limit the number of reads it needs to do to find matching terms; combined with the good analyzers above, this turns out to be extremely effective and very performant.
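For example, a fuzzy query in the Elastic Search query DSL looks roughly like this (index and field names invented for illustration): even with the misspelling “jhonsen”, indexed values like “johnson” will still be found and ranked.

```csharp
public static class FuzzySearchExample
{
    // Request body for POST /customers/_search (index and field names invented).
    // "fuzziness" is the maximum Levenshtein edit distance allowed, so the
    // misspelled term "jhonsen" can still match indexed values like "johnson".
    public const string FuzzyQuery = @"
    {
      ""query"": {
        ""fuzzy"": {
          ""name"": { ""value"": ""jhonsen"", ""fuzziness"": 2 }
        }
      }
    }";
}
```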
Notice that since these search engines don’t search the data directly, they won’t be real time. The search database is only updated infrequently, so if data is being rapidly added to the real database, it won’t show up in these types of searches until the next update processes it. Often these synchronization updates are only performed once per day. You can tune these to be more frequent and you can write code to insert critical data directly into the search database, but generally it’s easier to just let it be updated on a daily cycle.
When searching enterprise databases there has to be some care in applying security, ensuring the user has the rights to search through whatever they are searching through. There have to be controls added so that the enterprise API can only search indexes that the user has the rights to see. Even if the search results don’t display anything sensitive, they could still leak information. For instance, if you don’t have rights to see a customer’s orders, but can search on them, then you could probably figure out how many orders are placed by each customer, which could be quite useful information.
Certainly when returning search results you wouldn’t reveal anything sensitive like, say, salaries; to get those you would need to click a link to drill down into the application’s UI, where full security screening is done.
Elastic Search is a very powerful search technology that is quite easy to deploy and quite easy to integrate into existing systems. A lot of this is due to a powerful RESTful API to perform operations.
In the early days of computing you could only run one program at a time on a PC. This meant if you wanted to run 10 programs at once you needed 10 computers. Then bit by bit multitasking made its way from mainframes and Unix to PCs, which allowed you to run quite a few programs at a time. Doing this meant you could run all 10 programs on one computer and this worked quite well. However it was still quite a high overhead, since each program used a lot of memory and switching between them wasn’t all that fast. This led to the idea of multi-threading, where you run very lightweight tasks inside a single program. These use the same memory and resources as the program they are running in, so switching between them is very quick and the resources used by adding more threads are minimal.
Enter the Web
Think about how this affects you if you are building a web server, where you basically want to run your programs on the server, and consider what happens if you are running in the cloud. If you were single-process, then each web user running your app would have a separate VM to handle their requests, and they would interact with that VM. There would be a load balancer that routes each user’s requests to the appropriate VM. This is quite an expensive way to run, since you typically pay quite a bit a month for each VM. You might be surprised to learn that there are quite a few web applications that run this way. The reason they do this is greater security, since in this model each user is completely separated from the others, as they are really running against separate machines.
The next level is to have the web server start a separate process to handle the requests for a given user. Basically when a new user signs on, a new process is started and all their requests are routed to this process. This model is typically used by applications that don’t want to support multi-threading or have other concerns. Again quite a few web applications run this way, but due to the high resource overhead of each process, you can only run at best a hundred or so users per server. Much better than one per VM, but still, for the number of customers companies want to use their web site, this is quite expensive.
The next level of efficiency is to have each new user that signs on just start a new thread. This has far less overhead, since each thread uses only a small amount of thread local storage and switching between running threads is very quick. Now we are getting into having thousands of active users running off each web server.
This isn’t the whole story. The next step is to make your application stateless. This means that rather than each user getting their own thread, we put all the threads in a common pool. Then when a request for a user comes in, we just use a free thread from the pool to process the request. This way we don’t keep any state on the server for each user, and we only need enough threads to handle the number of active requests at a given time. This means that while a user is thinking or reading a response, they are using no server resources. This is how you get web applications like Facebook that can handle billions of users (of course they still use tens of thousands of servers to do this).
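Here is a toy sketch of the stateless model in .NET: requests carry everything the server needs, so any free thread from the shared pool can serve any user’s request and nothing is held per user between requests.

```csharp
using System;
using System.Threading;

public static class StatelessRequestDemo
{
    // Any free pool thread can serve any user's request, because everything the
    // server needs travels with the request itself; nothing is kept per user.
    public static void HandleRequest(string user, string action)
    {
        ThreadPool.QueueUserWorkItem(_ =>
            Console.WriteLine($"Thread {Thread.CurrentThread.ManagedThreadId} handled '{action}' for {user}"));
    }

    public static void Main()
    {
        for (int i = 1; i <= 5; i++)
            HandleRequest("user" + i, "open-home-page");

        Thread.Sleep(500);   // let the pool threads finish before the demo exits
    }
}
```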
These techniques aren’t only done in the operating system software, but modern hardware architectures have been optimized for these techniques as well. Modern server CPUs have multiple cores which are very efficient at running multiple threads in parallel. To really take advantage of the power of these processors you really need to be a multi-threaded application.
Sage 300 ERP
As Sage 300 moves to the cloud, we have the same concerns. We’ve been properly multi-process since our 32-Bit version, back in the version 4 days (the 16-Bit version wasn’t really multi-process because 16-Bit Windows wasn’t properly multi-process).
We laid the foundations for multi-threaded operation in version 5.6A and then fully used it starting with version 6.0A for the Portal and Quote to Orders. Since then we’ve been improving our multi-threading as it is a very foundational component to being able to utilize our Business Logic Views from Web Applications.
If you look at a general textbook on multi-threading it looks quite difficult, since you have to be very careful to protect the right memory at the right time. However, a lot of the time these books are looking at highly efficient parallel algorithms, whereas we want a thread to handle a specific request for a specific user to completion. We never use multiple threads to handle a single request.
From an API point of view this means each thread has its own .Net session object and its own set of open Sage 300 Business Logic Views. We keep these cached in a pool to be checked out, but we never have more than one thread operating on one of these at a time. This then greatly simplifies how our multi-threading support needs to work.
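A simplified sketch of that pooling idea follows; the session type is a placeholder, not the actual .Net API class. A request checks a session out, uses it on a single thread, and checks it back in when done.

```csharp
using System.Collections.Concurrent;

// Stand-in for a .Net session object plus its set of open Views.
public class PooledSession
{
    // ... session handle, open Views, etc. ...
}

public static class SessionPool
{
    private static readonly ConcurrentBag<PooledSession> Pool = new ConcurrentBag<PooledSession>();

    // A request checks a session out for its duration; only the one thread
    // handling that request ever touches it.
    public static PooledSession CheckOut()
    {
        PooledSession session;
        return Pool.TryTake(out session) ? session : new PooledSession();
    }

    // Returning it lets a later request reuse the already-open Views.
    public static void CheckIn(PooledSession session)
    {
        Pool.Add(session);
    }
}
```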
If you’ve ever programmed our Business Logic Views, you’ll know they have had the idea of being multi-threaded built into them from day one. For instance, all variables that need to be kept from call to call are stored associated with the View handle; there are no global variables in these Views. Further, since even single-threaded programs open multiple copies of the Views and use them recursively, a lot of this support has been fully tested because it’s required for those cases as well.
For version 5.6A we had to ensure that there was a thread-safe alternative for every API call, and that any call that wasn’t thread safe was deprecated. The sort of thing that causes threading problems is an API function that, say, just returns TRUE or FALSE on whether it succeeds, and if you want to know the real reason you need to check a global variable for the last error code. The regular C runtime has a number of functions of this nature, and we used to do this for our BCD processing. Alternatives to these functions were added that just return the error code. The reason the global variable is bad is that another thread could call one of these functions and reset the variable in between you getting the failed response and checking it.
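The same hazard is easy to show in miniature. In the first style the caller must read a shared last-error variable after the call, which another thread can overwrite in the meantime; in the second the error code comes back with the call itself. These are toy functions, not the actual Sage 300 APIs.

```csharp
public static class ErrorHandlingStyles
{
    // Thread-unsafe style: the caller must read LastError after the call,
    // but another thread may have overwritten it by then.
    public static int LastError;

    public static bool PostBatchUnsafe(int batchNumber)
    {
        LastError = batchNumber < 0 ? 1234 : 0;   // some error code
        return LastError == 0;
    }

    // Thread-safe alternative: the error code is returned with the call itself,
    // so there is no shared state for another thread to clobber.
    public static int PostBatchSafe(int batchNumber)
    {
        return batchNumber < 0 ? 1234 : 0;        // 0 means success
    }
}
```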
If you’ve worked with our Views you will know that they are quite state-full. We can operate statelessly for simple operations like basic CRUD operations on simple objects. However for complicated data entry (like Order Entry or Invoice Entry) we do need to keep state while the user interacts with the document. This means we aren’t 100% stateless at this point, but our hope is that as we move forwards we can work to reduce the amount of state we keep, or reduce the number of interactions that require keeping state.
Fortunately testing tools are getting better and better. We can test using the Visual Studio Load Tester as well as using JMeter. Using these tools we can uncover various resource leaks, memory problems and deadlocks which occur when multiple threads go wrong. Static code analysis tools and good old fashioned code reviews are very useful in this regard as well.
As we progress the technology behind Sage 300, we need to make sure it has the foundations to run as a modern web application and our multi-threading support is key to this endeavor.
This is a guest blog posting by my wife, Cathalynn Labonté-Smith, though I’m the one answering the questions.
It may seem odd to readers to interview the man I’ve looked across the dinner table at for 29 plus years in his own blog, but we’ve had a recent addition to our household, Ian. Steve’s nephew is an enthusiastic young man who is in a programmer’s boot camp (see Steve’s Blog entry The Times They Are a Changin) and as an educator this has brought to my mind new questions for my darling husband beyond, “How was your day?” and “Will you be able to fit in a vacation around your business travel this year?” Also, he didn’t like my alternate idea of a Valentine to Computing.
We got out of the habit of talking about the details of Steve’s work since the time I worked as a technical writer in the field of wireless technology nearly a decade ago. For couples out there who both work in the same or related fields, you will know what I mean when I say it’s just best to unwind and avoid topics to do with work in the off hours.
When I left tech writing and became a teacher, occasionally I’d walk into a business class that was learning Accpac for Windows or Simply Accounting. Trained as an English teacher I’d do what all on-call teachers do when outside their subject area: stick to the lesson plan, get help from the brightest students in the class and muddle through as best I could. So it was fun to share those experiences with Steve and I actually learned a bit about the Sage products.
It’s been many years since I’ve been in the classroom, but having taught career preparation I want to know the following from Steve for programmers coming on stream. I know that Steve’s blog audience is unlikely to be junior programmers but I thought this might get his more senior executive readers thinking about what legacy they can pass along to new programmers.
Whoa, I can hear you say: what makes you think they can hear us with their ears jammed with earbuds, and even if they could, we don’t speak their lingo? I’m not saying they’re going to sit through a PowerPoint of your ruminations; really, the best example is modelling, and as a teacher I found that it was an equal exchange. You can learn as much from your novice employees as they can learn from you, just about different things.
When I met Steve he was a Teacher’s Assistant in the Math Department at the University of British Columbia working on his Master’s Degree. His Math 100 class was just him, the blackboard, a huge lecture hall packed full of nervous first-years and a piece of chalk. I was never his student, no; I was on the other side of campus in Creative Writing workshops in poetry, fiction and children’s writing.
After his degree, he worked at various software companies in many different fields as a contractor, consultant or employee before finding his long-time home at Sage. Given that he now has over twenty years at Sage, currently as Chief Architect, I’m curious as to what Uncle Steve would say to Ian if he were around longer than it takes to gulp down his dinner and head upstairs for more studying.
1. Steve, what kind of guidance can you offer for formal programs a would-be programmer should choose for the best future employment and advancement? Can you compare it to your formal programming education?
My undergraduate and master’s degrees are in Mathematics and not Computer Science. However, I took a few CS courses along the way (in things like Numerical Analysis and Operations Research), so strictly speaking I don’t have a formal CS background.
I was in the Co-op program at the University of Victoria so when I did graduate I had four work terms of job experience. Plus, I was always working on some sort of programming project on my trusty Apple II Plus computer (usually involving Fractals).
It doesn’t really matter so much which programming languages you learn, just learn a variety. After all, things are changing so fast these days that you need to expect to keep learning these as you progress through your career.
To summarize, you need something that will give you lots of practice programming, a few formal courses to give you credibility and you need to be a voracious reader.
2. In your undergraduate degree, you went through a co-op program. Is this something that you recommend and why? For example, does it make a programmer more desirable as a future employee?
A. Yes, absolutely. I think intern type programs are terrific ways to get job experience and references ready for that first real job. I did four co-op work terms and learned an awful lot about how various companies operate and what is involved. It is a great chance to get some experience with a variety of companies, perhaps a large one, a small one and a government one. I certainly give credit for co-op work terms when I’m hiring.
3. What kind of summer, part-time or volunteer work might add to and develop their skills?
A. I would look for something where you are giving back to the community, such as donating your time to a charity and if you have the chance to travel when you do this then even better. Again do something that interests you and you are passionate about.
4. What kind of advice can you give new programmers about how to pick their first employer?
A. Chances are you are going to have several jobs throughout your career. More than likely the pay will be similar, so go for something interesting. Do some research on the companies you are applying to and look beyond the initial job you will have there. Also, consider travelling to a new location for your first job to get a bit more experience of the world as well.
5. Just like some doctors are better at staying current on the latest treatments and research, how do programmers stay current when there seem to be so many new technologies and programming languages to learn? How do you manage to filter through all of it to get what will last and have future value? Or is it even critical that programmers stay current, or is there enough maintenance work to go around forever?
A. I think the number one rule is to not rely on your employer for this. This is really your own professional responsibility. Employers will train you for what you need immediately but usually not for much else and not for things that they aren’t interested in.
One of the great things about the profession today is that most of the programming tools that are important are either open source or have free versions available (like Visual Studio Express). So you can dabble with all sorts of things in your spare time. All you really need is a computer and an Internet connection. I really believe in learning by doing. So pick something new and interesting and do a small project in it to see if you want to go deeper.
6. What are some common pitfalls new programmers could avoid in their early careers?
A. I think the most common pitfalls are either being too loyal to a company or giving up on a company too easily.
Often people have very high and probably unrealistic expectations of how well a company is run. This gives rise to a lot of changing jobs after quick stints, which can be a mistake if you don’t get ahead and end up with a resume full of short stays.
The reverse is the other common mistake—being in a job that doesn’t work, but trying to stick it out too long rather than cutting the cord. Leaving is often a hard decision to make, but is often easier earlier in your career. Finding the right compromise between these two extremes can be very difficult.
7. What is the most valuable lesson or lessons that you’ve learned throughout your career that you could share with a new programmer?
A. That things are often darkest before the dawn. On any project at some point things are going to look bad, problems look unsolvable, bugs are piling up and deadlines are being missed. The lesson here is not to take the whole world’s problems on your shoulders, but to just work through the problems one by one. Often these are difficult problems that take much more time than you would have thought, but sticking to this eventually yields the light at the end of the tunnel.
Another take on this is to remain optimistic in the face of adversity. Or follow the Hitchhiker’s Guide to the Galaxy’s main advice: Don’t Panic! (Their other advice of always carry a towel, I’m not so sure about).
8. Who were your early role models?
A. Bill Gates and Steve Wozniak for what they did to start their companies. Steve Jobs for what he did when he returned to Apple.
9. Is there anything you would have done differently in your early career knowing what you know now?
A. There are always so many shoulda coulda wouldas. Now I know which companies back then paid the big bucks in stock options, but it’s hard to predict when looking forwards. I sometimes wonder if I should have moved from Vancouver, but then you get a beautiful day like today and just say “Nah”.
10. Is there a question that I didn’t ask that you wished I did?
A. No, this blog is already getting quite long. 🙂
Point taken, Steve, this is a good place to wrap it up. Oh and, Happy Valentine’s Day, to you and to all your readers.
We hold a developer’s exchange (DevEx) every couple of weeks where one of our developers volunteers to present to all the other developers in our office. This past week I presented at the DevEx on what all the core DLLs in our Sage 300 runtime folder do. I thought this might be of interest for a wider audience so here are the gory details.
Our marketing-supplied architecture diagram is the following, which highlights our three tiers and hides a lot of the details of how the object repository, APIs and supporting services are implemented. I’ve blogged previously on our Business Logic Views. In this article I’m going to go into more detail on all the DLLs that provide the framework to support all of this.
Lower Level DLLs
If you are an ISV developing Sage 300 SDK applications or have worked for Sage on the 300 product then you will have had to encounter a number of these DLLs. I’m only looking at a subset of current DLLs, and I’m not looking at all the DLLs that support older technologies that are still present to maintain compatibility with add-ons.
I didn’t add arrows to this diagram since everything pretty well calls everything else below it, but I did segregate the DLLs a bit by how low or high level they are. So here is a quick synopsis of each one:
A4wcompat.dll: We created this DLL back when we did a native port of the Sage 300 Views for Linux. This DLL isolates operating system differences that need more than some clever #defines. A big part of this is the thread and process synchronization and locking support. Even though we never released the native Linux version, this isolation of operating system dependent parts has made adding multi-threading support, 64 bit support and Unicode support easier.
A4wmem32.dll: In 16 bit Windows, the built in memory management was really slow, so everyone used their own. Now this DLL uses the Windows and C default memory management, but is still important for global memory that needs to be shared across processes. Originally this was done through the data segment of a fixed DLL, but now is done through memory mapped files.
A4wlleng.dll: This is just a language DLL that holds some lower level error messages used by System manager.
A4wsqls.dll: This is the SQL Server database driver (there is also a4worcl.dll for Oracle and a4wbtrv.dll for Pervasive.SQL). This is dynamically loaded based on the type of database you are connecting to. For more on our database support see this article.
Cato3msk.dll, cato3dat.dll: The cato3 DLLs are the old CA common controls. We don’t use these in our UIs anymore, but cato3msk.dll provides our mask processing that is used by the Views. Similarly we don’t use this date control, but do use a routine here to format dates in error messages correctly.
A4wroto.dll: This handles the loading of the various View DLLs as well as the various UIs we’ve used in the past. It loads the roto.dat files and handles loading the right DLLs when View subclassing is going on or stub Views need to be used.
A4wsem.dll: This handles the locking of the semaphor.bin file. It allows processes to lock the company database, an application or the whole site. It also handles application specific cross workstation locking needs.
A4wrv.dll: This is the main DLL API entry point for the Views. It manages all the calling of the Views and handles other tasks like sending the calls for macro recording. For more on our View interfaces see this article.
A4wapi.dll: This is quite a hodge-podge of services for the Views like revision lists, error reporting and such. It also has support routines for the older CA-Realizer UIs. This is quite a big DLL and has most of our C level API in it.
A4wrpt.dll: This is our interface to Crystal Reports, it started as our interface to CA-RET then was converted to Crystal using their CRPE DLL interface, then converted to Crystal’s COM interface and now uses Crystal’s .Net Interface.
A4wprgt.dll: This DLL handles replicating the system database tables into the various company databases when needed.
A4wmtr.dll: This is our meter DLL for long running processes. It can either put up a meter dialog or just report back to the caller, the current status and percent complete. It also provides the API for cancelling long running processes.
Higher Level APIs
The next level are some of the DLLs that make up our Java, COM and .Net interfaces. There is a bit of complexity here due to how our previous web deployed system worked. Here we could communicate back to the server originally using DCOM and then later with .Net Remoting. The .Net Remoting layer provides both the communications layer for this web deployed mode and also acts as our .Net API. Depending on how you create your original session, different DLLs and calling conventions are used.
A4wapiShim.dll: This is the C side of our Java JNI layer. It talks to all the lower level DLLs to get its work done.
Sajava.jar: This is the Java side of our Java JNI interface. This allows Java programs to easily call Java classes to interface to our Business Logic Views. For more on this interface see this article.
A4wcomsv.dll: This is the main workhorse for the COM and .Net APIs. It does all the heavy lifting and interfacing to the core DLLs.
Accpac.Advantage.COMSVR.Interop.dll: This just performs the .Net to COM transition which is created by the MS tools.
Accpac.Advantage.Server.dll: Server side of the .Net API, handles the .Net Remoting requests if remotely called or just passes through otherwise.
Accpac.Advantage.Types.dll: Defines all the various types we use in our .Net API.
Accpac.Advantage.dll: This is the main external interface for our .Net API. For more on our .Net API see the series of articles starting with this one.
A4wcomexps.dll: Used when the VB UIs are going to talk .Net Remoting, this DLL is inside a4wcomex.cab.
A4wcomex.dll: The main entry point for the COM API.
Many More DLLs
There are many more DLLs in the Sage 300 runtime, but most of the others are for obsolete APIs like the xapi, the older a4wcom COM API, the cmd API, the icmd API, etc. There are other important ones, like those to do with Database Setup, but these are the main ones used when you talk to the Business Logic through one of the popular APIs.
For anyone interested this should give you a good idea of what the main DLLs in the runtime folder do. And give you an idea of how the various services in Sage 300 ERP are layered.
My first computer was an Apple II Plus, which didn’t even support lower case characters. Everything was upper case. To do word processing you used special characters to change case. Now we expect our computer to not just handle upper and lower case characters, but accented characters, special symbols, all the Asian language characters, all the Arabic characters and everything else.
In the beginning there was ASCII, which allowed computers to encode the alphabet, numbers and the common typewriter characters, all 128 of them. Then we added another 128 characters for accented characters. But there were quite a few different accented characters, so we had a standard first 128 characters and then various options for the upper 128 characters. This allowed us to handle most European languages on computers. Then there was the desire to support Chinese characters, which number in the tens of thousands. So the idea came along to represent these as two bytes or 16 bits. This worked well, but it still only supported one language at a time and often ran out of characters. In developing this there were quite a few standards and quite a bit of incompatibility when moving files containing these between computer systems. But generally the first 128 characters were the original ASCII characters and the rest depended on the code page you chose.
To try to bring some order to this mess and make the whole problem easier, Unicode was invented. The idea here was to have one character set that contained all the characters from all the languages in the world. Sounds like a good idea, but of course Computer Scientists underestimated the problem. They assumed this would be at most 64K characters and that they could use 2 bytes to represent each character. Like the 640K memory barrier, this turned out to be quite a bad idea. In fact there are now about 110,000 Unicode characters and the number is growing.
Unicode specifies all the characters, but it allows for different encodings. These days the two most common are UTF8 and UTF16. Both of these have pros and cons. Microsoft chose UTF16 for all their systems. Since I work with Sage 300 and since we are trying to solve this on Windows that is what we will discuss in this article. To convert Sage 300 to Unicode using UTF8 would probably have been easier since UTF8 was designed to give better compatibility with ASCII, but we live in a Windows UTF16 world where we want to interact well with SQL Server and the Windows API.
Microsoft adopted UTF16 because they felt it would be easier: basically each string became twice as long, since every character was represented by 2 bytes. Hence memory doubled everywhere and it was simple to convert. This was fine except that 2 bytes no longer holds every Unicode character, so some characters actually take two 16-bit slots. But generally you can mostly predict the number of characters in a given amount of memory. It also lends itself better to using array operations rather than having to step through strings with next/previous operations.
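You can see this directly in .NET, where strings are UTF16: most characters occupy one 16-bit char, but a character outside the basic 64K plane takes two (a surrogate pair), so the char count and the user-visible character count can differ.

```csharp
using System;
using System.Globalization;

public static class Utf16Demo
{
    public static void Main()
    {
        string cafe = "café";          // every character fits in one 16-bit char
        string dragon = "🐉";          // outside the basic plane: a surrogate pair

        Console.WriteLine(cafe.Length);                                   // 4
        Console.WriteLine(dragon.Length);                                 // 2 chars...
        Console.WriteLine(new StringInfo(dragon).LengthInTextElements);   // ...but 1 character
    }
}
```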
Windows took the approach that to maintain compatibility they would offer two APIs, one for ANSI and one for Unicode. So any Windows API call that takes a string as a parameter has two versions, one ending in A (for ANSI) and one ending in W (for Wide). Then in Windows.h, if you compile with UNICODE defined it uses the W version, else it uses the A version. This certainly adds a lot of pollution to the Windows API, but they maintained compatibility with all pre-existing programs. This was all put in place as part of Win32 (since recompiling was necessary).
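The split is visible from any language that calls the Windows API directly. For example, from .NET you can bind explicitly to either entry point; this is a standard P/Invoke illustration, not Sage 300 code.

```csharp
using System;
using System.Runtime.InteropServices;

public static class AnsiVsWide
{
    // The ANSI version: strings are marshalled as single-byte characters.
    [DllImport("user32.dll", EntryPoint = "MessageBoxA", CharSet = CharSet.Ansi)]
    private static extern int MessageBoxA(IntPtr hWnd, string text, string caption, uint type);

    // The Unicode (wide) version: strings are marshalled as UTF-16.
    [DllImport("user32.dll", EntryPoint = "MessageBoxW", CharSet = CharSet.Unicode)]
    private static extern int MessageBoxW(IntPtr hWnd, string text, string caption, uint type);

    public static void Main()
    {
        MessageBoxW(IntPtr.Zero, "Unicode text: café", "Sample", 0);
    }
}
```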
For Sage 300 we’ve resisted going all in on Unicode, because we don’t want to double the size of our API and maintain that for all time, and if we do release a Unicode version then it will break every third party add-in and customization out there. We have the additional challenge that Unicode doesn’t work very well in VB6.
But with our 64 Bit version, we are not supporting VB6 (which will never be 64Bit) and all third parties have to make changes for 64 Bit anyway, so why not take advantage of this and introduce Unicode at the same time? This would make the move to 64 Bit more work, but hopefully will be worth it.
Why Switch to Unicode
Converting a large C/C++ application to Unicode is a lot of work. Why go to the effort? Sage 300 has had traditional and simplified Chinese versions for a long time. What benefits does Unicode give us over the current double byte system we support?
One is that in double byte, only one character set can be installed on Windows at a time. This means for our online version we need separate servers to host the Chinese version. With Unicode we can support all languages from one set of servers, we don’t need separate sets of servers for each language group. This makes managing the online server farm much easier and much more uniform for upgrading and such. Besides our online offerings, we have had customers complain that when running Terminal Server they need separate ones for various branch offices in different parts of the world using different languages.
Another advantage is that we can now support mixtures of scripts, so users can enter Thai in one field, Arabic in another and Chinese in a third. Perhaps a bit esoteric, but it could have uses for optional fields where there are different ones for different locales.
Another problem we tend to have is with sort orders in all these different incompatible multi-byte character systems. With Unicode this becomes much more uniform (although there are still multiple of these) and much easier to deal with. Right now we avoid the problem by limiting key fields to upper case alphanumeric. But perhaps down the road with Unicode we can relax this.
A big advantage is ease of setup. Getting the current multi-byte systems working requires some care in setting up the Windows server, which often challenges people and causes problems. With Unicode, things are already set up correctly, so this is much less of a problem.
Converting Sage 300
SQL Server already supports Unicode. Any UI technology newer than VB6 will also support Unicode. So that leaves our Business Logic layer, database driver and supporting DLLs. These are all written in C/C++ and so have to be converted to Unicode.
We still need to maintain our 32-Bit non-Unicode version and we don’t want two sets of source code, so we want to do this in such a way that we can compile the code either way and it will work correctly.
At the lower levels we have to use Microsoft’s tchar.h file which provides defines that will compile one way when _UNICODE is defined and another when it isn’t. This is similar to how Windows.h works for the Windows runtime, only it does it for the C runtime. For C++ you need a little extra for the string class, but we can handle that in plustype.h.
One annoying thing is that to specify a Unicode string literal in C you write L"abc", and with the macro in tchar.h you change it to _T("abc"). Changing all the strings in the system this way is certainly a real pain, especially since 99.99% of them will never contain a non-ASCII character because they are for debugging or logging. If Microsoft had adopted UTF8 this wouldn’t have been necessary since the ASCII characters are the same, but with UTF16 this, to me, is the big downside. But then it’s pretty mechanical work and a lot of it can be automated.
At higher levels of Sage 300, we rely more on the types defined in plustype.h and tend to use routines from a4wapi.dll rather than using the C runtime directly. This is good, since we can change these places to compile either way and hide a lot of the details from the application programmer. The other benefit is that we only need to convert the parts of the system that deal with the database and the parts that deal with string handling (like error messages).
One question that comes up is what the length of fields in the database will be. Right now if a field is 60 characters then it’s 60 bytes. Under this method of converting the application, the field will be 60 UTF16 characters, or 120 bytes. (This is true as long as you don’t use the special characters that require 4 bytes, but most characters are in the standard 64K block.)
Moving to both 64 Bits and Unicode is quite an exciting prospect. It will open up the doors to all sorts of advanced features, and really move our application ahead in a major way. It will revitalize the C/C++ code base and allow some quite powerful capabilities.
As a usual disclaimer, this article is about some research and proof of concept work we are doing and doesn’t represent a commitment as to which future version or edition this will surface in.