The Dark Side of Objects

The Dark Side of Objects?  (Luke, I am your father.) 

Sometimes you need reins on developers and designers.  Not because they aren’t doing a good job, but because if you don’t, you may end up in a quagmire of objects that no one can understand. Objects are good, but they can be overdone.  Not everything should be an object and not everything lends itself to being objectified.  Sometimes a developer goes too deep when trying to create objects.

When I was learning about objects I had a great mentor who understood the real-world boundaries of objects:  when to use them, how to use them and how far to decompose them into additional objects.  Shortly after having "seen the light" with regard to objects I was helping a young man (okay, at my age everyone is young) write an application which was actually the sequel to the data entry application I mentioned in the previous note.  He needed to do some funky calculations so he created his own numeric objects.  Instead of using the built-in Integer types he decided that he would create his own Number object.  This Number object would have a collection of digits.  When any calculations needed to be done he would tell one of the digits the operation to be performed and let that digit tell the other digits what to do.  Well, this gave him a method whereby he could perform any simple numeric operation (+-/*) on a number with a precision of his own choosing.  He spent weeks on perfecting this so that his number scheme could handle integers and floating point numbers of any size.  It was truly a work of art.

And weeks of wasted time.

What he needed to do was multiply two numbers together or add up a series of numbers.  Nothing ever went beyond two decimal places of precision and no amount was greater than one million.  These are all functions built into the darn language and didn’t need to be enhanced or made better.  The developer got carried away with objects and objectified everything, even where it didn’t need to be done or, in this case, shouldn’t have been done.
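To give a feel for how little code the real requirement called for, here is a minimal sketch.  It is written in Python purely for brevity (the original application's language is not shown in this note), it uses the standard library's decimal type as a stand-in for whatever built-in numeric type was available, and the function names and amounts are made up for the example.

```python
from decimal import Decimal

def extended_price(unit_price, quantity):
    """Multiply two numbers and keep two decimal places."""
    return (Decimal(unit_price) * Decimal(quantity)).quantize(Decimal("0.01"))

def total(amounts):
    """Add up a series of amounts."""
    return sum((Decimal(a) for a in amounts), Decimal("0.00"))

# Hypothetical amounts, all comfortably under one million.
line_items = [extended_price("19.99", "3"), extended_price("250000.00", "2")]
order_total = total(line_items)
```

Two small functions built on a type that ships with the language, with no custom Number object and no collection of digits.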

Knowing when to stop using objects is just as important as knowing when to use objects.

Long Running Web Services

OK, the world is moving to web services.  I know that.  You know that.  So what more is there to discuss?  PLENTY!! For instance, how long should it take for a web service to complete?

Well, that’s kind of a tricky question.  It basically comes down to "what is the web service doing?"  Some things should come back quickly.  Darn quick, in fact.  For instance, if you ask a web service, "What is the name of the person associated with this identifier?" you should be getting a response back in milliseconds.  If you are asking a web service "What course marks did this student get in high school?" you should be getting a response back in milliseconds.  If you are asking a web service "What are the names of all of the people associated with this school district?" you should be getting a response back in milliseconds.

What?  Getting the names of hundreds, potentially thousands of people, in just milliseconds?  Are you nuts?

Read carefully what I wrote: "… getting a response back in milliseconds."  A perfectly valid response is: "Thank you for your request.  It is being processed and the results will be made available at a later date."  Web services should not be long running processes.  If they are long running or have the potential to be long running, then you need to implement some sort of callback mechanism to indicate when processing has finished.  This callback mechanism may be an email, a call to another web service with the results, depositing the results in a queue to be picked up later or even a combination of these methods.  Indeed, there are literally dozens of ways to get the response back to the caller.  What is important to understand is that you do not create a web service that has the potential to run for a long period of time.  Ever.  I’m serious about this one.
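Here is a minimal sketch of that "accept now, answer later" shape.  It is in Python for brevity and makes several assumptions: the in-memory queue stands in for a real message queue, the callback stands in for an email or a call to another web service, and the names submit_request, lookup_names and process_pending are hypothetical.

```python
import queue
import uuid

pending = queue.Queue()  # work waiting for a background worker (stand-in for a real queue)

def submit_request(district_id, callback):
    """Called by the web service: record the request and answer immediately."""
    ticket = str(uuid.uuid4())
    pending.put((ticket, district_id, callback))
    return {"ticket": ticket, "status": "accepted"}

def lookup_names(district_id):
    """Placeholder for the slow query against the back end."""
    return [f"Person {n} in district {district_id}" for n in range(1, 4)]

def process_pending():
    """Run by a background worker or scheduled job, never inside the caller's request."""
    while not pending.empty():
        ticket, district_id, callback = pending.get()
        callback(ticket, lookup_names(district_id))  # deliver the result to the caller

# Usage: the caller gets its answer back right away and the result arrives later.
received = []
response = submit_request(42, lambda ticket, names: received.append((ticket, names)))
```

At this point response already says "accepted" and received is still empty.  Nothing slow happens until process_pending() runs somewhere else.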

Other than the fact that you are probably going to hit the HTTP timeout, COM+ timeout, or cause an excessive number of locks to be held in the database, what other reasons could there be?  Well, imagine it from a Denial of Service perspective.  If one call to this web service can generate so much synchronous work, what would 10, 100, or even 1,000 simultaneous calls to this web service do to the underlying server?  Crash it?  Cause it to perform so slowly that everyone gets timed out?  Bring down the back end database?  "But, Don, this is going to be an internal web service only.  What’s the problem?"  Depending upon which survey you read, anywhere from 10% to 50% of all attacks on web sites are from insiders.  Imagine a disgruntled employee and what damage he could do to the system with just a little bit of knowledge.

While this topic is ostensibly about web services, we should not create any service (COM+, Web, WCF enabled) that takes a long time to execute.  If you are in the least bit confused about whether something should be synchronous or asynchronous in nature, the odds are it should be asynchronous.  Err on the side of caution.

The Good Old Days

Why is it that the good stories start with "When I was younger …"?

Anyway, when I was younger I was working on a project to replace an aging mainframe-based system with a new web-based application.  The web-based system utilized the existing database and a new database to create a whole new system that greatly increased the functionality and usability of the entire product.  One of the response time requirements we had was not with regard to individual screens, nor with regard to specific business processes, but with the length of the transaction in the database.  In order to get as much throughput as possible and minimize the amount of locking, the requirement was that the new system operate in much the same manner as the old system and provide an average database transaction length of no more than 0.4 seconds.

400 milliseconds.

That is not a lot of time no matter how you look at it.  This 400 millisecond period was the average length over the course of a business day, but did not include any batch or asynchronous processing that occurred.  This helped us out considerably because we had a lot of short transactions which lowered the average and a smaller number of longer transactions that raised the average.
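To see how the mix of short and long transactions plays out, here is a bit of arithmetic as a Python sketch.  The counts and durations are invented for illustration and are not measurements from the actual system.

```python
# Hypothetical business day: 950 short transactions and 50 long ones (milliseconds).
durations_ms = [200] * 950 + [4000] * 50

# Average over everything that ran online.
average_ms = sum(durations_ms) / len(durations_ms)

# The same day if the 50 long transactions are moved to batch or asynchronous
# processing, which is not counted in the online average.
online_only = [d for d in durations_ms if d < 1000]
online_average_ms = sum(online_only) / len(online_only)
```

With these made-up numbers, only 5% of the transactions are long, yet they push the average from 200 up to 390 milliseconds.  That is why moving long-running work out of the online day makes such a difference.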

Man, did we suck when we went live.  Over 2000 milliseconds for the first week and this did not include any of the deadlocks or timeouts that occurred. It took months, actually 18 of them, before we had things down to not just 400 milliseconds, but an average of just over 300 milliseconds. New hardware on the mainframe helped, but so did the fact that we worked really hard at lowering that average and we understood that anything that was going to take a long time was immediately turned into an asynchronous process or even part of a batch run that night.

The users understood this change in philosophy.  In order to get good online processing for everyone involved there was a need to do things asynchronously or in batch.  Processing something in the middle of the day, when everyone is using the system, is not always necessary.  Even if a short turn around is desired, asynchronous processing can be a valuable alternative.

Funny, but this seems remarkably similar to yesterday’s note about web services and performance.  See, everything old is new again.  The problem isn’t new, but the technology is.  The solution isn’t new either, but the will to implement it might be.

When not to choose the default

I talked recently about how you should leave the defaults the way they are in many cases because, well, for most circumstances they are probably the best values to use.  Sometimes, however, the default doesn’t work that well and you need to understand the reasons why changing the default is a good thing.

Suppose you went to a web site and it asked you to fill in 10 fields.  You then hit submit and it came back and told you that field number 1 is supposed to be numeric, not alphanumeric.  You make the change and then submit.  It then comes back and tells you that field number 4 is supposed to be less than field number 3.  You keep doing this for a number of changes until you finally say "Forget it" and you leave that site forever.  It’s happened to me, so I can honestly speak from experience.

My biggest problem with the process wasn’t so much the one error message at a time, but rather the fact that there was a round trip to the server for every error.  I had over half a dozen interactions with the server to fill out a darn form!!!  By default .NET sets the controls you place on an ASP.NET page to process interactions at the server (runat="server").  If you provide complete error checking for each page, then this may be a suitable method of operating.  However, if you only respond to the user one error message at a time, this is a sure-fire way of getting someone annoyed with you.  And quickly.

To be brutally honest, some error checking should be done at the client side.  If you have a popular application, or even if it is not that popular, there is still a certain amount of overhead involved in getting IIS to receive the request, process the header information, pass the information along to the application pool and then have the application do what it needs to do in order to tell you that the field is only supposed to contain numbers.  Then the whole thread has to go backward, towards the user, in order to give them the message.  It is faster, more efficient, and less costly from an infrastructure point of view if you let the client take care of this sort of data validation.  Your application will also check when it gets the data from the client (don’t ever assume the client is sending you perfect data), but many checks can be performed at the client end, decreasing turn around time for error processing, distributing the work load, and, more importantly, providing a better user experience.
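Whichever side does the checking, the rules should all be run in one pass so the user sees every problem at once.  Here is a minimal sketch of the server-side re-check, written in Python for brevity rather than in ASP.NET; the browser-side copy of the same rules would live in client script.  The field names and the validate_form function are hypothetical, and the two rules come from the form example earlier in this note.

```python
def validate_form(fields):
    """Check every rule and return all problems found, not just the first."""
    errors = []

    # Field 1 must be numeric, not alphanumeric.
    if not fields.get("field1", "").isdigit():
        errors.append("Field 1 must contain only digits.")

    # Field 4 must be less than field 3.
    try:
        if not float(fields["field4"]) < float(fields["field3"]):
            errors.append("Field 4 must be less than field 3.")
    except (KeyError, ValueError):
        errors.append("Fields 3 and 4 must both be numbers.")

    return errors

# One submission, one response listing every problem.
problems = validate_form({"field1": "abc", "field3": "5", "field4": "9"})
```

Here problems holds both messages, so the user can fix everything in a single round trip instead of half a dozen.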

Remember, when you’re designing your application think in terms of what provides the best user experience.  Think about your experiences, what you’ve liked or, more importantly, what you’ve disliked, and go from there.  The default, while usually good, does not have to remain if there is a good reason to change.

Schopenhauer’s Law of Entropy

So, just what is Schopenhauer’s Law of Entropy?  Simply put, it is this:

If you put a spoonful of sewage in a barrel full of wine, you get sewage.

So, what does sewage have to do with programming?  It’s not sewage that I’m looking at, but rather the concept behind it.  In IT terms, what Schopenhauer is saying is that no matter how good the overall application, if one part doesn’t work the whole application gets tarred with the same brush.

It is unfortunate that a single poorly designed, written or executed page can make someone believe that the entire application is poor.  Their perception of the application is what is important, not reality.  Kind of scary, isn’t it, when perceptions are more important than reality.  But this is what happens in our business and it is something that we need to understand and do our best to influence.

So what influences this perception?  Well, consider this:  two web applications side by side on your desktop.  You push a button on the left one and you get the ASP.NET error page:  unfriendly, cryptic and somewhat unnerving.  You push a button on the right one and you get an error message in English, that explains there is a problem and that steps are being taken to resolve the issue.  Which one would you perceive to be better written and robust? 

How about another example?  You push a button on the left application and you get an error message that says "Unexpected Error.  Press the OK button".  You push a button on the right application and you get an error message that says "Our search engine is currently experiencing some difficulties and is offline.  Please try again later."  Which one do you perceive to be better?  Which one do you think your business clients will think is better?

It’s not just one thing (error message or not) that gives you a feeling of confidence when dealing with an application, it is a multitude of little things.  Making things more personalized helps.  Translating from Geek ("Concurrency error") to English ("Someone else has updated the data before you") helps a lot.  So does making it clear that you spent some effort to foolproof the system (e.g. don’t give every error in your application the same error number).
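The Geek-to-English translation can be as simple as a lookup table with a sensible fallback.  A minimal Python sketch follows; the error codes, the table and the user_message function are all hypothetical, and the wording is borrowed from the examples in this note.

```python
# Internal error code -> message written for the person using the application.
FRIENDLY_MESSAGES = {
    "ConcurrencyError": "Someone else has updated the data before you.",
    "SearchOffline": (
        "Our search engine is currently experiencing some difficulties "
        "and is offline.  Please try again later."
    ),
}

# Used for anything we did not anticipate: still plain English, never a raw error page.
DEFAULT_MESSAGE = "There is a problem and steps are being taken to resolve the issue."

def user_message(error_code):
    """Return the English message for an internal error code."""
    return FRIENDLY_MESSAGES.get(error_code, DEFAULT_MESSAGE)
```

The internal code still goes to the log for the developers.  Only the translated text goes to the screen.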

No matter how good the rest of your application, one bad move can create sewage.

Load Balancing Failures

It shouldn’t come as a surprise to anyone when I say that our Production environment is load balanced.  I have mentioned this before and I will be mentioning it again in the future.  But, for those who may have missed my previous tirades, let me explain the impact of load balancing on Session State.

One of the features of ASP.NET is to store information specific to a browser session (aka user) into something called Session State.  Session State is kind of like the junk drawer you have at home where you have batteries, twist ties, plastic spoons, stud finders and assorted other "stuff".  Session State allows you to store what you need to store in order to keep track of where the user is in the application and what data you need to save on their behalf.  The next time the user accesses the application the session state is automatically loaded and you’re ready to rock.

There are a number of places to store session state:  In Process, Out of Process or in SQL Server. 

In Process means that Session State is going to be stored in the Application Pool running the web site.  So, if the application pool recycles, all session state is going to be lost.  A number of projects currently use this method and are in danger of losing Session State, if they use Session State, as we use Application Pool recycling to solve a number of application issues.  In addition, if, for some reason, BigIP sends the user to a different web server to service the request then the Session State is not going to be present, potentially causing a number of application failures to occur.

Out of Process is where the session state is hosted by ASP.NET in a different process, potentially on a different machine.  While somewhat safer than storing it in the same Application Pool, a problem arises if this service needs to be reset whereby the Session State is again lost.  Indeed, if the process is hosted on the same server as the web site, moving the request to another part of the load balanced cluster is going to be a problem as Session State will not be available for the request.  If Session State is stored on a separate machine then the biggest problem is that of durability of the data.  Any problems with the service may wipe out all session state for all machines.

Storing Session State in SQL Server is the slowest method, but is by far the safest method for durability and the best method when utilized in a cluster.  Each request for Session State goes out to SQL Server to ensure that the latest and greatest version of Session State for that user is retrieved and used.
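The difference between the storage options is easiest to see with a toy simulation.  The Python sketch below is not ASP.NET code: the WebServer class is hypothetical, a plain dictionary stands in for the SQL Server session store, and calling the second server directly stands in for the load balancer routing the next request elsewhere.

```python
class WebServer:
    """One node behind the load balancer."""

    def __init__(self, name, shared_store):
        self.name = name
        self.in_process = {}              # private to this node, lost on a recycle
        self.shared_store = shared_store  # stand-in for the SQL Server session store

    def save(self, session_id, data, use_shared):
        store = self.shared_store if use_shared else self.in_process
        store[session_id] = data

    def load(self, session_id, use_shared):
        store = self.shared_store if use_shared else self.in_process
        return store.get(session_id)

shared = {}
server_a = WebServer("A", shared)
server_b = WebServer("B", shared)

# In Process: the first request lands on A, the next one is routed to B.
server_a.save("user-1", {"step": 3}, use_shared=False)
in_process_result = server_b.load("user-1", use_shared=False)

# Shared store: same routing, but both nodes read and write the same place.
server_a.save("user-1", {"step": 3}, use_shared=True)
shared_result = server_b.load("user-1", use_shared=True)
```

With In Process storage, server B has never heard of the session and in_process_result comes back empty.  With the shared store, shared_result contains what server A saved.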

In our environment we have asked people to use SQL Server Session State, and yet, by looking through the web.config files of a number of projects I’ve noticed that they have their Session State set to In Process.  If Session State is actively being used, this is a recipe for disaster.  I urge each project team to take a quick look at their web.config files and change it to use SQL Server instead of In Process.  Even if you don’t currently use Session State, you may in the future and this will prevent you from having a nasty accident.

Just Because …

Just because you can do something, doesn’t mean you should.

Geeks like new things.  They like playing with new technology.  Using new tools.  Trying out stuff that they’ve never done before.  XML is a good example.  XML has a lot of really good useful purposes.  It allows for easier portability of data between disparate systems.  It is, for the most part, human readable.  It is good for configuration files.  Indeed, configuration files are probably one of the most commonly used applications for XML.

There is one thing (okay, probably more, but this topic needs to be short enough to keep your attention) that XML shouldn’t be used for:  continuously updated reference tables or codes tables.  For instance, you wouldn’t put the conversion rate between U.S. and Canadian dollars into an XML file for processing by an application because it would involve daily updates to that file.  You probably wouldn’t put the top 10 grossing movies of the week into an XML file either, as that changes quite often as well.  You could, however, put a list of provinces in XML as that normally doesn’t change that often.

We have a carefully controlled production environment.  In order for any new files to be moved into Production we first deploy them to UAT, have them tested and then moved into Production.  You can see that by putting volatile data in an XML file you are creating a lot of extra work for you, the Deployment Team and even the users who need to test the changes as every change will create a lot of overhead.

What’s the alternative?  A database.  Reference tables that have the potential to be changed on a frequent basis should never be put into XML files; they should be placed into a database with the proper screens in place to change the data.  Putting this sort of data in XML does nothing for your application in terms of flexibility or ease of use.  It is a matter of using a cool new toy when it didn’t need to be used.
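As a rough sketch of the database approach, here is the exchange rate example in Python.  An in-memory SQLite database stands in for the production database, the table, column and function names are hypothetical, and the rates are made-up numbers.

```python
import sqlite3

# In-memory SQLite as a stand-in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exchange_rate ("
    "from_code TEXT, to_code TEXT, rate REAL, "
    "PRIMARY KEY (from_code, to_code))"
)
conn.execute("INSERT INTO exchange_rate VALUES ('USD', 'CAD', 1.25)")
conn.commit()

def update_rate(from_code, to_code, rate):
    """What the maintenance screen would call: a data change, not a deployment."""
    conn.execute(
        "UPDATE exchange_rate SET rate = ? WHERE from_code = ? AND to_code = ?",
        (rate, from_code, to_code),
    )
    conn.commit()

def get_rate(from_code, to_code):
    """What the application calls whenever it needs the current rate."""
    row = conn.execute(
        "SELECT rate FROM exchange_rate WHERE from_code = ? AND to_code = ?",
        (from_code, to_code),
    ).fetchone()
    return row[0] if row else None

# The daily update: no new file, no trip through UAT, no redeployment.
update_rate("USD", "CAD", 1.3)
```

The application picks up the new rate on its next call to get_rate, and nobody on the Deployment Team has to move a file.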

Try and Catch

Please note that there is an error with this entry. See Rule #1 — Redone for details.

Exceptions are powerful tools that give the developer the ability to understand what is going on with their application, particularly when there is a problem. Sadly, many programmers do not use this feature, or implement it so poorly as to provide no meaningful information.

Rule #1: If you catch an exception, you better record it or re-throw it.

What does this mean? Take a look at the following code:

Catch ex As Exception
    ' Nothing here: the exception is caught and then silently discarded.
End Try

What this code does is catch the exception, then let it get away. It’s like putting cheese on a mouse trap, but then gluing down the trigger so that the mouse can get away. I mean, seriously, what are you thinking? At the least, at the very least, you should have the following:

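Catch ex As Exception
    ' Pass the exception up to the caller. A bare Throw keeps the original
    ' stack trace, whereas Throw ex would reset it to this line.
    Throw
End Try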

This at least re-throws the exception so that something above you can make a decision as to what needs to be done. If I had my druthers, however, it would be more like this:

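Catch ex As NotImplementedException
    ' Wrap the original exception so the caller also sees what was being attempted.
    Throw New Exception("Blow_Up3 does not implement that functionality", ex)
Catch ex As Exception
    ' Anything else: add context and keep the original as the inner exception.
    Throw New Exception("Unforeseen error in Blow_Up3 trying to access Don's bank account", ex)
End Try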

There is so much more information available to the method that initially gets the exception than the calling method has that it would be a shame to ignore that information and make life more complicated for everyone.

Soccer Referee

Paraphrasing is a lost art in the IT world, but it is an art that really needs to be emphasized more. When I was younger I was helping a friend referee a soccer match. (Football to you foreigners.) He wanted me to be a linesman and he told me that “When the red team kicks the ball out of bounds I want you to point the flag in the direction that the blue team will be moving when they get the ball on the throw in.” This made perfect nonsense to me as that seemed much too complicated, so I paraphrased it: “You mean, point the flag at the team that kicked the ball out?” This confused him for a moment as he struggled to reconcile what I had repeated back to him with what he had told me, but he gradually agreed that the impact would be the same.

Sometimes when we write up specifications for an application we are too deep in the details and too aware of the intent, but not fully aware of the impact. We need to step back, take a look at what we have said or written, and see if we can rephrase it to make it simpler, yet still retain the same meaning. I do this quite often when writing these one minute comments. You should see some of the stuff that I write and throw out. (Then again, you have seen the stuff that I’ve gone ahead and sent out.) For instance, I’m currently writing this note because one on testing just doesn’t make any sense when viewed from outside the original context that most of the readers will not have.

The same thing is true of specifications. Not everyone reading the specification is going to have the same background as you or is operating with the same context. Not everyone is going to be an expert in the business area involved. (Or the subtleties of being a soccer referee.) What you write for a specification needs to be easy to understand, even for those that are unfamiliar with the business process. If it isn’t easy to understand then you need to step back, clear your mind, and try again. If it is hard for someone who knows the business to write the specifications, imagine how hard it is for someone not familiar with the application to understand what you have just written.