As transactional data volumes increase, system architecture must stay flexible and be able to scale accordingly.
Back in September, the London Stock Exchange experienced a significant interruption when a proprietary system built on Microsoft technology went offline. Few details were shared, but I eventually cobbled together a rough explanation of what happened.
The stock exchange's system hung due to a "coincidence" (whatever that means) that stopped data from processing. What appears to have happened is that several Windows processes, including message processing, crashed at the same time due to a configuration glitch. Because the applications were so tightly tied to Windows, the failure affected everything instead of just one component.
I spoke on the phone with Craig Hayman, vice president of IBM's WebSphere, discussing how open standards and design principles allow for more robust system architecture. Craig explained that the stock exchange incident was likely a result of being too dependent on a myopic structure rather than relying on a three-tier architecture that's been proven to scale.
It feels a bit old-school to talk about three-tier architectures in this day of Ruby apps built in 15 minutes, but the fact is you need separation and best-of-breed components when you are dealing with large transaction volumes and variable peaks.
The stock exchange's data volume was not at a system-overload level. Rather, the architectural pattern likely didn't match the principal goal of the system--to deliver high-integrity transactions through varying peaks and valleys.
Volumes are not steady state, and systems have to be prepared to handle the variety of high points, which is part of why we look to scale systems linearly. This is not a feasible approach when you are using built-in Windows components not running in separate tiers.
Note that the problem is not that the applications run on Windows, or on .NET specifically, but the scaling issue that comes from having to throw entire machines, instead of components, at the problem.
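To make the machines-versus-components point concrete, here is a rough back-of-the-envelope sketch in Python. The tier names, the per-instance capacities and the peak load are all hypothetical numbers chosen for illustration, not figures from the exchange, and the model assumes a machine running every component is limited by its slowest component.

```python
import math


def instances_needed(peak_load, capacity_per_instance):
    """Smallest number of instances that can absorb the peak load."""
    return math.ceil(peak_load / capacity_per_instance)


def tiered_plan(peak_load, tier_capacity):
    """Separate tiers: each component is scaled independently."""
    return {
        tier: instances_needed(peak_load, capacity)
        for tier, capacity in tier_capacity.items()
    }


def monolith_plan(peak_load, tier_capacity):
    """All components on one machine: assume the slowest component sets
    the machine's throughput, so whole machines must be added."""
    bottleneck = min(tier_capacity.values())
    return instances_needed(peak_load, bottleneck)


# Hypothetical transactions per second that one instance of each tier handles.
capacity = {"web": 5000, "messaging": 1000, "database": 2500}
peak = 10000  # hypothetical peak load, transactions per second

tiers = tiered_plan(peak, capacity)
machines = monolith_plan(peak, capacity)

print("Tiered:", tiers, "total component instances:", sum(tiers.values()))
print("Monolith:", machines, "machines,",
      machines * len(capacity), "component copies")
```

Under these made-up numbers, the tiered design adds capacity only where the bottleneck is, while the all-in-one design replicates every component each time it adds a machine.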
On fixing such problems, Hayman's first suggestion is not to listen to salespeople, and not to try to replace the technology with a mythical new future solution that is untested and still built into the operating system. (That goes for Microsoft as well as IBM salespeople.) Instead, there are vast resources available to help developers and architects follow smart design principles that adhere to open standards and allow for ready scale.
One possible way to address a system like the London Stock Exchange's would be an Extreme Transaction Processing (XTP) approach, which surrounds the database with a distributed cache and so adds elasticity to it. This is a common use case for Memcached, an open-source, high-performance, distributed memory object caching system used to speed up Web applications such as those that run Facebook and Twitter.
Ultimately, IT shops have legacy investments and want to leverage them to deliver new value. You can't just throw everything away; you need to be able to connect old and new systems using open standards that help maintain transaction integrity and allow for fine-grained control.
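For readers who want to see what "surrounding the database with a distributed cache" can look like, here is a minimal Python sketch of the common cache-aside pattern. It is an assumption-laden illustration, not the exchange's design or Memcached's actual client API: the classes DistributedCache and SlowDatabase, the functions get_quote and update_quote, the node names and the "ABC" symbol are all hypothetical, and the cache nodes are plain in-process dictionaries standing in for separate cache servers.

```python
import hashlib


class CacheNode:
    """One in-memory cache server (a stand-in for a single cache instance)."""

    def __init__(self, name):
        self.name = name
        self.store = {}


class DistributedCache:
    """Spreads keys across nodes by hashing the key (simple modulo sharding;
    real clients often use consistent hashing instead)."""

    def __init__(self, node_names):
        self.nodes = [CacheNode(name) for name in node_names]

    def _node_for(self, key):
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def get(self, key):
        return self._node_for(key).store.get(key)

    def set(self, key, value):
        self._node_for(key).store[key] = value

    def delete(self, key):
        self._node_for(key).store.pop(key, None)


class SlowDatabase:
    """Hypothetical system of record; counts reads to show the cache's effect."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self.rows.get(key)

    def write(self, key, value):
        self.rows[key] = value


def get_quote(symbol, cache, db):
    """Cache-aside read: try the cache first, fall back to the database."""
    value = cache.get(symbol)
    if value is None:
        value = db.read(symbol)
        if value is not None:
            cache.set(symbol, value)
    return value


def update_quote(symbol, price, cache, db):
    """Write to the database, then invalidate the cached copy."""
    db.write(symbol, price)
    cache.delete(symbol)


db = SlowDatabase({"ABC": 141.5})  # hypothetical symbol and price
cache = DistributedCache(["cache-a", "cache-b", "cache-c"])

first = get_quote("ABC", cache, db)   # miss: reads the database
second = get_quote("ABC", cache, db)  # hit: served from the cache
reads_after_two_gets = db.reads

update_quote("ABC", 142.0, cache, db)
third = get_quote("ABC", cache, db)   # miss again after invalidation
```

The elasticity comes from the cache tier: repeated reads are absorbed by cache nodes, which can be added or removed without touching the database itself.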
Follow me on Twitter @daveofdoom.