Escape from ETL Hell: The Real Time Enterprise


Five 9's--the universal symbol of reliability in high-tech. Five 9's represents 99.999% uptime, or just a little over five minutes of allowed downtime per year. Achieving five 9's isn't easy. Operations organizations that manage it do so, in part, by carefully instrumenting and monitoring systems using software like HP OpenView and IBM's Tivoli. These kinds of systems aren't cheap, but operations managers know that you can't manage what you can't measure, and with only five minutes to spare in any given year, they'd better have that information in real time.
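The five-minute figure is easy to check. The short Python sketch below works out the yearly downtime budget for a few availability levels; the function name and the choice of a 365-day year are assumptions made for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an uptime percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for label, pct in [("three 9's", 99.9), ("four 9's", 99.99), ("five 9's", 99.999)]:
    minutes = allowed_downtime_minutes(pct)
    print(f"{label} ({pct}%): {minutes:.2f} minutes of downtime per year")
```

Each additional 9 cuts the budget by a factor of ten: roughly 526 minutes at three 9's, 53 at four, and a little over 5 at five.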

Technical operations managers may have pioneered the ideas behind real-time information access and alerting, but business managers are learning that they can play this game as well. Good companies have long had accounting practices and other metrics in place to give them insight into how their business is operating. Great companies often make use of data warehouses to gather extensive information about every aspect of their business and allow users to query the data and create reports that can tell them just about anything they want to know--as long as it happened a few days ago.

That's the problem. Data warehouses are trapped by what we might call ETL hell. ETL stands for "extract, transform, and load." ETL is necessary because the data resides on myriad systems in multiple formats. The data has to be extracted from all those systems, transformed into a common schema, and then loaded into the data warehouse. For some data, this process can take days.

A real-time enterprise is free from the batch-process nature of data warehouses. Instead, data flows in real time from the systems of interest, through translation services, and through a rules engine that can be configured to look for specific trends, coincident events, and other interesting activity. Managers can watch dashboards of this data as it changes and be automatically alerted to unusual conditions, and those alerts can be automatically escalated to other managers when they're not handled in a timely manner. Systems people have had these kinds of tools for years, but the business side of the house is just beginning to see the benefit.
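To make the idea concrete, here is a minimal Python sketch of such a rules engine: events are evaluated one at a time as they arrive, matching rules raise alerts, and alerts that go unacknowledged for too long are escalated. Every name in it (Event, Rule, Alert, RulesEngine, the "hourly_sales" metric, the thresholds) is hypothetical and chosen for illustration, the translation step is left out, and a production system would be considerably more involved.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Event:
    source: str       # hypothetical originating system
    metric: str       # hypothetical metric name
    value: float
    timestamp: float  # seconds since some reference point


@dataclass
class Rule:
    name: str
    condition: Callable[[Event], bool]  # returns True when the event is unusual
    escalate_after: float               # seconds an alert may sit unacknowledged


@dataclass
class Alert:
    rule: str
    raised_at: float
    acknowledged: bool = False
    escalated: bool = False


class RulesEngine:
    def __init__(self, rules: List[Rule]) -> None:
        self.rules = rules
        self.alerts: List[Alert] = []

    def process(self, event: Event) -> List[Alert]:
        """Evaluate one event as it arrives; raise an alert per matching rule."""
        raised = [
            Alert(rule=rule.name, raised_at=event.timestamp)
            for rule in self.rules
            if rule.condition(event)
        ]
        self.alerts.extend(raised)
        return raised

    def escalate_overdue(self, now: float) -> List[Alert]:
        """Escalate alerts left unacknowledged past their rule's time limit."""
        limits = {rule.name: rule.escalate_after for rule in self.rules}
        overdue = []
        for alert in self.alerts:
            waited = now - alert.raised_at
            if not alert.acknowledged and not alert.escalated and waited >= limits[alert.rule]:
                alert.escalated = True
                overdue.append(alert)
        return overdue


# Usage: one assumed rule that fires when hourly sales drop below 10.
engine = RulesEngine([
    Rule(
        name="low-hourly-sales",
        condition=lambda e: e.metric == "hourly_sales" and e.value < 10,
        escalate_after=900,  # escalate after 15 minutes without acknowledgment
    )
])
engine.process(Event("mortgage-sales", "hourly_sales", 25, timestamp=0))    # no alert
engine.process(Event("mortgage-sales", "hourly_sales", 4, timestamp=3600))  # raises an alert
escalated = engine.escalate_overdue(now=3600 + 900)
print([alert.rule for alert in escalated])
```

The point of the sketch is the shape of the flow, not the details: nothing waits for a nightly batch, and the escalation check can run on a timer alongside the incoming events.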

Imagine that you work for a large bank. Your job is to manage mortgage sales. In a good company, you see sales figures, broken down by region and mortgage type, for the last three months and cumulative for the year. In a great company, all this data is in a data warehouse and you can query this information, and almost anything else you can think of, on yesterday's sales. In a real-time company, you see a slump in the last hour's sales of 15-year mortgages and an increase in 5-year adjustable-rate mortgages and can adjust your strategy for reselling the paper this evening instead of tomorrow afternoon.

Of course, real-time data isn't a panacea. Most people have heard about Cisco's much-vaunted real-time sales system overpredicting sales in 2001 because, back in the go-go days, customers had gotten into the habit of double ordering to ensure they got something. Operations managers have learned that the key to solving these problems is managing process and holding frequent postmortem evaluations of mistakes and reporting errors so they can be avoided the next time.

These systems aren't just for executive management. To be effective, the company has to be prepared to deliver real-time information to each worker just when they need it. Duy Beck of the Virtua Group called this the "virtual network of demand." Getting work done in any large organization is a function of workflow (formal or informal). Workflow gets things from one person to another in the right order and at the right time, so that each person can act on them and send them on.

Another way to think of this is to see every person as a little bit of production capacity with their own supply chain and demand chain. Together, these internal supply and demand chains form a virtual network of demand. Getting business done requires finding ways to efficiently and effectively service this network and keep it flowing.

From an IT perspective, when we install CRM systems, ERP systems, employee portals, workflow systems, personal computers, office suites, and the like, we're trying to service and automate this demand network. The problem is that we can't yet approach it by treating each employee as a custom unit with specific needs that arise from their role, their style of work, the way they learn, the way they're most comfortable communicating, and so on. We more or less give everyone a standard set of tools and require them to do their own customization.

Real-time enterprises are enabled by lightweight integration tools, collaboration software, wireless and mobile computing, Web services standards like SOAP and XML, peer-to-peer computing, and even instant messaging. As these technologies and others like them find their way into the enterprise, real-time business decision-making systems will become easier and easier to build and deploy. Nimble companies will deploy them to avoid information latency and the loss of competitive advantage that comes with it.