The Internet is one of those true revolutions that happen only occasionally in human history. Thanks to the Internet, the cost of communicating is undergoing a freefall to dramatically lower levels. Both Web surfing and email are cheap new forms of communication. As in previous such revolutions, our society is responding to these new opportunities. Many more people are communicating, and many more people are becoming literate in this new Internet medium.
At least four previous communication revolutions have had a similar impact. The introduction of the printing press in the mid-1400s first made knowledge widely accessible outside the church. The Penny Post of 1840 in England dropped the cost of mailing a letter from 20 pennies or more to only one penny anywhere in England. This action is widely credited with stimulating universal literacy in England. The volume of letter writing went up by an order of magnitude and letter writing became a national craze. Sound familiar?
The availability of the dial tone across America in the early 1900s knit together our families, businesses, and communities. Now our society simply couldn't function without the telephone.
The television also probably deserves to be on the list of communication revolutions, although it is a one-way medium. It has certainly homogenized our society.
Like these other media, the Internet is sending all of us some powerful communications about the medium itself. One of the Internet's messages is that there is a lot of useful information to be found out there. If I need to find out anything about computer hardware or software, travel information, health information, hobbies, or institutions of higher learning, I now instinctively turn on my Web browser and head out to surf. I am sure the Web will give me better information faster than any other medium I can think of.
Another message from the Internet medium is Alta Vista (or Excite or Yahoo). To be more accurate, the message isn't Alta Vista itself; it's the Little Box we type into. The message from that Little Box is: Anyone can get useful results from typing into the Little Box and pressing return. Millions of people have developed a kind of confidence in using computers because of this "absurdly" simple user interface. In retrospect, there is nothing absurd about the Little Box. It is a great lesson in how making communication cheaper and easier dramatically broadens our base of users. The Little Box in Alta Vista isn't a temporary phenomenon; it is a permanent icon of the Internet.
Finally, the Little Box in Alta Vista runs on absolutely every computer, maybe even on things that don't look like computers. I expect that the next microwave oven for my kitchen will let me run the Little Box so I can look up a recipe on the Internet.
Which brings me to data warehouse query tools: Data warehouse query tools are caught in the Internet whirlpool. After all, the query tool's job is to be the user's window onto the remote source of data called the data warehouse. Traditionally, a query tool has been a complex and somewhat expensive piece of software sitting on the user's workstation. Query tools are tricky to configure correctly within the surrounding applications and challenging to use. Most query tools have lots of special commands and lots of little subwindows. I can only have a query tool on a certain kind of computer. This doesn't sound like the Little Box in Alta Vista.
Query tool companies are in the process of losing their direct end-user franchises. The value of a query tool now must appear through an industry-standard browser such as Netscape Navigator or Microsoft Internet Explorer, and in many cases, the interaction with the query tool will have to be mediated by a Little Box. IT shops are breathing big sighs of relief that they do not have to configure the actual end-user machines anymore. All the IT shop must do is provide the query facility on a Net-enabled server and let the myriad end users dial in from their 386s, 486s, Pentiums, Macintoshes, and microwave ovens.
Most of the query tool companies are responding to the whirlpool. The most obvious and significant change in their behavior so far is their transformation to server-based companies selling directly to the IT organization. A query tool company must now plan to sell $10,000 to $50,000 server licenses to the IT organization. Selling expensive server licenses to IT is a very different game from selling query tools to end-user organizations. Selling a server license may require a dedicated sales force to call on the customer and may require a "solutions" selling strategy where each deal takes several months to close. Sales and marketing teams in query tool companies are being reorganized to meet the Internet challenge.
The query tool companies are being forced to alter their technology in order to become Web-enabled server products capable of supporting hundreds of simultaneous users. Even if the query tool companies do nothing else, they must strip off their user interfaces so that they can be controlled through remote browsers. They also need to invest more deeply in multithreaded, multiprocessing internal architectures more appropriate to big server implementations. However, even exporting their user interfaces to remote browsers won't be enough. The message from the Little Box is: Make it even simpler. The average Internet data warehouse user expects the query tool to be as simple as the other Internet tools. I want to browse a million recipes from my microwave and determine which one is the cheapest to implement (cook) -- and I won't have a mouse.
The query tool companies are destined to become not only query server companies but three-tier architecture companies. As the remote browser user interfaces become simpler, the query server companies will have to embed more and more power in their middle layer. As I have remarked in this column, the layer below the query tool (the DBMS server) also suffers from being very simple. SQL has trouble expressing even the simplest business application. Thus the query server must sit between two dumb standardized entities, the Internet browser and the relational DBMS, that are both incapable of representing a business application. This middle query server layer will increasingly need to support application objects, application plans, and application intermediate results. Query server companies will compete on how powerful their middle layer is.
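To make the three-tier idea concrete, here is a minimal sketch of a middle layer sitting between a Little Box and a relational DBMS. Everything in it is an assumption made for illustration: the QueryServer class, the APPLICATION_OBJECTS table, the sales table, and the sample rows are hypothetical, and an in-memory SQLite database stands in for the data warehouse. It is not any vendor's actual design.

```python
import sqlite3

# Hypothetical "application objects": named business questions the middle
# layer knows how to answer. The names and SQL are illustrative assumptions.
APPLICATION_OBJECTS = {
    "sales by product": (
        "SELECT product, SUM(amount) AS total "
        "FROM sales GROUP BY product ORDER BY total DESC"
    ),
    "sales by region": (
        "SELECT region, SUM(amount) AS total "
        "FROM sales GROUP BY region ORDER BY total DESC"
    ),
}


class QueryServer:
    """Middle tier between a simple browser box and a relational DBMS."""

    def __init__(self, connection):
        self.connection = connection
        # Answers already computed, kept so repeat requests skip the DBMS.
        self.intermediate_results = {}

    def ask(self, little_box_text):
        """Answer a plain-text request typed into the Little Box."""
        # Normalize case and whitespace so the user need not be exact.
        request = " ".join(little_box_text.lower().split())
        if request not in APPLICATION_OBJECTS:
            raise ValueError(f"unknown request: {little_box_text!r}")
        if request not in self.intermediate_results:
            sql = APPLICATION_OBJECTS[request]
            rows = self.connection.execute(sql).fetchall()
            self.intermediate_results[request] = rows
        return self.intermediate_results[request]


def build_demo_warehouse():
    """Create a tiny in-memory stand-in for the data warehouse."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE sales (product TEXT, region TEXT, amount REAL)"
    )
    connection.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("widget", "east", 100.0),
            ("widget", "west", 50.0),
            ("gadget", "east", 75.0),
        ],
    )
    connection.commit()
    return connection


if __name__ == "__main__":
    server = QueryServer(build_demo_warehouse())
    print(server.ask("Sales by Product"))
```

The browser side only ever sends a line of text, and the DBMS side only ever sees SQL. The knowledge of what a business question means, and the memory of answers already computed, lives entirely in the middle layer.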
As query tool companies recognize that they are server software companies, they will begin to see other opportunities. There are a lot of opportunities to address on the data warehouse server. Query tool companies can increase the size of their product "bundle" by performing valuable back-room activities in addition to the traditional front-room querying and analysis. Query tool companies will broaden their scope to include data extract, cleaning, archiving, data combining, key generation, integrity checking, scheduling, DBMS loading, index building, aggregate building, performance monitoring, setting security profiles, and performing quality assurance.
The entry of query tool companies into the broader data warehouse software market will temporarily add to the confusion of vendors and products, but this market change is bound to have some very beneficial effects, which are overdue, in my estimation. Competition in the data warehouse back room is already heating up. Prices are dropping and tools are beginning to compete more effectively on usability and price. The number of data extract vendors listed on Larry Greenfield's wonderful Web site has grown from a couple of dozen to six full pages of vendors and products just in the last year. (You can find Larry's Data Warehouse Information Center at pwp.starnetinc.com/larryg/index.html.)
I believe that the industry needs fewer $250,000 monolithic data extract systems and lots more $5,000 and $10,000 modular data extract utilities. The Internet and the entry of many of the query tool companies into the server arena appear to be stimulating exactly that sort of product evolution.
I also hope that as query tool companies successfully expand the scope of their product lines, they will whittle away at the serious metadata problem. Most data warehouse teams are drowning in blobs of metadata from separate, incompatible tools. If the query tool companies can begin to unify the overall task of getting the data all the way from legacy systems to the desktops of end users, then they may each be able to put an envelope around the metadata required to support this overall data pipeline. Another strategy, of course, is to suggest that the data extract companies venture into the front room, to carry their extracted data all the way to the end user's desktop. But somehow this seems like a big stretch for most of those companies, especially the ones with a mainframe mentality.
The trouble with prognostications such as these, of course, is that they never turn out to be quite as simple as they seem. Something new and creative always seems to change things a little. For example, the founders of Brio Technology (makers of the BrioQuery tool) suggested to me that the scenario I've described in this article might turn out differently for them. They believe that the software footprint of their tool is so tight and efficient that they can actually transform their entire tool into a downloadable Internet applet. This would enable Brio in its entirety to sneak out onto the desktop of the end user while the other "heavy" query tools would have to stay marooned on the centralized server. Maybe Brio can pull this off and maybe it can't. Maybe the other vendors will respond by implementing lightweight applet versions of their tools as well. The Internet whirlpool is only going to spin faster.