
Windows NT was born from the ashes of Microsoft's involvement in OS/2. Microsoft's team, led by Dave Cutler (of VMS fame), built what is currently the only mainstream operating system developed this decade. According to Microsoft, Windows NT cost $400 million to develop. The original release of Windows NT (3.1) contained about four million lines of code, and the current release of Windows NT (4.0) has gained a little weight, coming in at around 16.5 million lines of code. And at Microsoft's Professional Developers Conference in San Diego last September, Jim Allchin revealed that the beta of NT 5.0 now contains 27 million lines of code. Microsoft is currently investing $1 billion, or about half of its R&D budget, in Windows NT.
Windows NT was designed as a general-purpose operating system for the desktop user, workgroup server, departmental server, and enterprise server -- a tall order indeed.
Since its release in 1993, Windows NT has made enormous inroads in the high-end desktop market (at the expense of OS/2), in the workgroup market (at the expense of Netware), and in the departmental, Internet, and intranet server market (at the expense of Unix). Now, with the upcoming release of Windows NT5, Microsoft wants the enterprise server market as well.
Windows NT for the mainframe! It sounds like heresy, but given fail-safe transaction processing, fault tolerance, performance improvements, support for 64-bit technology, linear SMP scalability, and clustering, this goal may not be unrealistic for subsequent evolutions of Windows NT. Windows NT has succeeded at the desktop, workgroup, and departmental level because of the availability of cost-effective hardware and software for the Windows NT platform. For Windows NT to continue this success, it needs to avoid the monopolistic overtones of Wintel; namely, it needs to offer more non-Intel hardware options, not fewer (rest in peace, PowerPC and MIPS), and more third-party software.
All the leading database vendors now support Windows NT: Oracle with its Workgroup Server and Enterprise Server offerings; Sybase with Sybase Adaptive Server; Informix Software with Informix-OnLine Workgroup Server, Informix-OnLine Dynamic Server for NT, and Informix Universal Server for NT; IBM with DB2 Universal Database for Windows NT; Computer Associates with CA-OpenIngres; and NCR with its Teradata RDBMS. The release schedules of the DBMS products show an interesting trend: The NT releases are coming earlier and earlier on the porting and release schedules. In fact, Oracle8 on Windows NT is not a port of the Unix product but was developed and released simultaneously on both Windows NT and Unix. (See Robin Schumacher's article for more detail on Oracle8.)
What makes one operating system more appropriate than another for running a database? What does an operating system do for a database? This article examines these questions and considers the strengths and weaknesses of Windows NT as a database platform.
Fundamentally, operating systems provide abstractions built on the services provided by hardware and software (BIOS). There are two cumulative levels of abstraction available to Windows NT developers: the Win32 environment subsystem available through the Win32 API, and application services available through application APIs (such as Oracle Call Interface), COM (such as Oracle Objects for OLE), and DCOM.
In the Windows NT architecture, the Win32 environment subsystem calls system services provided by the Windows NT Executive. The Windows NT Executive runs in Kernel mode, which is a highly privileged mode of operation in which the code has direct access to all hardware and all memory. The Windows NT Executive provides operating system services such as processes, threads, files, memory, and sockets. The Win32 API includes more than 1,500 functions. This API is functionally richer than the Unix system call environment, but the learning curve is steep given the number of functions. This learning curve has given rise to the popularity of object-oriented libraries above the API, Microsoft Foundation Classes being the preeminent example. The abstraction level of these libraries comes at the cost of some performance.
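To give a sense of the API's low-level flavor, here is a minimal sketch of creating and writing a file through raw Win32 calls. The file name is illustrative only; a class library such as MFC's CFile wraps these same calls behind a simpler constructor and Write() call, which is where the convenience (and the overhead) of the higher abstraction comes from.

```c
/* A taste of the raw Win32 API: every detail is an explicit parameter. */
#include <windows.h>

int main(void)
{
    HANDLE file = CreateFileA("example.log",          /* illustrative file name */
                              GENERIC_WRITE,          /* desired access         */
                              0,                      /* no sharing             */
                              NULL,                   /* default security       */
                              CREATE_ALWAYS,          /* overwrite if it exists */
                              FILE_ATTRIBUTE_NORMAL,  /* no special flags       */
                              NULL);                  /* no template file       */
    if (file == INVALID_HANDLE_VALUE)
        return 1;

    DWORD written = 0;
    WriteFile(file, "hello", 5, &written, NULL);
    CloseHandle(file);
    return 0;
}
```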
The Win32 environment subsystem and Windows NT applications run in user mode. User mode is a less-privileged processor mode with no direct access to hardware. Code running in user mode can directly access only its own address space. A plethora of Microsoft and third-party APIs are available, as well as an increasing number of COM, OLE Automation, and ActiveX objects that Windows NT programmers can use when building applications. No other operating system can compete with this volume of reusable components.
Traditionally, applications were split into multiple processes, and some form of interprocess communication (IPC) was used to communicate among the processes. Process structure was hierarchical, with a parent process creating a number of child processes. NT breaks away from this hierarchical process structure in favor of a multithreaded process structure. A multithreaded process has more than one thread of control sharing both address space and resources. Using threads within a single process eliminates the need for IPC among those threads and reduces context-switching overhead. Threads can be scheduled concurrently on multiple processors. Unlike Unix, Windows NT was designed from the ground up to support multithreading. Threads are an addition to the Unix operating system, and as with much of Unix, there is not one standard threading model. The general task of implementing threads is complex, but using threads on Windows NT is considerably simpler than on Unix. The introduction of Microsoft Transaction Server (MTS) dramatically reduces the complexity of using threads. MTS provides a framework for apartment threading, where individual instances of the same class can run on different threads. Developers can take advantage of this framework without explicitly creating threads in their code.
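As a simple illustration, the following minimal sketch creates several worker threads with CreateThread. Because the threads share the process address space, they can update a common counter without any IPC; the counter and batch size are illustrative values, not drawn from any particular database product.

```c
#include <windows.h>
#include <stdio.h>

static LONG g_rowsProcessed = 0;      /* shared by all threads in the process */

DWORD WINAPI Worker(LPVOID param)
{
    int batch = (int)(INT_PTR)param;
    for (int i = 0; i < batch; i++)
        InterlockedIncrement(&g_rowsProcessed);   /* atomic update, no IPC */
    return 0;
}

int main(void)
{
    HANDLE threads[4];
    for (int i = 0; i < 4; i++)
        threads[i] = CreateThread(NULL, 0, Worker, (LPVOID)(INT_PTR)1000, 0, NULL);

    WaitForMultipleObjects(4, threads, TRUE, INFINITE);   /* wait for all workers */
    for (int i = 0; i < 4; i++)
        CloseHandle(threads[i]);

    printf("rows processed: %ld\n", g_rowsProcessed);
    return 0;
}
```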
Windows NT provides a rich process-management environment, and unlike Windows 95 and Netware, it includes true preemptive multitasking of processes. An executing process can be cleanly terminated, and its priority can be adjusted to ensure responsiveness to user requests.
I would like to see support in Windows NT for the operating system to distribute processes and threads seamlessly across any machine on the network. Given this enhancement, a network of machines could operate as one virtual machine. In this virtual machine, you would achieve scalability by adding more machines to the network, and you would achieve reliability because of the redundancy between machines.
Databases, and in particular, data warehouses, are memory-intensive applications. The more data that can be loaded into memory, the faster the response times. A process in Windows NT4 can currently access up to 2GB of memory, which can be a combination of physical RAM and pagefile-backed virtual memory. The recent release of Windows NT Server 4.0 Enterprise Edition allows a process to access 3GB. The early 1998 release of Windows NT 5 promises to provide a process with 32GB. The 64-bit versions of Unix are currently ahead of Windows NT in terms of the amount of memory a process can access: A process in Digital Unix running on an Alpha server can access up to 28GB of RAM. Microsoft appears to be waiting for the Intel "Merced" chip before it will provide a full 64-bit version of Windows NT. I expect to see a full 64-bit version of NT in 1999.
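The sketch below shows how a server process typically manages that per-process address space with the Win32 virtual-memory API: a large buffer-cache region is reserved up front with VirtualAlloc, and pages are committed as the cache grows. The region sizes are arbitrary, illustrative values.

```c
#include <windows.h>
#include <stdio.h>

int main(void)
{
    const SIZE_T reserveBytes = 512 * 1024 * 1024;   /* 512MB of address space      */
    const SIZE_T commitBytes  =  64 * 1024 * 1024;   /* back 64MB with storage now  */

    /* Reserve address space without consuming physical memory or pagefile. */
    void *cache = VirtualAlloc(NULL, reserveBytes, MEM_RESERVE, PAGE_NOACCESS);
    if (cache == NULL) {
        printf("reserve failed: %lu\n", GetLastError());
        return 1;
    }

    /* Commit the first portion; the rest can be committed on demand. */
    if (VirtualAlloc(cache, commitBytes, MEM_COMMIT, PAGE_READWRITE) == NULL) {
        printf("commit failed: %lu\n", GetLastError());
        return 1;
    }

    /* ... use the committed pages as a buffer cache ... */

    VirtualFree(cache, 0, MEM_RELEASE);   /* release the whole reservation */
    return 0;
}
```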
I would like support in Windows NT for distributed memory, whereby a process can use the memory of any machine on the network as if it were its own. This enhancement would improve scalability. If a memory reference could be prefixed with the name of the machine holding that memory, a process would theoretically be able to access all the memory resources of the machines on the network.
Databases, and in particular, online transaction processing (OLTP) systems, are I/O intensive: the faster the I/O, the better. Bypassing operating system buffering and accessing data on the raw physical device directly has always been popular for OLTP, especially on Unix. Windows NT supports this raw access to devices, but such raw access always comes at a cost in ease of administration. Windows NT boosts I/O performance with asynchronous I/O, where the originator of an I/O request can continue executing rather than waiting for the request to complete.
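A minimal sketch of asynchronous (overlapped) I/O with the Win32 API follows. The data file name is hypothetical; bypassing the file-system cache as well would additionally require FILE_FLAG_NO_BUFFERING with sector-aligned buffers, which is omitted here for brevity.

```c
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HANDLE file = CreateFileA("datafile.dbf", GENERIC_READ, FILE_SHARE_READ, NULL,
                              OPEN_EXISTING, FILE_FLAG_OVERLAPPED, NULL);
    if (file == INVALID_HANDLE_VALUE) {
        printf("open failed\n");
        return 1;
    }

    static char page[8192];                 /* one illustrative database page */
    OVERLAPPED ov = {0};
    ov.Offset = 0;                          /* read the first page of the file */
    ov.hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);

    /* Issue the read; ERROR_IO_PENDING means it is proceeding asynchronously. */
    if (!ReadFile(file, page, sizeof(page), NULL, &ov) &&
        GetLastError() != ERROR_IO_PENDING) {
        printf("read failed\n");
        return 1;
    }

    /* ... other useful work happens here while the I/O is in flight ... */

    DWORD bytesRead = 0;
    GetOverlappedResult(file, &ov, &bytesRead, TRUE);   /* wait for completion */
    printf("read %lu bytes\n", bytesRead);

    CloseHandle(ov.hEvent);
    CloseHandle(file);
    return 0;
}
```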
Windows NT supports fast disk arrays in RAID 0, 1, and 5 configurations. As a low-cost alternative to hardware RAID, NT also provides software RAID.
I would like to see Windows NT offer a distributed lock manager to prevent contention when multiple machines share the same disk array. Oracle implemented its own distributed lock manager at the application level on Solaris -- this service should be supported at the operating system level.
Database servers use sockets to provide the plumbing in client/server computing. Windows NT uses Windows Sockets for network programming. Windows Sockets is an open standard (controlled by the Winsock Group) with a public specification based on U.C. Berkeley Unix sockets. Windows Sockets provides a rich interface for low-level data transfer in applications. The Windows NT networking architecture was designed to be transport protocol-independent so that Windows NT could coexist in a Windows for Workgroups (NetBEUI), Macintosh (AppleTalk), Netware (IPX/SPX), or Unix (TCP/IP) environment. Because of this architecture, Windows Sockets are also transport independent: NetBIOS sockets, SPX sockets, and TCP/IP sockets are all accessed through the same WSock32 DLL.
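Here is a minimal Windows Sockets sketch: a TCP client resolving a host name and connecting to a database listener. The host name "dbserver" and port 1521 (commonly used by Oracle's SQL*Net listener) are illustrative assumptions, not part of any particular product's documented interface.

```c
#include <winsock.h>
#include <string.h>
#include <stdio.h>
#pragma comment(lib, "wsock32.lib")   /* the WSock32 DLL mentioned above */

int main(void)
{
    WSADATA wsa;
    if (WSAStartup(MAKEWORD(1, 1), &wsa) != 0)
        return 1;

    SOCKET s = socket(AF_INET, SOCK_STREAM, 0);

    struct hostent *host = gethostbyname("dbserver");   /* hypothetical host */
    if (host == NULL) {
        printf("host not found\n");
        return 1;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port   = htons(1521);
    memcpy(&addr.sin_addr, host->h_addr, host->h_length);

    if (connect(s, (struct sockaddr *)&addr, sizeof(addr)) == 0)
        printf("connected to database listener\n");

    closesocket(s);
    WSACleanup();
    return 0;
}
```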
Windows NT offers two cumulative levels of abstraction on top of sockets: Remote Procedure Calls (RPC) and DCOM. Both RPC and DCOM can use named pipes, NetBIOS, or Windows Sockets to communicate with remote systems. Much of the original design work for an RPC facility was started by Sun Microsystems Inc. This work was continued by the Open Software Foundation (OSF) as part of its overall Distributed Computing Environment (DCE) standard. The Microsoft RPC facility is compatible with the OSF/DCE-standard RPC. DCOM is a higher level of abstraction and supports interobject communication over RPC.
The Common Object Request Broker Architecture (CORBA) from the Object Management Group (OMG) is a competing standard to DCOM for distributed object computing. OMG is an industry coalition with more than 600 members, including Sun, Oracle, IBM, and Netscape. The CORBA-versus-DCOM debate raises a fundamental question: What is the most effective way to set industry standards?
Microsoft tends to implement technology that becomes a de facto standard (witness the Windows desktop). More recently, Microsoft has been investing in publishing its key technology as standards (ActiveX, COM, DCOM). Unix and CORBA are based on standards, but vendors try to differentiate themselves through enhancements to those standards; vendor implementations are therefore often not compatible. Microsoft, with its monopolistic authority, owns its standards and can ensure that implementations remain compatible.
So what is better, a monopoly player whose technology is interoperable or a coalition whose intentions are sound but whose technology often does not interoperate?
Directory services provide a single point of administration for all resources, including files, peripheral devices, host connections, databases, users, security services, and network resources. (For more information on directory services, see David Linthicum's article "Finding Your Way" in the November 1997 issue of DBMS.) Traditionally, applications, including databases, implemented their own directory system. When new employees join an organization, they need accounts set up on the LAN, email, and various databases; this fragmentation of directory services creates an enormous administrative overhead. Windows NT has not been a trailblazer in directory services when compared to X.500, Netware Directory Services (NDS), and Banyan Systems Inc.'s StreetTalk. Microsoft intends to jump-start Windows NT into the enterprise-capable directory arena with Windows NT5. NT5's Active Directory is built on top of the Domain Name System (DNS) and Lightweight Directory Access Protocol (LDAP) open standards. With LDAP, expect to be able to find objects in a relational database and assign users access to these objects.
With an integrated directory service, users will be able to log onto the network once and then access all the applications and databases to which they have been granted access. With integrated directory services, developers would no longer develop application-specific security models. Instead, they would embed calls to the integrated directory service. This would reduce both the development time for building applications and the administrative overhead of keeping applications running.
As with the adoption of standards-based directory services, Microsoft is moving away from proprietary technology by integrating the leading security standards for authentication and cryptography into NT.
Microsoft plans to include an implementation of MIT's Kerberos in Windows NT5. Kerberos will replace the existing NT LAN Manager (NTLM) protocols. The major functional enhancement is mutual authentication, in which the server must prove its identity to the client just as the client proves its identity to the server before services can be accessed.
Microsoft has also developed a cryptography framework called CryptoAPI. CryptoAPI provides a common Win32 API to cryptographic service providers: developers write to this API, and it calls whichever cryptographic service provider is installed. The type of cryptography used can be changed at runtime by installing a different cryptographic service. In addition to providing a high degree of flexibility, CryptoAPI also provides a solution to cryptography export problems.
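The sketch below shows the flavor of CryptoAPI: it acquires a context from the default installed provider and hashes a password. The password value is purely illustrative, and a real application would choose its algorithms and key handling far more carefully.

```c
#include <windows.h>
#include <wincrypt.h>
#include <stdio.h>
#pragma comment(lib, "advapi32.lib")

int main(void)
{
    HCRYPTPROV prov = 0;
    HCRYPTHASH hash = 0;
    BYTE digest[16];                      /* MD5 produces a 16-byte digest */
    DWORD digestLen = sizeof(digest);
    const char *secret = "changeit";      /* illustrative value only */

    /* Acquire a context from whatever provider is installed. */
    if (!CryptAcquireContext(&prov, NULL, NULL, PROV_RSA_FULL, CRYPT_VERIFYCONTEXT))
        return 1;

    if (CryptCreateHash(prov, CALG_MD5, 0, 0, &hash) &&
        CryptHashData(hash, (const BYTE *)secret, lstrlenA(secret), 0) &&
        CryptGetHashParam(hash, HP_HASHVAL, digest, &digestLen, 0)) {
        for (DWORD i = 0; i < digestLen; i++)
            printf("%02x", digest[i]);
        printf("\n");
    }

    if (hash) CryptDestroyHash(hash);
    CryptReleaseContext(prov, 0);
    return 0;
}
```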
These security enhancements should make Windows NT a candidate for providing secure transactions over the Internet and private networks. Database vendors are likely to take advantage of these new features in their NT products by replacing their own authentication with Windows NT authentication and by using the CryptoAPI to provide secure transactions.
The performance of databases needs to be monitored to determine system bottlenecks and to fine-tune system and application performance. NT provides APIs for exposing and collecting performance information, including the Performance Data Helper (PDH) interface for reading counter data, and performance data can be monitored with the Performance Monitor utility. Using these APIs, developers can add performance objects and counters (alongside the existing NT Executive objects and counters) to help tune performance while developing and debugging. After the application is complete and installed on target systems, the counters can help system administrators adjust configurable settings in the application. Performance Monitor integrates the monitoring of operating system and application performance. Oracle's Workgroup Server and Enterprise Server and Microsoft's SQL Server all use the performance monitoring APIs to record performance information.
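As a minimal sketch of the consumer side, the following reads a built-in counter through the PDH interface -- the same data Performance Monitor displays. The database-specific counter name mentioned in the comment is hypothetical, included only to show where a vendor's own counters would appear.

```c
#include <windows.h>
#include <pdh.h>
#include <stdio.h>
#pragma comment(lib, "pdh.lib")

int main(void)
{
    PDH_HQUERY query = NULL;
    PDH_HCOUNTER counter = NULL;
    PDH_FMT_COUNTERVALUE value;

    if (PdhOpenQuery(NULL, 0, &query) != ERROR_SUCCESS)
        return 1;

    /* Built-in NT counter; a DBMS might expose, say, a hypothetical
       "\\OracleSrv\\BufferCacheHitRatio" counter through the same mechanism. */
    PdhAddCounterA(query, "\\Processor(_Total)\\% Processor Time", 0, &counter);

    PdhCollectQueryData(query);        /* rate counters need two samples */
    Sleep(1000);
    PdhCollectQueryData(query);

    if (PdhGetFormattedCounterValue(counter, PDH_FMT_DOUBLE, NULL, &value)
        == ERROR_SUCCESS)
        printf("CPU busy: %.1f%%\n", value.doubleValue);

    PdhCloseQuery(query);
    return 0;
}
```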
NT5 expands integrated performance monitoring with Microsoft Management Console (MMC), previously code-named "Slate." MMC provides a console framework for management applications. MMC hosts administration tools as MMC Snap-Ins. Microsoft plans to make all existing Microsoft administration tools available as MMC Snap-Ins. A Snap-In SDK will also be made available for ISVs. The goal of the MMC initiative is to bring all the management tasks an administrator performs into a single integrated console. This goal is laudable and should reduce administration costs considerably -- if ISVs toe the line, that is.
One of the popular selling points of Windows NT is the similarity of its interface to Windows 3.x and Windows 95, reducing the learning curve for NT users, developers, and administrators. One thing I have always liked about Unix is its multiuser facilities and the ease of remote administration that these facilities support. Unix machines can be remotely administered over a low-bandwidth connection using telnet and command-line utilities. With more bandwidth, X Window sessions can be run on the Unix host from a super-thin client. To perform most administration tasks on Windows NT, you must have physical access to the console -- a significant restriction in distributed networks. Microsoft plans to add Windows-based terminal support to Windows NT as part of the "Hydra" initiative. Hydra uses multiuser technology licensed from Citrix Systems Inc. Using Hydra, a Windows-based terminal launches and runs Windows applications from the server. Microsoft is incorporating this technology into NT5. Another useful piece of technology on the horizon is the Windows Scripting Host (WSH), which will provide a shell scripting language based on Visual Basic within the operating system. The lack of a competitor to the Unix shell language in Windows NT has been an annoying omission.
The Win32 API provides functions that let you log information about events, which can be used to diagnose problems after they have occurred. The events can be viewed using the Event Viewer utility. Windows NT itself uses this API to log hardware problems (a bad disk sector), resource problems (low memory), and auditing information (a user logs on, a user accesses a file, and so on). Developers can also use this API to record event information about their applications. For example, a database application could record each user logging on, when a database is opened, and any errors encountered, such as corruption in the database. This information can be valuable to a support person (or to the developer of the application).
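A minimal sketch of application event logging follows. The source name "MyDatabaseApp", the event ID, and the message text are illustrative; a production application would also register a message file so that Event Viewer can format the entry properly.

```c
#include <windows.h>
#pragma comment(lib, "advapi32.lib")

int main(void)
{
    /* NULL server name means the local machine's event log. */
    HANDLE log = RegisterEventSourceA(NULL, "MyDatabaseApp");
    if (log == NULL)
        return 1;

    const char *strings[1] = { "Corruption detected in tablespace USERS" };
    ReportEventA(log,
                 EVENTLOG_ERROR_TYPE,   /* severity shown in Event Viewer   */
                 0,                     /* category                         */
                 1000,                  /* application-defined event ID     */
                 NULL,                  /* no user SID                      */
                 1, 0,                  /* one insertion string, no raw data */
                 strings, NULL);

    DeregisterEventSource(log);
    return 0;
}
```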
Scalability refers to the capacity of a system to perform better as you spend more money on it. NT provides an excellent scalability range -- up to a point. You can start small with a desktop machine and grow big using a symmetric multiprocessing (SMP) machine. Windows NT doesn't compete at the really big hardware end; this is where IBM and (increasingly) Sun are focused. In my experience, big hardware is a niche market, because only 20 percent of applications need really big hardware. NT is eating away at the other 80 percent of applications before it tackles that 20 percent.
Theoretically, Windows NT can support SMP with up to 32 processors. The license for the shrink-wrapped version of Windows NT is limited to four processors, but the license for the soon-to-be-released Windows NT Enterprise Edition raises the limit to eight. Windows NT has been criticized for its lack of linear scalability beyond four processors. Currently, if a vendor wishes to support more than four processors, it must develop its own Hardware Abstraction Layer (HAL). The lack of linear scalability is not necessarily a technical limitation. The majority of vendors supplying SMP environments also supply Unix alternatives, so it is not in their best interest to invest in the development of HALs for lower-margin NT SMP machines.
An alternative to the big-box scalability approach is clustering. Windows NT is moving toward the clustering approach. Expect to see clusters of SMP machines running Windows NT, providing linear scalability and fault tolerance in the next 12 months. These SMP clusters will balance workload across the cluster and will scale by adding low-cost nodes to the cluster. There are currently some proprietary Windows NT clustering solutions available from Digital and Compaq. The Microsoft "Wolfpack" initiative provides an open clustering API for Windows NT upon which ISVs can build cluster-aware applications. The first release of Wolfpack technology will be with the Windows NT Enterprise edition, which will support failover in a two-node cluster; a multinode load-balanced cluster will be available in 1998.
Using scalable platforms for databases is highly desirable. Typically, databases grow in a linear fashion over time. Return on investment is protected and disruption minimized if the hardware supporting a database can grow with the database without having to move to a different platform.
The next 12 to 18 months will be very interesting for Windows NT. Windows NT4 Enterprise Edition was recently released. Windows NT5 is currently scheduled for early 1998. Windows NT is a viable platform for workgroup and departmental databases today, and the future releases should make NT a viable platform for enterprise databases. To help open up this market, Microsoft is lining up the VAR channels that traditionally provided the big hardware and professional services for enterprise databases.
The Orwellian prospect of Windows NT everywhere is rather like a world with only one automobile manufacturer selling only one car. Will the rest of the industry be reduced to manufacturing Windows NT accessories? A lack of competition may lead to the Word 97 phenomenon: feature saturation and a lack of focus and innovation. This could open up the opportunity for a new generation of lean and mean distributed operating systems.
I expect the feature saturation of Windows NT to result in the development of a database so integrated with Windows NT that it becomes indistinguishable from the operating system. Microsoft's goal is to raise the following question: How can you justify buying a third-party Web server, transaction-processing monitor, email server, or database when one is included with the operating system? Assimilation of competition through feature saturation may be key.