ArsDigita Archives
 
 
   
 

Scalability, Three-Tiered Architectures, and Application Servers

by Philip Greenspun (philg@mit.edu)

ArsDigita : ArsDigita Systems Journal : One article


"Men are most apt to believe what they least understand."
-- Michel de Montaigne
Application servers for Web publishing are generally systems that let you write database-backed Web pages in Java.

The first problem with this idea is that Java, because it must be compiled, is usually a bad choice of programming language for Web services (see my book chapter on server-side programming).

The second problem with this idea is that, if what you really want to do is write some Java code that talks to data in your database, you can execute Java right inside of your RDBMS (Oracle 8.1, Informix 9.x). Java executing inside the database server's process is always going to have faster access to table data than Java running as a client. In fact, at least with Oracle on a Unix box, you could bind Port 80 to a program that would call a Java program running in the Oracle RDBMS. You don't even need a Web server, much less an application server. It is possible that you'll get higher performance and easier development by adding a thin-layer Web server like AOLserver or Microsoft's IIS/ASP, but certainly you can't get higher reliability by adding a bunch of extra programs and computers to a system that need only rely on one program and one computer.

This document works through some of these issues in greater detail, pointing out the grievous flaws in Netscape Application Server (formerly "Kiva") and explaining the situations in which Oracle Application Server is useful.

There is no scalability problem

My friend Jin and I spent some spare evenings building http://www.scorecard.org for the Environmental Defense Fund. When a user types in his zip code, the server shows him a map of the factories near his house. Clicking on a factory will list the chemicals released. Clicking on a chemical will list its health effects. The site was featured on ABC World News, in Newsweek, in the New York Times, on CNN, and was a Yahoo Pick of the Week. Every single page on the site is generated on-the-fly by querying a relational database management system (RDBMS). Some pages require five SQL queries. Each page requires at least one. The site gets about 30 requests/second at peaks (on days when traffic is over 500,000 hits). There are only a handful of sites on the Internet that serve a larger number of db-backed pages.

Our hardware for this monstrously popular site? A Sun Microsystems SPARC Ultra 2 pizza box Unix machine, built in 1996. Its dual 167-MHz CPUs would be laughed at by the average Quake-playing 10-year-old. The CPUs sit idle 80% of the time. The disks sit idle most of the time, partly because I spent $4,000 on enough RAM to hold the entire 750 MB data set. Oh yes, the machine also serves a few hundred thousand hits/day for other customers of arsdigita.com and runs the street cleaning and birthday reminder services that we built.

If we tarred up the site and moved it to a mid-range Unix server such as the HP K460 that sits behind http://photo.net, we could probably serve at least 5 million hits/day. If we moved it to the highest-end HP server, I'd bet that we could get close to the 100-million hit/day mark that sites like Yahoo serve.
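To put these figures on a common scale, here is a small back-of-the-envelope calculation in Java. It assumes traffic is spread evenly over a 24-hour day, which real traffic never is, so the averages are only a rough yardstick against the 30 requests/second peak. The class and method names are made up for illustration.

```java
// Back-of-the-envelope conversion between hits/day and requests/second.
// Assumption: load is spread evenly over 86,400 seconds, which understates peaks.
class TrafficMath {
    static final int SECONDS_PER_DAY = 24 * 60 * 60;

    /** Average requests per second for a given number of hits per day. */
    static double averagePerSecond(long hitsPerDay) {
        return (double) hitsPerDay / SECONDS_PER_DAY;
    }

    public static void main(String[] args) {
        long[] dailyHits = {500_000L, 5_000_000L, 100_000_000L};
        for (long hits : dailyHits) {
            System.out.printf("%,d hits/day is about %.1f requests/second on average%n",
                    hits, averagePerSecond(hits));
        }
    }
}
```

On a 500,000-hit day the average works out to roughly 6 requests/second, so the 30 requests/second peak is about five times the average.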

Why has "scalable" become the buzzword du jour? People get burned because they do stupid things. They connect their Web server to their RDBMS via CGI, thus forcing the machine to work 10-20 times as hard for no good reason. They run Windows NT. They run some unproven junkware/middleware that came in an attractive box. Services get wedged and they run out and buy another dozen (or thousand, as with www.microsoft.com) physical computer systems. Now that they have a whole machine room full of hardware, they know that they can't keep it all running simultaneously so they look for software to yoke it all together somehow such that the death of one machine won't be noticed.

How do my friends and I avoid scalability problems? We know that we're stupid. We run the Oracle 8 RDBMS like the rest of the world and don't try to figure out if some new competitor's hype has any relationship to reality. We talk to the RDBMS via AOLserver, which has been doing connection pooling from a Tcl API since 1995. So we get the safety and software development ease of Perl/CGI but the computer never has to fork a CGI process and the database connections are shared among the scripts. We've served roughly 1 billion hits with AOLserver so we're pretty sure that it works. Linux and NT get magazine writers excited, but we run the same commercial versions of Unix on which the Fortune 500 relies for its enterprise computing.
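The idea behind connection pooling can be sketched in a few lines of Java. This is not AOLserver's implementation, just an illustration of the technique: a fixed set of expensive resources is opened once and lent to each request in turn. The Pool class and the strings standing in for database connections are hypothetical.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

// Minimal illustration of connection pooling: resources are created once,
// borrowed by a request, and handed back for the next request to reuse.
class Pool<T> {
    private final BlockingQueue<T> idle;

    Pool(List<T> resources) {
        idle = new ArrayBlockingQueue<>(resources.size(), false, resources);
    }

    /** Borrow a resource, run the work, and always return the resource. */
    <R> R with(Function<T, R> work) {
        T resource;
        try {
            resource = idle.take();    // blocks while every resource is in use
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a connection", e);
        }
        try {
            return work.apply(resource);
        } finally {
            idle.add(resource);        // capacity is never exceeded, so add() cannot fail
        }
    }

    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        // Strings stand in for open database connections.
        Pool<String> pool = new Pool<>(List.of("conn-1", "conn-2"));
        for (int request = 1; request <= 5; request++) {
            final int id = request;
            System.out.println(pool.with(conn -> "request " + id + " served on " + conn));
        }
    }
}
```

Five requests are served with only two "connections" ever opened, which is the whole point: the cost of connecting to the RDBMS is paid at start-up rather than on every page load.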

There is no reliability/availability problem

The Internet is not perfectly reliable. Users can't expect to get IP connectivity from their own internet service provider (ISP). If their ISP is down then they can't get to your site or anyone else's so they generally won't actually bet their life on your server being reachable. The main reason to have high availability is ego. You don't want people to think that you're incompetent, e.g., a friend of mine said he wouldn't buy a ticket from the United Airlines Web site because it was so unreliable at the front end that he figured they'd have screwed up the back end to the point that his reservation would never actually get made.

But what's reasonable and realistic? It will cost you a fortune in extra hardware, software, and administration time to shoot for 24x7x365 uptime. And, in the end, you will never achieve it. Nobody in the history of computing has ever achieved 100% uptime. So would you rather have three Web services that are down for eight scheduled hours/year and eight unscheduled hours or one service that wasn't supposed to ever go down but in fact is unavailable for four hours/year?
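The downtime figures in that question translate into availability percentages as follows. This is a quick Java calculation assuming a 365-day year; the class and method names are invented for the example.

```java
// Converts hours of downtime per year into an availability percentage.
// Assumption: a 365-day year (8,760 hours); leap years are ignored.
class Availability {
    static final double HOURS_PER_YEAR = 365 * 24;

    static double percentUp(double hoursDownPerYear) {
        return 100.0 * (1.0 - hoursDownPerYear / HOURS_PER_YEAR);
    }

    public static void main(String[] args) {
        // 8 scheduled + 8 unscheduled hours, versus 4 unplanned hours.
        System.out.printf("16 hours down/year: %.3f%% available%n", percentUp(16));
        System.out.printf(" 4 hours down/year: %.3f%% available%n", percentUp(4));
    }
}
```

Under these assumptions the gap is roughly 99.82% against 99.95%, a difference few users would notice.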

Why do people think that availability is such a problem? Again, they are mostly applying band-aids to decisions that were risky on the face. If you read a Sun Microsystems marketing brochure, you might be ready to run out and buy a 64-processor E10000. But after watching one of their contract service guys try to fix a desktop Sun system, a wise person would probably think twice about relying on the latest, greatest, and most complex Sun server. Does that mean you can't buy a big fancy machine? Of course not. I'm not nervous about my personal HP K460 with its 36 disk drives on 6 SCSI chains, 4 CPUs, 4 GB of RAM, and 2 network cards. Why not? All the HP service engineers that I've met in the Boston area are wizards on both the hardware and software. How about Windows NT? Do you personally know anyone serving 10 million hits/day, every day, from an NT box parked where they don't have physical access? If not, why risk your service on NT?

Suppose I could get a couple more HP K460s and the HP ServiceGuard software and maybe some round-robin routers, all for free? Would it make my site more reliable? I don't think so. I don't have enough time or money to figure out how to install, configure, and maintain all that stuff. I wouldn't even have time to document the configuration so that if I were on vacation and the site failed, someone else could bring it back up.

The Three-Tiered Architecture

The latest rage in business data processing is the three-tiered architecture. Everyone seems to have one but they don't always agree on what the tiers are. Here is one possible set of tiers:
  1. relational database management system (RDBMS)
  2. business logic (if statements)
  3. presentation layer (formatting)
Web software vendors sometimes try to convince people that the three standard tiers are the following:
  1. relational database management system (RDBMS)
  2. application server
  3. web server
After accepting this idea, then clearly you need to buy an application server. You have Oracle. You have a Web server program. But you have no software at all for one critical tier!

Before trashing the idea of the application server, let me trash the idea of the three-tiered architecture for Web services.

They can afford it but you can't

A business may spend hundreds of millions of dollars/year on its core information systems. They can devote a large professional staff to making each tier fast and reliable. A Web publisher is always short on database administration and system administration talent. Do you have time to install the Oracle replication software? The expertise and time to figure out which of your tables need to be replicated? The expertise and time to configure the Oracle software to replicate transactions on those tables? The procedures and organizational discipline to make sure that the replication strategy gets documented and kept up to date as the data model evolves?

They have time and you don't (or "the three-yeared architecture")

A business will spend several years developing a new application. It may be tested for 6 months or a year before being gradually deployed. A business has design time to carefully spread responsibilities across machines, people, and tiers. A business has testing time to make sure that disaster recovery procedures are documented and verified. A business usually has at least some hours every day when the whole system can be taken off-line and rebuilt.

A Web service usually has to go from conception to launch in less than one year, preferably closer to 6 months. If you can't do that, the idea that seemed so clever will probably already have been done by six other people. It might be nice to break everything up into abstractions, layers, objects, and protocols operating on three redundant tiers. But what if it takes you three years? How can you be sure that the publishing or business needs won't change radically six months after you launch and get some real user experience?

Gentle-slope Programming Systems

The most popular server-side programming systems are those that give developers a gentle slope into Real Software Development. Consider the hugely popular Microsoft ASP system. An HTML document is a legal ASP program. It doesn't compute anything and its output never changes, but it is legal in the IIS/ASP system. When the Web developer gets a little more advanced, he or she can start putting in some magic escape tags and embedding simple Visual Basic code fragments in the ASP documents. Six months later, the developer can't figure out how to meet the site's publishing goals and calls in a programmer. The programmer builds a COM object whose methods the developer can invoke from the embedded VB. When the service isn't running so well on one computer, the publishers call the programmer back to build a DCOM object.

The end result of this process is laughably unreliable since NT isn't truly crash-proof, IIS doesn't work so well, ASP is a bit flakey, and the COM/DCOM stuff isn't reliable or fast. But the process by which the publisher got to this end result is perfectly reasonable.

America Online was enamored of the process but not of the resulting reliability. So they added "ADP" pages to AOLserver (http://www.aolserver.com). An HTML document is a legal ADP program. Magic escape codes allow the developer to insert a bit of Tcl. When the developer needs some powerful encapsulated programs, a programmer can build Tcl modules that are loaded at server start-up. If that isn't sufficient, a programmer can write a shared library in C and make it available to Tcl programs AOLserver-wide. The C code has access to the full range of services on a Unix machine.

In the Apache world, people achieve the same results with a mix of traditional Perl CGI scripts and plug-in modules. The approach that is closest to ASP in spirit is the PHP hypertext preprocessor (see http://www.php.net).

A cleaner system than any of the above is Meta-HTML (http://www.metahtml.com), which extends HTML syntax and semantics into a powerful programming language. This means that the developer isn't forced to constantly bounce between Tcl and HTML syntax, for example.

Curl is one of the most thoughtfully developed gentle-slope programming systems for the Web. It was built by computer systems researchers at MIT and works for both server- and client-side Web programming (see http://curl.lcs.mit.edu/curl/wwwpaper.html). Curl is currently being commercialized by a venture capital-backed company.

Almost all of the implementations of these gentle-slope languages provide the developer with a rapid-prototyping environment. To make a change to a program, the developer need only edit a file in the Unix file system. The next time a URL is requested, the new version of the program is used. Note that the earliest Web server scripting system, Perl/CGI, has this property.

Life with an Application Server (KIVA/Netscape)

Last year, I worked with a group of programmers who were rebuilding a
site that had originally been knocked out by a consultant in a couple of
months.  It was some big, nasty Perl scripts talking to an Oracle RDBMS.
Homely yet functional.

The new team of programmers loved Kiva.  Actually they bristled at the
title "programmer" and would point out that they were in fact "software
engineers." I don't mention that as a point of ridicule, but rather to
illustrate their perspective. They dealt in high-level concepts on
prototypes that remained prototypical right up until the VC funding ran
out.

-- email from a Web technologist 
This e-mail message was about the Netscape Application Server, a product originally named "Kiva Enterprise Server" that Netscape purchased in early 1998.

After you pay $35,000 per CPU, you can add a dynamic page to your Web site by following these easy steps, as outlined by one of my co-developers:

1. write your java code in foo.java

2. compile with:
/usr/local/kds/jdk1.1.5/bin/javac -g -classpath "/usr/local/kds/jdk1.1.5/lib/classes.zip:/usr/local/kds/classes/java/SWING.JAR:/usr/local/kds/classes/java/kfcjdk11.jar:/usr/local/kds/classes/java/kdsjdk11.jar:/usr/local/kds/classes/java/ktjdk11.jar::/usr/local/kds/jdk1.1.5/classes:/usr/local/kds/jdk1.1.5/lib/classes.jar:/usr/local/kds/jdk1.1.5/lib/rt.jar:/usr/local/kds/jdk1.1.5/lib/i18n.jar:/usr/local/kds/jdk1.1.5/lib/classes.zip:/usr/local/kds/jdk1.1.5/classes:/usr/local/kds/jdk1.1.5/lib/classes.jar:/usr/local/kds/jdk1.1.5/lib/rt.jar:/usr/local/kds/jdk1.1.5/lib/i18n.jar:/usr/local/kds/jdk1.1.5/lib/classes.zip::/usr/local/kds/APPS" "/usr/local/kds/APPS/yourappname/foo.java"
3. once successfully compiled, create a GUID with:
/usr/local/kds/bin/kguidgen

4. paste a copy of this GUID into your code.

5. edit yourappname.gxr and add an entry for foo, with the GUID

6. Register the applogic with /usr/local/kds/bin/kreg yourappname.gxr
if you like, or for some reason this doesn't work, you can run kreg
without arguments in interactive mode.

7. you can access the guid at
http://yourserver.com/cgi-bin/gx.cgi/AppLogic+foo

The logs are in /usr/local/kds/log/

I think the kjsdev.log usually has the most interesting information.
If your applogic fails at run time, the browser will return "document
contains no data"
Following these steps, it took us two weeks to port an application that had taken a day to write in AOLserver Tcl. That's not counting the time it took to get some paper manuals FEDEXed because the documentation wasn't available on the Web in HTML format. The 2.0 software was almost laughably unreliable, with minor Java programming errors in a single script capable of bringing all Web services to a halt. But even if the Netscape Application Server had worked as advertised, it would have been an extremely painful development environment. Here's a Tcl string into which we are substituting the values of two variables:
Manage $domain ($pretty_name)
In Kiva's template language, you have
Manage %gx type=cell id=domain%%/gx% (%gx type=cell id=pretty_name%%/gx%)
It turns out that string assembly in any system that uses Java is painful because Java's parser is so weak that you can't have a string literal containing newlines. Consider the simple error message fragment in Tcl:
    append exception_text "<li>Your email address doesn't look right to us.  We need your full
Internet address, something like one of the following:

<code>
<ul>
<li>Joe.Smith@att.com
<li>student73@cs.stateu.edu
<li>francois@unique.fr
</ul>
</code>
"
In Java, that turns into the following rich source of compiler complaints:
 exception_text += "<li>Your email address doesn't look right to us.  We need your full\n"+
"Internet address, something like one of the following:\n\n"+
"<code>\n"+
"<ul>\n"+
"<li>Joe.Smith@att.com\n"+
"<li>student73@cs.stateu.edu\n"+
"<li>francois@unique.fr\n"+
"</ul>\n"+
"</code>\n";
Passing sets of variables from the user's browser through to Oracle is much more painful in Kiva. Here's the AOLserver Tcl script that allows a contest administrator to add a column to the entry table. Note that this is a 7-line program plus two "return a page" statements.
set_form_variables

set db [ns_conn db $conn]

set table_name [database_to_tcl_string $db "select entrants_table_name from contest_domains where domain = '$QQdomain'"]

set alter_sql "alter table $table_name add ($column_actual_name $column_type)"

set insert_sql "insert into contest_extra_columns (domain, column_pretty_name, column_actual_name, column_type)
values
( '$QQdomain', '$QQcolumn_pretty_name', '$QQcolumn_actual_name','$QQcolumn_type')"

if [catch { ns_db dml $db $alter_sql
            ns_db dml $db $insert_sql }  errmsg] {
    # print error message
    # ...
} else {
    # database stuff went OK
    # print confirm page...
}
Note that variables that came from the previous form like $column_actual_name and $QQcolumn_pretty_name (with any apostrophes quoted) are available simply because we called the Tcl procedure set_form_variables. The Kiva code shows just how much pain this Tcl magic was saving us.
package contest;

import java.lang.*;
import java.util.*;
import java.text.*;
import java.io.*;

import com.kivasoft.applogic.*;
import com.kivasoft.types.*;
import com.kivasoft.util.*;
import com.kivasoft.*;

/*GUID {C8CCC3C0-535C-1510-AD41-0800208F129A} */

public class ContestAddCustomColumn2 extends ContestAppLogic
{

  public int execute() {

    // grab variables from the previous form
    String domain = valIn.getValString("domain");
    String column_pretty_name = valIn.getValString("column_pretty_name");
    String column_actual_name = valIn.getValString("column_actual_name");
    String column_type = valIn.getValString("column_type");

    com.kivasoft.IDataConn conn = openDatabase();

    // four lines to replace one line of Tcl (database_to_tcl_string...)
    IQuery domain_info_query = createQuery();
    domain_info_query.setSQL("select entrants_table_name from contest_domains where domain = '"+domain+"'");
    IResultSet domain_info_rs = conn.executeQuery(0, domain_info_query, null, null);
    String entrants_table_name = domain_info_rs.getValueString(domain_info_rs.getColumnOrdinal("entrants_table_name"));

    String alter_sql = "alter table "+entrants_table_name+" add ("+column_actual_name+" "+column_type+")";
    String insert_sql = "insert into contest_extra_columns (domain, column_pretty_name, column_actual_name, column_type) "+
	"values (:domain, :cpn, :can, :ct)";
    
    IQuery alter = createQuery();
    alter.setSQL(alter_sql);
    IResultSet ignore_this_rs = conn.executeQuery(0, alter, null, null);

    IValList insertValList = GX.CreateValList();

    // set some substitution variables; note that Kiva 
    // delivers garbage to Oracle if you include underscores
    // in the bind variable names
    insertValList.setValString(":domain",domain);
    insertValList.setValString(":cpn",column_pretty_name);
    insertValList.setValString(":can",column_actual_name);
    insertValList.setValString(":ct",column_type);

    IQuery insert = createQuery();
    insert.setSQL(insert_sql);
    IPreparedQuery insert_prepared_query = conn.prepareQuery(0, insert, null, null);
    // here we're finally able to do the insert, something that took one
    // line of Tcl
    IResultSet ignore_this_rs_again = insert_prepared_query.execute(0, insertValList, null, null);

    TemplateMapBasic map = new TemplateMapBasic();

    map.putString("system_name",systemName());
    map.putString("system_owner",systemOwner());
    map.putString("alter_sql",alter_sql);
    map.putString("column_actual_name",column_actual_name);
    map.putString("column_pretty_name",column_pretty_name);
    map.putString("entrants_table_name",entrants_table_name);

    // this last bit will substitute all the preceding variables into a
    // template (a file that we separately maintain)
    return evalTemplate(getDocumentRoot()+"ContestAddCustomColumn2.html", (ITemplateData) null, map);
    }
}
Now we have more than 40 lines of Java code. But it is so much more reliable than the old Tcl, isn't it? Actually it is less reliable. My Tcl program checks for errors in executing the database ALTER TABLE and INSERT statements (note for db nerds: these are not bundled together in a transaction because DDL statements such as ALTER TABLE cannot be rolled back). The Java program does not. Why not? I was too worn out from writing all the extra lines and fighting Kiva bugs.
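For readers unfamiliar with the $QQ convention in the Tcl script, the sketch below shows the general idea in Java: every form value is made available under its own name and again under a QQ-prefixed name with apostrophes doubled, the standard way to embed a literal apostrophe in an SQL string. This illustrates the idea only and is not the code behind set_form_variables; the class and method names are hypothetical, as is the sample form value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrates the effect of the "QQ" convention: each form field is exposed
// twice, once raw and once with apostrophes doubled for use inside SQL literals.
class FormVariables {
    /** Doubles every apostrophe so the value can sit between single quotes in SQL. */
    static String quoteForSql(String value) {
        return value.replace("'", "''");
    }

    /** Returns a map holding name -> raw value and QQname -> quoted value. */
    static Map<String, String> bind(Map<String, String> form) {
        Map<String, String> vars = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : form.entrySet()) {
            vars.put(field.getKey(), field.getValue());
            vars.put("QQ" + field.getKey(), quoteForSql(field.getValue()));
        }
        return vars;
    }

    public static void main(String[] args) {
        Map<String, String> form = new LinkedHashMap<>();
        form.put("domain", "joe's contest");   // made-up form input
        Map<String, String> vars = bind(form);
        System.out.println("select entrants_table_name from contest_domains where domain = '"
                + vars.get("QQdomain") + "'");
    }
}
```

One call at the top of a script replaces the per-field boilerplate, which is why the Tcl version stays so short.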

Now that we've covered the day-to-day pain of working with Kiva/Netscape Application Server, let's look at some of its more pervasive shortcomings.

Become a prisoner

I personally find it rather painful to write a database-backed Web service in any compiled language, including Java. However, if I'm going to spend time, money, and effort laboriously coding Java talking to an RDBMS on one side and the Web on the other, it is insane not to be using the standard Java libraries.

If I write a Java program using the Servlet API to talk to the Web and the JDBC API to talk to an RDBMS, then I can run my program without modification on sites backed by the following Web servers: AOLserver, Apache, Lotus Domino, Microsoft IIS, Netscape Enterprise, Sun Java Web Server, Zeus (and about 20 more, according to http://jserv.javasoft.com/products/java-server/servlets/environments.html).

If I write a Java program to the Kiva/Netscape API, then I can run my program on a system with the $35,000/CPU Netscape Application Server. Period.

Half-baked ideas

Suppose you're going to send an RDBMS a bunch of SQL statements of the form
select * from users where user_id = 6752
Everything in the query is always the same except for the number at the end, which will change depending on which user is grabbing a page. In ancient times, the RDBMS vendors decided that the syntax for putting in SQL with "bind variables" should be
select * from users where user_id = :1
Before your program asks the RDBMS to execute this query, it is supposed to tell the RDBMS what the value of ":1" will be. If you have a bunch of bind variables, you end up with ":17" in your SQL and it becomes rather ugly. As far back as I can remember, Oracle at least would let you use bind variables like ":user_id". And if you read The SQL Standard (Date and Darwen 1997; formerly "the red book" but now with a blue cover) you will find bind variables like ":user_id". But Kiva decided, perhaps for superstitious reasons, to write a little translator that would map each symbolic variable name into a numeric name. However, the Kiva documentation doesn't say what acceptable bind variable names are. It turns out that underscore, though a legal character in an SQL column name, a Java variable name, or a Tcl variable name (from which I was adapting the code), does not work in this little ad hoc Kiva subsystem. My program failed. A prospective user would get "document contains no data". The Kiva error log would fill up:
[03/11/98 02:10:23:4] warning: ORCL-048: select * from users where user_id = :1_id
When I changed the bind variable from ":user_id" to ":userid", the page started working again.
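The ":1_id" in that log line is consistent with a translator that stops reading a bind variable name at the first underscore. That is a guess about Kiva's internals, not something its documentation confirms. The Java sketch below shows how such a bug could arise in a name-to-number translator; the class, the method, and both regular expressions are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical translator from symbolic bind variables (:user_id) to numbered
// ones (:1, :2, ...). Only the regular expression differs between the
// careless and the careful version.
class BindTranslator {
    // Careless: a name is letters and digits only, so matching stops at '_'.
    static final Pattern NO_UNDERSCORE = Pattern.compile(":[A-Za-z][A-Za-z0-9]*");
    // Careful: underscores are part of the name.
    static final Pattern WITH_UNDERSCORE = Pattern.compile(":[A-Za-z_][A-Za-z0-9_]*");

    /** Replaces each distinct symbolic name with :1, :2, ... in order of first use. */
    static String translate(String sql, Pattern namePattern) {
        List<String> names = new ArrayList<>();
        Matcher m = namePattern.matcher(sql);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group();
            if (!names.contains(name)) {
                names.add(name);
            }
            m.appendReplacement(out, ":" + (names.indexOf(name) + 1));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String sql = "select * from users where user_id = :user_id";
        System.out.println(translate(sql, NO_UNDERSCORE));
        System.out.println(translate(sql, WITH_UNDERSCORE));
    }
}
```

With the careless pattern the output ends in ":1_id", the same malformed variable that appears in the Kiva log; with the careful pattern it ends in ":1".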

Another half-thought-through idea in Kiva is that data from browser cookies and data from HTML form variables should come in via the same programming interface. Unfortunately, that means if a cookie and a form variable have the same name, you'll only get the value of one of them.
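A minimal Java sketch of why a merged namespace loses data. It assumes nothing about Kiva's actual API; plain maps stand in for the cookie and form inputs, and the names and values are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Merging cookies and form fields into one namespace silently drops a value
// whenever the two sources share a key; keeping them apart does not.
class InputNamespaces {
    /** One shared namespace: form fields overwrite cookies of the same name. */
    static Map<String, String> merged(Map<String, String> cookies, Map<String, String> form) {
        Map<String, String> all = new HashMap<>(cookies);
        all.putAll(form);
        return all;
    }

    public static void main(String[] args) {
        Map<String, String> cookies = Map.of("user_id", "6752");
        Map<String, String> form = Map.of("user_id", "42");

        // The cookie's value can no longer be retrieved from the merged view.
        System.out.println("merged:   user_id=" + merged(cookies, form).get("user_id"));
        // Separate lookups keep both values available.
        System.out.println("separate: cookie=" + cookies.get("user_id")
                + " form=" + form.get("user_id"));
    }
}
```

Which source wins in the merged view is an arbitrary choice of this sketch; the point is that one of the two values always disappears.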

I don't want to go on record as saying that Tcl and Visual Basic are the world's greatest computer languages. However, they are real computer languages whose syntax and semantics are well-understood and thoroughly documented. Ditto for Java, the Servlet API, and the JDBC API. With Kiva/Netscape Application Server, it really isn't clear what the program is even supposed to do.

Why all of this pain is supposed to be worthwhile

Why would anyone suffer with Netscape Application Server? Managers like the brochure because it promises great scalability and reliability. You can have 9 physical servers sharing the load and if one dies, the other 8 continue serving users. This is a wonderful little fantasy world but back in the ugly real world, it turns out that most Web publishers barely have the resources and experience to keep a single central RDBMS running. They certainly aren't capable of maintaining replicated RDBMS installations. If their single RDBMS server gets overloaded, the site will slow down. If their single RDBMS server dies, the site dies. No additional software or hardware can do any more than reduce the site reliability. These publishers would be much better off simply running a lightweight Web server program on top of their RDBMS server computer.

Netscape Application Server isn't just a way to talk to your RDBMS, though. Rather than have the RDBMS manage a user's session state, you can ask the cluster of application servers to do it for you. Because the load-balancing features of the application server mean that a user may be bounced from one physical machine to another, all of the user's session state must be kept simultaneously up to date on every physical machine running the application server. This is the well-known problem of keeping a consistent replicated database of dynamic information. A wise Web publisher would not trust Oracle 8 to do this job. After all, Oracle Corporation only has 20 years of experience in building this kind of system (through 8 versions). They only have $7 billion in revenue and 30,000 employees dedicated to making it reliable. Banks and insurance companies rely on Oracle's expertise. But that's not good enough for your Web site. Instead, you want the 2.0 version of a product built by a small start-up company. The fact that a customer's Java syntax error can crash the entire product shouldn't dissuade you from trusting the Kiva programmers to have solved this difficult problem correctly.

Assuming you believe that the Kiva programmers figured out how to do replication correctly, remember that the servers have to communicate session state amongst themselves over the network. You have to configure and administer this communication. You have to make sure that the open ports used for this communication don't become a security risk.

How is this security risk managed in practice? My friends who run Kiva never got this far. They were never able to get Kiva to run on more than one physical machine at a time. In fact, they had to restrict Kiva on that machine to a single thread. So all of the users of their dynamic Web content must line up single-file. If one of the users were to ask for a page that required Oracle to spend 30 seconds sweeping through some big tables, all of the other users would be staring at blank screens for 30 seconds.
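The effect of a single worker thread can be worked through with a small deterministic simulation in Java. It assumes all requests arrive at the same moment and are assigned in arrival order to whichever worker frees up first. The 0.1-second page views are illustrative; only the 30-second query comes from the scenario above.

```java
import java.util.PriorityQueue;

// Simulates first-come-first-served request handling on a fixed number of workers.
class QueueSimulation {
    /**
     * Returns the completion time of each request, in seconds, when all requests
     * arrive at time 0 and each goes to the earliest available worker.
     */
    static double[] completionTimes(double[] serviceSeconds, int workers) {
        PriorityQueue<Double> freeAt = new PriorityQueue<>();
        for (int i = 0; i < workers; i++) {
            freeAt.add(0.0);
        }
        double[] done = new double[serviceSeconds.length];
        for (int i = 0; i < serviceSeconds.length; i++) {
            double start = freeAt.poll();          // earliest moment a worker is free
            done[i] = start + serviceSeconds[i];
            freeAt.add(done[i]);
        }
        return done;
    }

    public static void main(String[] args) {
        // One slow 30-second report followed by three 0.1-second page views.
        double[] requests = {30.0, 0.1, 0.1, 0.1};
        for (int workers : new int[] {1, 4}) {
            double[] done = completionTimes(requests, workers);
            System.out.printf("%d worker(s): last quick page finishes at %.1f s%n",
                    workers, done[done.length - 1]);
        }
    }
}
```

With one worker the last quick page finishes at about 30.3 seconds; with four workers it finishes at 0.1 seconds while the slow query runs alongside.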

Am I the only one?

Here's what Timothy Dyck had to say in the June 22, 1998 edition of PC Week:
Once, after nearly an hour of continuous 100 percent CPU utilization, every Sun Java VM crashed so badly we couldn't even kill them using the Solaris "kill -9" command. With the Java VMs continuing to use nearly 100 percent of the server's CPU resources, we had to reboot even the (normally) exceptionally stable Solaris system.

The bottom line is that even a few crashes per hour aren't acceptable for mission-critical applications. After all the tests, we faced an inescapable fact: Java VMs and Java database drivers just aren't ready for the demands of high-load, production environments.

The major Java problems we observed under both Sapphire/Web and Netscape Application Server were crashes in Java threading code (exceptions in java.lang.thread), which we solved by keeping thread counts below five per virtual machine; unstable Oracle JDBC (Java Database Connectivity) drivers, particularly when handling date and time values (which we mostly solved by getting Oracle Corp.'s as yet not publicly posted 8.0.4.2.0 JDBC drivers and sticking with Oracle's type 4 all-Java drivers, which proved more stable and virtually the same speed as the company's type 2 mixed Java/C drivers); and large memory leaks in the JavaSoft VMs that caused them to continuously consume RAM during the test interval (which we solved by equipping all servers with at least 1GB of RAM each and manually restarting all the Java VMs between each test run).

Despite having support from vendor experts, Dyck was unable to ever support more than 10 simultaneous users using Java pages in Netscape Application Server. What kind of a machine was he using? Just a little SPARC E4000 with 8 CPUs... (this is a $200,000 box)

Something nice about Kiva/NAS

One potentially useful implication of using Netscape Application Server is that there is a hard separation between people who do design (build templates) and those who write programs to query the database (build Java AppLogics). Generally people can only argue about things that they understand. Thus a bunch of business executives will approve a $1 billion nuclear power plant without any debate but argue for hours about whether the company should spend $500 on a party. They've all thrown parties before so they have an idea that maybe paper cups can be purchased for less than $2.75/100. But none of them have any personal experience purchasing cooling towers.

Analogously, if you show a super-hairy transactional Web service to a bunch of folks in positions of power at a big company, they aren't going to say "I think you would get better linguistics performance if you kept a denormalized copy of the data in the Unix file system and indexed it with PLS". Nor are they likely to opine that "You should probably upgrade to Solaris 2.6 because Sun rewrote the TCP stack to better handle 100+ simultaneous threads." Neither will they look at your SQL queries and say "you could clean this up by using the Oracle tree extensions; look up CONNECT BY in the manual."

What they will do is say "I think that page should be a lighter shade of mauve". In a Kiva-backed site, this can be done by a graphic designer editing a template and the Java programmer need never be aware of the change. Of course, the graphic designer will have to be fairly formally minded and understand that chunks of the form %gx type=cell id=domain%%/gx% must be preserved.

Do you need to pay $35,000/CPU to get this kind of separation of "business logic" from presentation? No. You can download AOLserver for free and send your staff the following:

To: Web Developers

I want you to put all the SQL queries into Tcl functions that get loaded
at server start-up time.  The graphic designers are to build ADP pages
that call a Tcl procedure which will set a bunch of local variables with
values from the database.  They are then to stick <%=$variable_name%> in
the ADP page wherever they want one of the variables to appear.

Alternatively, write .tcl scripts that implement the business logic and,
after stuffing a bunch of local vars, call ns_adp_parse to drag in 
the ADP created by the graphic designer.
Personally I find ADP syntax (copied by the AOLserver guys from Microsoft's IIS/ASP) to be cleaner than Kiva's. Furthermore, if there are parts of your Web site that don't have elaborate presentation, e.g., admin pages, you can just have the programmers code them up using standard AOLserver .tcl or .adp style (where the queries are mixed in with HTML).
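The division of labor in the memo can be illustrated with a toy template filler in Java. It is not ADP or ns_adp_parse, only a sketch of the idea: programmers supply a map of values, designers own the markup, and placeholders of the form <%=$name%> are the contract between them. The class name and the decision to leave unknown placeholders untouched are assumptions of this example.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy template filler: replaces <%=$name%> with values supplied by the programmer.
class TinyTemplate {
    private static final Pattern PLACEHOLDER = Pattern.compile("<%=\\$(\\w+)%>");

    static String fill(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Unknown names are left as-is so a designer's typo stays visible.
            String value = values.getOrDefault(m.group(1), m.group());
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String page = "<h2>Manage <%=$domain%> (<%=$pretty_name%>)</h2>";
        System.out.println(fill(page, Map.of("domain", "photo", "pretty_name", "Photo Contest")));
    }
}
```

The designer can restyle everything around the placeholders without touching the code that produces the values, which is all the separation most sites need.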

Similarly, if you've got Windows NT you can just use Active Server Pages with a similar directive to the developers: put the SQL queries in a COM object and call it at the top of an ASP page; then reference the values returned within the HTML. Again, you save $35,000/CPU.

Finally, you can use the Web standards to separate content from presentation. With the 4.x browsers, it is possible to pack a surprising amount of design into a cascading style sheet. If your graphic designers are satisfied with this level of power, your dynamic pages can pump out rat-simple HTML.

From my own experience, some kind of templating discipline is useful on about 25% of the pages in a typical transactional site. Which 25%? The pages that are viewed by the public (and hence get extensively designed and redesigned) and also require at least a screen or two of procedural language (e.g., Tcl) statements or SQL. Certainly templating is merely an annoyance when building admin pages, which are almost always plain text. Maybe that's why I find Kiva so annoying; there are usually a roughly equal number of admin and user pages on the sites that I build.

The Bottom Line on Kiva

Assuming you can muster the programming resources and time necessary to build an operational site in the Kiva API, does it matter that a Netscape Application Server site won't be the ultimate in reliability or performance? Perhaps not. Search engines such as AltaVista don't bother to index URLs of the form "http://yourserver.com/cgi-bin/gx.cgi/AppLogic+foo". So perhaps nobody will see your site.

One special case where an application server might be useful

If you have a huge investment in applications developed in Oracle Forms or Oracle Developer/2000, then you might want to look into using Oracle Application Server (OAS). In theory you can push a button in Oracle Forms and have your bad old client/server system turned into a shiny new (albeit clunky) Web application served via OAS. It might be nice for intranets.

Unlike Kiva/NAS, OAS is essentially a stateless system, thus eliminating a whole class of potential bugs.

Summary

The Web is fairly young. However, that is not a reason to abandon sound engineering principles. You don't want a complicated untested system sitting between your users and your database. You don't want to be forced into developing dynamic documents with a language requiring compilation.

Maybe you're having second thoughts at this point. You're going to be doing lots of transactions. Your idea is so brilliant that you're going to have half the world using your site. Maybe a simple Web service layer on top of a proven RDBMS is good enough for Greenspun and his pathetic 500,000-hit-per-day personal site, but you're building something serious here.

Two words of advice: "America Online".

As of this writing (July 1998), AOL has 11 million members. They have a bunch of Unix machines running a replicated Sybase. They interface this replicated Sybase to the Web via a very simple stateless program: AOLserver.

November 1999 Update

More than a year has elapsed since I wrote this article. In the world of broken computer systems, this is scarcely long enough to fix a bug. In the world of Internet business, though, this is enough time for America Online to have bought Netscape and forged an alliance with Sun. Did this result in a glorious flowering of software genius? Enough to enable Netscape's programmers to turn their application server into something useful? Well, let's check out what the snazzy new webcenters.netscape.com site is running...

telnet webcenters.netscape.com 80
Trying...
Connected to webcenters.netscape.com.
Escape character is '^]'.
HEAD / HTTP/1.0

HTTP/1.0 302 Found
Location: http://home.netscape.com
MIME-Version: 1.0
Date: Thu, 04 Nov 1999 06:23:02 GMT
Server: NaviServer/2.0 AOLserver/2.3.3
Content-Type: text/html
Content-Length: 319
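
The same check can be scripted. What follows is a minimal sketch in Python, assuming a plain HTTP/1.0 server on port 80; the function names fetch_head and server_header are hypothetical, and the 1999 host above may no longer answer, so substitute a host of your own.

```python
import socket


def fetch_head(host, port=80, timeout=10):
    """Send 'HEAD / HTTP/1.0' to host and return the raw response as text."""
    request = f"HEAD / HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("iso-8859-1")


def server_header(raw_response):
    """Return the value of the Server header, or None if there isn't one."""
    text = raw_response.replace("\r\n", "\n")
    head = text.split("\n\n", 1)[0]  # headers end at the first blank line
    for line in head.split("\n")[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "server":
            return value.strip()
    return None


if __name__ == "__main__":
    # Requires network access; "example.com" is a placeholder host.
    print(server_header(fetch_head("example.com")))
```

Fed the response shown in the transcript, server_header returns the string after "Server:", which is all the telnet session was after.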

In explaining some of this mess to MIT students in our one-semester course in software engineering for Web applications, it occurred to me that I never explained how three-tiered architectures and application servers came to be useful in large corporations to begin with.

Imagine that a large bank has a checking account system built in the 1960s on an IBM mainframe. They also have a Visa card system built in the 1980s on a Unix machine running the Informix relational database management system. They have a call-center management system built in the 1990s on a Unix machine running the Oracle 7.3 RDBMS. Suppose that the bank wants to offer the following service: Charlie Consumer calls up the 800 number and wants to pay his Visa bill directly from his checking account.

The bank's IT staff is extremely risk-averse and doesn't want anyone to touch the guts of any of the three existing systems. So they hire a C programmer to write a custom program that will talk to the call center's Oracle database to figure out which consumers want to transfer money, then talk to the Informix- and mainframe-based systems to actually move the funds. This was called an application server because it contained the logic necessary to enable the new application, i.e., moving money from checking to credit card.

If the bank were starting from scratch would they build things this way? Of course not! They'd put everything into one big Oracle database on a Unix machine and they wouldn't need application servers. The checking-to-credit transfer could be accomplished with a tiny PL/SQL or Java program running inside Oracle.
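
To give a feel for how small that program is when both accounts live in one database, here is a sketch in Python with SQLite standing in for Oracle (it is not PL/SQL, and it runs as a client rather than inside the database). The table names, column names, amounts in cents, and the function pay_visa_from_checking are all assumptions made up for illustration.

```python
import sqlite3


def make_bank():
    """Create a hypothetical one-database bank with a single customer."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE checking (customer TEXT PRIMARY KEY, "
        "balance INTEGER CHECK (balance >= 0))"
    )
    db.execute(
        "CREATE TABLE visa (customer TEXT PRIMARY KEY, "
        "amount_owed INTEGER CHECK (amount_owed >= 0))"
    )
    db.execute("INSERT INTO checking VALUES ('charlie', 50000)")
    db.execute("INSERT INTO visa VALUES ('charlie', 12000)")
    db.commit()
    return db


def pay_visa_from_checking(db, customer, cents):
    """Move money from checking to the Visa account in one transaction.

    Returns True on success. Returns False, leaving both tables untouched,
    if the customer is unknown or either CHECK constraint would be violated.
    """
    try:
        with db:  # commits on success, rolls back if an exception escapes
            debit = db.execute(
                "UPDATE checking SET balance = balance - ? WHERE customer = ?",
                (cents, customer),
            )
            credit = db.execute(
                "UPDATE visa SET amount_owed = amount_owed - ? WHERE customer = ?",
                (cents, customer),
            )
            if debit.rowcount != 1 or credit.rowcount != 1:
                raise sqlite3.IntegrityError("unknown customer")
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    bank = make_bank()
    print(pay_visa_from_checking(bank, "charlie", 12000))
```

Because both updates happen inside a single transaction in a single database, there is no window in which the money has left checking but not yet arrived at Visa. That guarantee is exactly what the bank's custom C program has to work hard to approximate across three separate systems.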

The strange thing is that most Web service operators are in exactly the position that all the world's large companies would love to be in. The Web services have all their relevant info collected in one big relational database management system. So they don't have the pernicious systems integration problems that Fortune 500 companies do. Yet apparently at least a few of the technology folks at Web startups are sufficiently confused that they buy willingly into the same kind of computer systems complexity that the Fortune 500 are trying desperately to escape.
