The Story of Forks
April 10, 2000
Apache, as you may know, is a very popular web server. So popular,
in fact, that as of March 2000 Apache is believed to power some 60%
of web sites on the Internet -- and, thank goodness for open source,
it's free to boot. What an age to be alive!
A web server would be
an extremely simple thing if your site only ever attracted a single
visitor at a time. With 6 billion people on this planet, that's
rather unlikely. Instead, the web server must juggle and serve
a number of suitors simultaneously, not unlike a harried waitress
scurrying between restaurant patrons. Web servers in general employ
one of several schemes for handling incoming requests, some schemes
more efficient than others. Apache, in its current 1.x incarnation,
is what they call a pre-forking server. This does not mean
Apache is older than silverware ("the time before forks").
Rather, it means that the parent Apache process "spawns"
(like a demon) a number of child processes that lie in wait
anticipating an incoming connection. When a request comes in, and
one child is busy, another child handles that request. If all the
children are busy, Apache may birth more children depending on the
server's configuration, or -- once the maximum number of children
has been reached -- additional requests are queued and the end user must
wait for the next available child process.
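
The pre-forking idea can be sketched in a few lines of Perl. This is
not Apache's code, only the same pattern in miniature: the parent opens
one listening socket, forks its children before any request arrives,
and every child waits in accept() on the inherited socket. The pool
size, the per-child request limit, and the loopback address are all
assumptions chosen so that the demo finishes by itself; the parent
also plays the part of the visitors.

```perl
#!/usr/bin/perl
# Pre-forking in miniature (an illustrative sketch, not Apache's implementation).
use strict;
use warnings;
use IO::Socket::INET;

my $CHILDREN           = 3;   # hypothetical pool size
my $REQUESTS_PER_CHILD = 2;   # hypothetical: each child retires after this many

# The parent opens the listening socket once; children inherit it across fork().
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,           # 0 asks the OS for any free port
    Proto     => 'tcp',
    Listen    => 10,
) or die "cannot listen: $!";
my $port = $listener->sockport;

# Spawn the children *before* any request arrives: that is the "pre" in pre-forking.
my @pids;
for (1 .. $CHILDREN) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {                          # child process
        for (1 .. $REQUESTS_PER_CHILD) {
            my $client  = $listener->accept or die "accept failed: $!";
            my $request = <$client>;          # read one request line
            print $client "child $$ handled: $request";
            close $client;
        }
        exit 0;
    }
    push @pids, $pid;                         # parent remembers its children
}

# The parent now plays the visitors, sending one request at a time.
my @replies;
for my $n (1 .. $CHILDREN * $REQUESTS_PER_CHILD) {
    my $sock = IO::Socket::INET->new(
        PeerAddr => '127.0.0.1',
        PeerPort => $port,
        Proto    => 'tcp',
    ) or die "cannot connect: $!";
    print $sock "request $n\n";
    my $reply = <$sock>;
    push @replies, $reply;
    print $reply;
    close $sock;
}

waitpid($_, 0) for @pids;                     # reap the children
```

Which child answers which request is up to the operating system, so
the process IDs in the output will differ from run to run. A real
server would also grow and shrink the pool with demand, which this
sketch leaves out.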
Each child spawned takes up space and resources -- namely, memory
and possibly processing time (depending on what it's doing).
Ideally, Apache keeps just enough children alive to handle incoming
requests. If additional children must be spawned to handle a surge
of requests, Apache will ruthlessly kill them once the surge has
passed, lest they lie around forever idle, simply consuming
resources. The world of
Apache is a brutal place.
How does all of this relate to Perl? A connection arrives at an
Apache child process and requests, for example, a CGI script. The
CGI script runs as a process external to the Apache child, which
means that the child must
fork a new process to launch the CGI script. In the case of
a CGI coded in Perl, the Perl interpreter must be launched, since
a Perl script is not compiled ahead of time into a standalone
executable. The interpreter is launched as a separate process,
compiles and executes the Perl code, and
returns the results to the Apache child, who then passes them
along to the visitor. Works great, except for two problems: it's
slow, since the Perl script has to be re-interpreted every time it
is run, and it consumes even more memory, because the Perl
interpreter must be launched for each execution of the Perl script.
The above describes your standard garden variety CGI environment.
For sites with low traffic and/or low processing demands, CGI is
easy to implement and the costs are still reasonable (keep in mind
that "slow" in computer terms is still very, very fast
in human terms). Where the CGI model begins to break down is with
sites that must process more than a handful of simultaneous requests
for Perl scripts, particularly when those scripts perform a variety of activities
such as database queries. A web site with these needs will quickly
become bogged down by the sheer inefficiency of CGI, wasting
memory and leaving visitors frustrated with noticeable wait times.
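
To get a rough feel for the cost of launching an interpreter on every
request, the following sketch times the same small piece of work two
ways: once by starting a fresh perl process per run, as a CGI request
would, and once by calling a subroutine in an interpreter that is
already running. The run count and the workload are arbitrary
assumptions, handle_request is a made-up name, and the timings will
vary from machine to machine, so treat the numbers as an illustration
rather than a benchmark.

```perl
#!/usr/bin/perl
# Rough comparison: fresh interpreter per request vs. one long-lived interpreter.
use strict;
use warnings;
use Time::HiRes qw(time);

my $RUNS = 20;    # hypothetical number of "requests"

# The work each request performs (a stand-in for a real script).
sub handle_request {
    my $sum = 0;
    $sum += $_ for 1 .. 1000;
    return $sum;
}

# CGI style: launch a brand-new perl process for every request.
# $^X holds the path of the perl binary running this script.
my $start = time;
for (1 .. $RUNS) {
    system($^X, '-e', 'my $sum = 0; $sum += $_ for 1 .. 1000;') == 0
        or die "could not launch perl: $?";
}
my $cgi_style = time - $start;

# Embedded style: the interpreter is already running and the code is compiled.
$start = time;
handle_request() for 1 .. $RUNS;
my $in_process = time - $start;

printf "new interpreter per request: %.4f s for %d runs\n", $cgi_style,  $RUNS;
printf "already-running interpreter: %.4f s for %d runs\n", $in_process, $RUNS;
```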
Enter the Hero
One sunny (or cloudy, we just don't know) day, a bright fellow
named Doug MacEachern resolved to marry Perl and Apache, so that
rather than interacting as two foreign independent entities, the
two would be joined in holy matrimony, with the advantages
of both combined in union, able to tackle the world till
obsolescence do they part. With a knack for hacking, but perhaps
not such a gift for names, Doug named his new hybrid
mod_perl. Put more accurately, mod_perl is
an Apache module that integrates the Perl interpreter into the
Apache web server.
The benefits of this integration are twofold:
- Because the Perl interpreter is built into the Apache parent
process, Perl scripts can be executed much more quickly. At the
least, the Perl interpreter does not need to be launched for each
script invocation -- at best, depending on configuration, Perl
modules and/or scripts can be wholly or partially pre-compiled
and stored in memory. Our focus will be on how to exploit this
advantage.
- Another benefit of Perl integration is that the Apache
server's internal workings are exposed to the Perl interpreter --
in short, this means that Perl code can intervene at any stage
in request processing to take over, or re-implement, the way in
which processing stages are handled by Apache. This allows a
great deal of customization of server behavior, but is admittedly
a more complex and obscure endeavor than enhancing script performance,
and we will not focus on this benefit of mod_perl in this short
series.
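
The first benefit, keeping compiled code in memory, can be illustrated
in plain Perl without Apache at all. The sketch below is not how
mod_perl is implemented; it only shows the general idea of compiling a
script's source into a subroutine once, caching it, and reusing the
compiled form on later requests. The names run_script and hello.pl,
and the toy script body, are invented for the example.

```perl
#!/usr/bin/perl
# Compile once, run many times (a conceptual sketch, not mod_perl's actual code).
use strict;
use warnings;

my %compiled;             # cache: script name => compiled code reference
my $compile_count = 0;    # how many times we paid the compilation cost

sub run_script {
    my ($name, $source) = @_;
    unless ($compiled{$name}) {
        $compile_count++;
        # Wrap the script source in an anonymous sub and compile it once.
        $compiled{$name} = eval "sub { $source }"
            or die "could not compile $name: $@";
    }
    return $compiled{$name}->();    # later calls skip straight to execution
}

# A toy "script", held as source text the way a file on disk would be.
my $source = 'my $total = 0; $total += $_ for 1 .. 10; return $total;';

# Three "requests" for the same script.
my @results = map { run_script('hello.pl', $source) } 1 .. 3;

print "results: @results\n";               # results: 55 55 55
print "compiled $compile_count time(s)\n"; # compiled 1 time(s)
```

Under plain CGI every one of those three requests would pay for
compilation again; here only the first one does.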
The Perl You Need to Know Special: Introduction to mod_perl