fhttpd design notes
Concept
HTTP servers can hand requests over to application code through
different kinds of application interface. Implementations take
multiple approaches to this feature.
CGI
The most common interface, implemented in almost all servers, is
CGI (Common Gateway Interface). A program is executed when a request
for some URL arrives, with environment variables and the command line
containing the request parameters. The client's request body (if any)
is sent to the program's stdin, and everything the program writes to
stdout is copied to the client with or without a header.
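The exchange described above can be sketched roughly as follows. This is only an illustration of the CGI idea, not code from fhttpd: the function name run_cgi and the stand-in program DEMO_SCRIPT are hypothetical, and only a few of the CGI environment variables are set.

```python
import os
import subprocess
import sys


def run_cgi(argv, method, query_string, body=b""):
    """Run one CGI program for one request and return its raw stdout."""
    # Request parameters travel in environment variables (CGI meta-variables).
    env = dict(os.environ)
    env.update({
        "GATEWAY_INTERFACE": "CGI/1.1",
        "REQUEST_METHOD": method,
        "QUERY_STRING": query_string,
        "CONTENT_LENGTH": str(len(body)),
    })
    # The request body goes to stdin; whatever the program writes to stdout
    # is what the server would copy to the client.
    result = subprocess.run(argv, input=body, env=env,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout


# Hypothetical stand-in for a CGI program: echoes the query string and body.
DEMO_SCRIPT = (
    "import os, sys\n"
    "body = sys.stdin.read()\n"
    "sys.stdout.write('Content-Type: text/plain\\n\\n')\n"
    "sys.stdout.write(os.environ['QUERY_STRING'] + ' ' + body)\n"
)

if __name__ == "__main__":
    out = run_cgi([sys.executable, "-c", DEMO_SCRIPT], "POST", "x=1", b"hello")
    print(out.decode())
```

Note that a new process is started and torn down on every call to run_cgi.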
While simple and easy to implement, it has a
serious problem: CGI requires the program to start every time a
request is received and to exit after sending a response:
- It's inconvenient and takes a large amount of resources.
- Constant restarting makes it impossible to keep open
connections to other servers, such as SQL database servers.
- There is no way to pass data between invocations of
the same program without using temporary files or external
servers.
- There is no way to pass data between different
instances of the same program running simultaneously.
- There is no way to run the program on a host different
from the one running the HTTP server itself.
- The standard doesn't define what userid
should be used for the process, so most implementations
of "bare" CGI don't provide enough
security for programs unless they are setuid.
Also, the lack of a guaranteed Content-Length
makes Keep-Alive impossible to implement without
additional parsing in the server (but that parsing then creates
problems with "server push" unless the server tries to handle it
somehow; the standard doesn't specify anything about that).
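One form the "additional parsing" could take is sketched below, under the assumption that the server buffers the program's whole output before replying. The helper names split_cgi_output and add_content_length are hypothetical and not part of any server. Because the whole body must be collected first, this approach delays output that the program meant to send incrementally, which is the "server push" problem mentioned above.

```python
def split_cgi_output(raw: bytes):
    """Split raw CGI output into (header block, separator, body)."""
    # The header ends at the first blank line, written with either LF or CRLF.
    found = [(raw.find(sep), sep) for sep in (b"\r\n\r\n", b"\n\n")]
    found = [(pos, sep) for pos, sep in found if pos >= 0]
    if not found:
        raise ValueError("no blank line between header and body")
    pos, sep = min(found)  # whichever separator occurs first
    return raw[:pos], sep, raw[pos + len(sep):]


def add_content_length(raw: bytes) -> bytes:
    """Return the output with a Content-Length header added if it is missing."""
    head, sep, body = split_cgi_output(raw)
    eol = sep[: len(sep) // 2]  # b"\n" or b"\r\n", matching the program's style
    names = [line.split(b":", 1)[0].strip().lower() for line in head.split(eol)]
    if b"content-length" not in names:
        head += eol + b"Content-Length: " + str(len(body)).encode("ascii")
    return head + sep + body
```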
Alternatives to CGI
HTTP servers often provide some other way to handle requests,
using interpreters (Roxen), shared libraries (Apache,
Netscape, ...) or applications connected through sockets or
pipes (the FastCGI standard,
implemented in some servers). Interpreters are slower than
compiled code, require programs to be written in their language
and impose limitations on what can be done through them. Shared
libraries are efficient, but when used for server modules they
impose a programming model that is not always convenient, and they
make security and reliability depend on all programs involved, which
is acceptable in some situations and can't work in others. FastCGI
uses processes connected to the HTTP server the same way as
fhttpd does in its interface, but its handling of processes on
remote hosts (the server connects to the process in FastCGI,
whereas the process logs into the server in the fhttpd process
interface) and its protocol details differ. Also, FastCGI doesn't
allow a program to bypass the HTTP server and talk to the client
directly.
Since FastCGI is now implemented in
some servers as modules, I am going to write a gateway between
fhttpd processes and FastCGI to support it in the same fashion
(with such conceptual similarities it won't take much
programming effort or add much overhead when used). Interpreters
can be made to run as processes in fhttpd, so that kind of
functionality can be provided too, but of course it won't be
compatible with interpreters in the server itself. Modules as
shared libraries aren't used in fhttpd.
fhttpd user process modules
A user process module is a process running either locally, connected
to the HTTP server through anonymous pipes (ones created by the
pipe() system call) or AF_UNIX
sockets, or remotely, connected through
TCP/IP. Multiple instances of the same application may run
simultaneously, although the configuration may force
the server to limit the number of such instances and/or to
treat request processing as synchronous or
asynchronous. If multiple instances are running, new requests are
sent to the process with the fewest pending
requests, or only to an idle process if the limit of one request per
process at a time is set. If multiple processes have the
same number of pending requests (or that number is 0),
round-robin is used to distribute the load more evenly for the
case when application instances are running on different CPUs
or hosts. If no processes are available, the request is queued in
the HTTP server until a process becomes available.
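The request distribution rules above can be sketched as follows. This is a simplified model and not fhttpd's actual code: the Dispatcher class, its method names and the one_at_a_time flag are all hypothetical, and processes are represented only by their names.

```python
from collections import deque


class Dispatcher:
    """Least-pending dispatch with round-robin tie-breaking and a queue."""

    def __init__(self, names, one_at_a_time=False):
        self.pending = {name: 0 for name in names}  # pending requests per process
        self.order = deque(names)                   # rotation order for ties
        self.one_at_a_time = one_at_a_time          # one request per process at a time
        self.queue = deque()                        # requests waiting for a process

    def _pick(self):
        eligible = [n for n in self.order
                    if not self.one_at_a_time or self.pending[n] == 0]
        if not eligible:
            return None
        least = min(self.pending[n] for n in eligible)
        for name in self.order:
            if name in eligible and self.pending[name] == least:
                # Move the chosen process to the back so ties rotate.
                self.order.remove(name)
                self.order.append(name)
                return name

    def submit(self, request):
        """Return the chosen process name, or None if the request was queued."""
        name = self._pick()
        if name is None:
            self.queue.append(request)
            return None
        self.pending[name] += 1
        return name

    def finish(self, name):
        """Mark one request done; dispatch a queued request if there is one."""
        self.pending[name] -= 1
        if self.queue:
            request = self.queue.popleft()
            return request, self.submit(request)
        return None
```

For example, with processes "a" and "b" and no per-process limit, three consecutive submit() calls go to "a", "b" and "a" again. With one_at_a_time=True and a single busy process, the next request stays queued until finish() is called.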
Alex Belits
Last modified: Wed Jun 23 04:27:22 PDT 1999