Webserver Notes
Win32 Apache Configuration
Apache is a web server. This is a program that runs in the background
on a web-server (or your PC) and listens out for HTTP
requests from browsers.
When it gets a request it retrieves the requested page from wherever it is
stored and returns it to the browser via the HTTP protocol.
There are many websites out
there that deal specifically with Apache and its detailed usage and
configuration. Probably the very best one can be found at: http://www.apache.org
What follows is a simple "Idiot's Guide to Win32
Apache Setup" and should get a basic webserver up and
running on your PC according to your requirements.
Download and Install Apache 1.3.33 (Win32)
The latest version of Apache is 2.0.52; however, you may
prefer the slightly older version 1.3.33, which is stable and well
supported on Win32. It is freely downloadable from: http://httpd.apache.org/download.cgi
To run it under Windows you will need to locate the Win32 Binary (Self
extracting): apache_1.3.33-win32-x86-no_src.exe
This is a 4.91MB download and installs using the
Microsoft Installer that comes with Windows. Once the file has
downloaded, double click it and the rest should be plain sailing.
Test Running Apache
Right!
It is assumed that you have followed all of the above and have completed
the actual software installation, so that Apache is now installed to c:\apache on your PC.
(I assume that if you are smart enough to
put it somewhere else you are also smart enough to figure out how to
correct the example paths I have used here.)
Your first task is of course to check that it works...
The win32 Apache installation can be started and controlled from the MS-DOS
command prompt. Open a DOS window (or hit [START]
-> [RUN]) and type: c:\apache\apache
Apache will then start. The DOS window will remain open for as long as Apache is running, which of course it must be whenever you require its services!
Open your web browser and into the address bar enter the following URL:
http://127.0.0.1/
This is the so-called 'loopback' address. To any
machine, including your PC, this means 'me, myself, here'!
Your Apache server should also answer to: http://localhost/
And if all is well you will see the default Apache homepage that came with the installation. Note also that this comes complete with links to the Apache manual which came with the installation. Read it! It's very comprehensive and extremely informative.
The document root (or top-level) directory for Apache in its
default installation will be c:\apache\htdocs. The default
introduction page that you are seeing in your browser is the index.html
file from that directory. If you make a minor alteration to the file, save it
and then refresh your browser you should see your changes. (And then change
it back!)
To shut Apache down, return to the DOS window in which Apache is running and hit Ctrl+C. This will cause Apache to terminate cleanly, which you must do before you shut down your PC.
If Apache does not start first time you will need to look in the
manual directly to find the cause of the problem and a solution.
Go to the c:\apache\htdocs\manual directory (or
paste that path directly into the address bar in your browser) and
look for the index.html file. Double click on it and the
documentation will launch itself in your web browser.
Note that you are browsing your file system, not a
website; Apache is not doing anything. Use the documentation to diagnose
the problem, consult the Apache website or search the 'net for answers.
Apache Configuration - (Safety!)
Your second, and most important, task before you change anything is to locate the
c:/apache/conf/httpd.conf file and create a copy alongside it
called httpd.conf.default or similar. This is a basic
safeguard: the httpd.conf file is quite complex and not somewhere that
you really want to make a mistake! Until you are sure that you know what
you are doing you are advised to make one change at a time, checking that
it works before taking a backup copy and proceeding with the next change.
Safe working practices and all that!
The httpd.conf file is quite long, but most of it is comments, and the whole file is well documented from within. It is a bit cryptic for the first-time user, but once you understand how Apache works, and more importantly how the httpd.conf file is structured, it is very nicely put together and highly configurable.
One very useful feature of Apache is its 'self-test'.
This is invoked by opening a DOS window and starting Apache with the
command: c:\apache\apache -t
('t' for test!)
Apache will run through the httpd.conf
file and check its configuration
and either report an error or return an OK. This is a very useful feature
which you are strongly advised to use!
Note! Apache reads the httpd.conf
once on
start-up. If you make changes to the configuration they will not take effect
until you stop and restart Apache. Try the command:
c:\apache\apache -k restart
which should have the desired effect
of causing Apache to cleanly halt and then restart.
Take an initial look at the httpd.conf
file and you will see
that it is divided into a general config with a number of distinct
'Directory' and 'IfModule' blocks or
'directives' which are delimited by HTML-like tags and
which may be nested thus:
<Directory ***>
... config options ...
</Directory>
<IfModule ***>
... config options ...
<Directory ***>
... config options ...
</Directory>
... config options ...
</IfModule>
Generally speaking the <Directory [path]>
directives
specify the configuration for that particular directory and any
sub-directories below it. <IfModule [module_name]>
directives
have configurations that are conditional upon this module being loaded.
Make sure that you understand how this is structured and especially whether or not you are making changes globally or just within a specific directive. The right config in the wrong place is still wrong!
Customising Apache
It's quite likely that you will not want Apache to run on your PC exactly as it is installed. The odds are that you will have a separate document root for your new website and perhaps additional resources that you wish to make available to your home network (which can include as few as ONE machine!). You may also wish to set up more than one website on your PC and have them simultaneously accessible; Apache can do this too, via the Virtual Host config options.
OK, what we will do now is take a guided 'walk' through the
httpd.conf
file pointing out the important and relevant
configuration options.
Note! Apache (and the Internet in general) has a very
clear Unix ancestry. The use of backslashes as directory separators is a
Microsoft abomination and has no place here except when issuing MS-DOS specific
commands! Forward slashes please!
The ServerRoot line is a critical configuration line. If you ever decide to move your
Apache installation to another location you will need to update this line, as
it tells Apache where to find its own files.
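As a rough sketch, and assuming the default c:\apache installation used throughout these notes, the line in question would look something like this (note the forward slashes):

```apache
# Tells Apache where its own files live; assumes the install location c:\apache
ServerRoot "c:/apache"
```

If you installed somewhere else, this is the path to adjust.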
Note! The configuration file can be specified when
starting Apache at the command line using:
c:\apache\apache -f c:\path\alternate_httpd.conf
If this is not found then Apache defaults to:
APACHE.EXE_DIR/conf/httpd.conf
If this is not found either, Apache fails to start and returns an
error message.
The Listen directive tells Apache which IP addresses (or domain name) to listen on
within your network. This is useful if configuring virtual hosts or setting your PC up
as a webserver for other machines on your network.
But for a simple webserver on your PC only you do not need to enter
anything here. Comment this line out and Apache will listen on all of your PC's addresses, including 127.0.0.1
or 'localhost', by default.
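A minimal sketch of the two cases. The LAN address 192.168.0.10 is purely an assumption for illustration, so substitute whatever address your own PC actually has:

```apache
# Stand-alone PC: leave the Listen line commented out
#Listen 80

# PC acting as a webserver for other machines: listen on a specific
# address and port (192.168.0.10 is a made-up example address)
#Listen 192.168.0.10:80
```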
The BindAddress directive binds Apache to the specified IP address (or domain name) within your network. This is similar to the Listen directive above and should also be commented out unless you are setting up virtual hosts and have more than one machine. For more specific details and differences consult the Apache documentation.
The Port directive specifies the port that Apache will listen on, by default always port 80. There is no reason for you to change this. So don't!
The ServerAdmin directive specifies a contact address that will appear in all administrative and error messages that the server outputs.
The ServerName directive is the server name or title for the web service that Apache will provide. In this instance the server will respond to HTTP requests for: http://website
Note what the httpd.conf
file says about creating hostnames,
you will also need to identify this hostname to your PC so that it knows
to direct requests for that host to 127.0.0.1 and not to
the outside World via your Internet connection. See the Virtual Hosts Config details for
more information about how to set this up.
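Taken together, the basic identity settings described above might be sketched as follows. The admin@website address is only a placeholder, and 'website' is the example hostname used in these notes rather than anything Apache requires:

```apache
# Port that Apache listens on
Port 80
# Contact address shown in server-generated error pages (placeholder value)
ServerAdmin admin@website
# Hostname that this server answers to
ServerName website
```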
The DocumentRoot line is another critical configuration line. It points Apache to the
top-level directory or 'document root' of your website
files. If you do not intend to use the c:/apache/htdocs
directory for your website you will need to modify this line accordingly
and also make sure that you set up an appropriate
<Directory "c:/website"> directive block for whichever
directory you specify as your document root.
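For example, assuming your site lives in c:/website (the example directory used in these notes), the line might read:

```apache
# Top-level directory of the website files (example path)
DocumentRoot "c:/website"
```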
Now we come to the first of the directory
directives. The first is for the entire directory tree and
sets the overall security.
Security directives act on all sub-directories below the specified location.
The simplest and most secure way to set this is to deny access or features
at the highest level and then specifically allow them at the required
location.
So we have...
<Directory />
Options Includes
AllowOverride FileInfo
</Directory>
'Includes' enables SSI files but nothing else, and
specifically not CGI execution!
'AllowOverride FileInfo' allows specific directories
(and their subs) to have their configuration set via .htaccess
files within
the directory itself. This is a useful and very flexible method once
you have figured out how to make good use of the .htaccess
syntax and options.
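To give a flavour of it, here is a small sketch of what a .htaccess file might contain when AllowOverride FileInfo is in force. The file name notfound.html is hypothetical, so point it at a page that actually exists on your site:

```apache
# .htaccess placed in the directory that it should apply to
# Serve a custom page for missing files (notfound.html is a made-up name)
ErrorDocument 404 /notfound.html
# Treat .shtml files in this directory as HTML
AddType text/html .shtml
```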
The next directive is for whichever directory you want to set as the document root. This allows more options than the default system root config above.
<Directory "c:/website">
Options Indexes MultiViews Includes ExecCGI
AllowOverride FileInfo
Order allow,deny
Allow from all
</Directory>
Additional options here are:
'Indexes' - Allows directory indexing, a direct view of the files
displayed within the browser window. Not always a good thing!
'MultiViews' - If Apache cannot locate the file it
wants, the MultiViews option enables it to make 'an educated guess' as to
which file or behaviour is required. See the Apache documentation for more
on this.
'Includes' - See above
'ExecCGI' - Vital if you want to run CGI programs.
Without this Apache will not invoke any executable resources anywhere that
the ExecCGI option is not explicitly declared.
'Order' specifies whether to apply the allow rules
before the deny rules. These rules determine which IP addresses or domain
names the services should be allowed for or denied to. Not strictly relevant
here as we have no deny rules set in this example, only an 'Allow
from all' which should be fairly self explanatory.
You will need to set an additional directory block for each additional directory that you serve web resources from and that is not already within the scope of the document root. Otherwise you will find that certain services, such as SSI files, .htaccess and CGI programs, do not work. This can cause some serious headaches so do be sure to get this bit right!
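As an illustration only, a block for a directory outside the document root might look like the following. The path c:/documents is an assumption, so use whichever directory you are actually serving from and trim the Options to what you need:

```apache
<Directory "c:/documents">
    # Allow SSI and directory listings here, but no CGI execution
    Options Indexes Includes
    AllowOverride FileInfo
    Order allow,deny
    Allow from all
</Directory>
```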
The DirectoryIndex directive is only relevant if the Apache 'mod_dir.c'
module is loaded. Make sure that all
of the configuration for this falls within the directive block and also
that the block is correctly terminated with </IfModule>.
Thus:
<IfModule mod_dir.c>
DirectoryIndex index.shtml index.html index.htm
</IfModule>
The above directive controls how Apache behaves if a directory is
requested but no file, for example: http://localhost/path/
In this example Apache will search the directory for a file named
'index.shtml' and serve it if found. If it isn't found it
then looks for 'index.html' and if that isn't found for
'index.htm'. If none of these are found it will then follow
the behaviour defined in the directory configuration for that directory.
If Indexes is specified as one of the Options
then a view into the directory is generated which shows
a 'file-manager' type display. Depending on the demands on your website
this may be a potential security issue. Either remove the option or, if you
are dealing with a remote host server for your live website, add an
index.html file into every directory with a specific HTTP
re-direction to an appropriate location, even if it is just a page to say:
"We do not allow directory indexing on this website!"
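If you would rather switch directory listings off on your own PC, one way (a sketch, assuming the c:/website document root used in these notes) is simply to leave Indexes out of the Options line for that directory:

```apache
<Directory "c:/website">
    # Indexes deliberately omitted so that no directory listing is generated
    Options MultiViews Includes ExecCGI
    AllowOverride FileInfo
    Order allow,deny
    Allow from all
</Directory>
```

With this in place a request for a directory that has no index file should be refused rather than listed.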
The AccessFileName directive specifies the name for the file-driven directory specific
configurations. .htaccess is the traditional and default name;
there is no reason for you to change this.
Note! You will need to have the AllowOverride FileInfo AuthConfig
directive enabled for this directory, else the
.htaccess files will be ignored.
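A sketch of how these two settings fit together, again assuming c:/website as the directory in question:

```apache
# Name of the per-directory configuration file
AccessFileName .htaccess

<Directory "c:/website">
    # Per the note above, .htaccess files here are only honoured
    # if these overrides are allowed
    AllowOverride FileInfo AuthConfig
</Directory>
```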
Note also the next section of the httpd.conf file which
denies access specifically to the .ht* security related files.
Make sure this remains consistent with whatever other changes you may make,
otherwise your security files can be retrieved and examined by others. You
certainly don't want that!
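For reference, the block in question looks roughly like this in a stock Apache 1.3 httpd.conf. Treat it as a sketch and compare it against your own copy of the file:

```apache
# Refuse to serve any file whose name begins with .ht (.htaccess, .htpasswd)
<Files ~ "^\.ht">
    Order allow,deny
    Deny from all
</Files>
```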
For more details and an example of this setup see the Password
Controlled Access page.
The majority of webpages have a content-type of text/html. If Apache cannot determine what the content type
of a requested page is, the DefaultType configuration option specifies what the
default content-type will be. There is no reason to change this.
If the HostnameLookups option is enabled (On) then Apache will look up every IP address and log the domain name in the access and error logs. Without it Apache will just log the IP address. You are advised against turning this on as it really slows things down: every request will involve an additional look-up request. And if you don't have a permanent outside connection then this will not work at all.
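In a default installation I would expect these two lines to look something like the following, although it is worth checking your own httpd.conf:

```apache
# Content type used when Apache cannot work one out for itself
DefaultType text/plain
# Log plain IP addresses rather than looking up hostnames
HostnameLookups Off
```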
The following directives are only relevant if the Apache aliases
module is loaded and so it all falls within the <IfModule>
directive for the mod_alias module. Make sure that all
of the configuration for this falls within the directive block and also
that the block is correctly terminated with </IfModule>.
<Directory "d:/Apache/htdocs/manual">
Options Indexes MultiViews
AllowOverride None
Order allow,deny
Allow from all
</Directory>
The above alias makes the Apache documentation available even though
this is no longer under the document root. All occurrences of
/manual/
in the request URL will be redirected to the
specified location.
Note also that specific permissions and options are explicitly set
for what is now an external directory.
And if you install Perl or MySQL then you may wish to link directly to their web-documentation which ships with the installation. The required aliases are:
Alias /mysql/ "c:/mysql/docs/"
Alias /documents/ "c:/documents/"
Note also the option to link directly into your documents folder.
Access any of these by using a URL such as http://localhost/manual/
or http://localhost/perldocs/
and so on...
Note that the trailing slash is also part of the pattern, if it is
present in the alias definition line but not in the URL then the alias will
not be matched and no redirection will occur.
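The trailing slash rule can be sketched with the documents alias shown above (the c:/documents path is just the example location used in these notes):

```apache
<IfModule mod_alias.c>
    # Matches http://localhost/documents/ and anything below it,
    # but not http://localhost/documents (no trailing slash)
    Alias /documents/ "c:/documents/"
</IfModule>
```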
The ScriptAlias directive makes Apache treat all scripts within the specified directory as
executable resources. This is where CGI programs should be installed.
Note that ExecCGI is not required in the directory config for this directory
as this behaviour is automatically implied in this directory.
Note also that the cgi-bin directory does not necessarily need to be under the document root; you may for example choose to keep your Perl programs under a separate Perl directory elsewhere on your PC. If this is the case make sure you have set adequate directives for this directory.
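A minimal sketch, assuming a cgi-bin directory under the example c:/website document root (adjust the path if you keep your scripts elsewhere):

```apache
# Everything under this directory is treated as an executable CGI program
ScriptAlias /cgi-bin/ "c:/website/cgi-bin/"

<Directory "c:/website/cgi-bin">
    AllowOverride None
    Options None
    Order allow,deny
    Allow from all
</Directory>
```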
The following directives are only relevant if the Apache mime types
module is loaded and so it all falls within the <IfModule>
directive for the mod_mime module. Make sure that all
of the configuration for this falls within the directive block and also
that the block is correctly terminated with </IfModule>.
'Handlers' are 'Apache-speak' for denoting what to
do with, or how to treat, a particular resource. In this particular instance we
have identified all resources with a .cgi
or .pl
file extension as CGI-scripts. Apache will therefore 'know' that these files
are executable and will hand them off for this task.
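The line being described would look something like this (a sketch based on the extensions named above):

```apache
# Hand files ending in .cgi or .pl to the CGI handler
AddHandler cgi-script .cgi .pl
```

Remember that ExecCGI still needs to be enabled for the directories holding those files.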
Similar to the idea of a handler above, this directive tells Apache
that .shtml files are of content-type text/html.
Without this directive, .shtml files would be served according to the default content-type,
in this case 'text/plain'.
This handler is especially useful and not enabled by default, you will
probably need to uncomment these lines to make it work...
This directive causes Apache to parse any outgoing file that has a
.shtml extension and look for
SSIs (serverside includes). These are instructions to stream the contents
of a particular file into the main output stream. This method is invaluable
for sharing common code between webpages in the most efficient possible way.
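In an Apache 1.3 httpd.conf the two SSI lines are normally supplied commented out. Once uncommented they should look roughly like this:

```apache
# Serve .shtml files as HTML
AddType text/html .shtml
# Parse .shtml files for serverside includes before sending them
AddHandler server-parsed .shtml
```

The Includes option also needs to be enabled for the directory concerned, as described earlier.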
And that is it as far as running a single website on your PC is concerned. All that remains now is the virtual hosts configuration.
Configuring Virtual Hosts Under Win32
The configuration detailed so far above will cause Apache to serve the
website on your PC from wherever you specify and include any resources that
you have set up. The whole thing can be browsed as
http://localhost/, http://127.0.0.1/ or as
http://website/. As long as you only need that one site
you need read no further.
But if you want to develop several sites, or set aside a special website to access system utilities etc., then you will need a way to make Apache serve more than one website simultaneously. One such method is to set up virtual hosts, whereby Apache responds to a number of different hostnames.
Note! Before you begin note the following warning within the
httpd.conf
file under the ServerName
config regarding
creating hostnames...
The issue is that when a particular domain is requested via your
browser such as: http://requesteddomain/
your PC will have no idea
where this domain might be located and so directs the request to the default
gateway which will be your Internet connection. Eventually the request will
reach the DNS servers of your ISP and start its journey across the 'net
to the required domain.
But if 'requesteddomain' is in fact a virtual host set up
in your Apache configuration then you need to instruct your PC that requests
for 'requesteddomain' should be directed internally to
'localhost'.
Before referring HTTP requests to the outside World, Windows
checks the local network looking for a local IP address to send the request to.
In other words it needs to know where 'requesteddomain' is
located.
Windows uses a file called hosts, usually located in the windows
directory: c:\windows\hosts (on Windows NT, 2000 and XP it lives in the system32\drivers\etc sub-directory instead). This is another Unix echo and should
be quite familiar to anyone who is familiar with /etc/hosts
on a Unix box!
Here a domain name is mapped against the required IP address, in each
case on the local machine.
127.0.0.1 secondsite
127.0.0.1 thirdsite
Note! Windows reads this file once on start-up. If you change it you will need to restart your PC before the new hosts are recognised.
To enable virtual hosts uncomment the NameVirtualHost line. The asterisk denotes that Apache should listen for all domains; alternatively it can be configured to respond only to a specific IP address or domain.
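A sketch of the line in question. The second, commented-out form is an assumption on my part showing how you might restrict it to the loopback address instead; as far as I understand the Apache documentation, whatever you put here should match the address used in the <VirtualHost> tags that follow:

```apache
# Answer name-based virtual host requests on any address
NameVirtualHost *

# Alternative: only on the loopback address
#NameVirtualHost 127.0.0.1
```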
After the above line you will need to set a default VirtualHost
directive and then a separate VirtualHost configuration block
for each host that you want to set up.
Each of these blocks resembles the global config in the main
httpd.conf
file in miniature. Any directive in the
httpd.conf
file is valid here and will of course be
specific to the virtual host block that you set it within.
So starting with the default VirtualHost directive...
<VirtualHost 127.0.0.1>
ServerAdmin admin@website
DocumentRoot "c:/website"
ServerName localhost
</VirtualHost>
No more config is required; this is the global config that you
have already set up for Apache in the rest of the httpd.conf
file.
Now you need to set blocks for each of the remaining hosts that you
require. For each one that is outside of the default document root you will
also need to add a specific directory directive.
<VirtualHost 127.0.0.1>
ServerAdmin admin@secondsite
DocumentRoot "c:/secondsite"
ServerName secondsite
ServerAlias *.secondsite
ScriptAlias /cgi-bin/ "c:/secondsite/cgi-bin/"
Alias /ssi/ "c:/secondsite/ssi/"
<Directory "c:/secondsite">
Options Indexes MultiViews Includes ExecCGI
AllowOverride FileInfo
Order allow,deny
Allow from all
</Directory>
</VirtualHost>
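By the same pattern, a block for the third host might be sketched like this. The c:/thirdsite path and the admin@thirdsite address are assumptions chosen to mirror the example above:

```apache
<VirtualHost 127.0.0.1>
    ServerAdmin admin@thirdsite
    DocumentRoot "c:/thirdsite"
    ServerName thirdsite
    <Directory "c:/thirdsite">
        Options Indexes MultiViews Includes
        AllowOverride FileInfo
        Order allow,deny
        Allow from all
    </Directory>
</VirtualHost>
```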
And that just about completes this 'simple' tour of the Apache configuration file. There is enough here to get your server running with most of the useful options set up, but there is so much more in there if you take the time to poke about in the documentation and experiment. After all, now that you have a webserver all of your own there is nothing to stop you!