The goal of this guide
This guide is meant to give you a better understanding of how the Apache HTTP server works. It covers most of the modules and configuration directives that affect performance. However, it does not cover how to optimize third-party software such as database servers or PHP/CGI applications, as these are planned to be covered in future guides.
The guide is aimed at new as well as experienced Linux/UNIX users. It is assumed the reader is using a FreeBSD 7.0-RELEASE or later system. However, since the configuration directives apply to any system running Apache 2.2.9 or later, feel free to run whatever system suits you best.
Installing Apache 2.2 and selecting modules
Select the modules to use
To install Apache, we will use the ports system that comes with FreeBSD. Make sure you have an up-to-date ports tree and install the www/apache22 port:
    # csup -L2 -h cvsup.freebsd.org /usr/share/examples/cvsup/ports-supfile
    # cd /usr/ports/www/apache22
    # make config
To make Apache as fast and small as possible, we make sure to disable any modules that we won't use. For example, all the AUTH* modules can most likely be disabled (as we assume this is a public webserver). However, make sure to enable the modules below, as they're covered in this article:
Important: mod_mem_cache requires threads support in the APR in order to compile.
After you've selected what modules you want, install www/apache22:
    # make install clean
Fix httpd.conf to make sure Apache starts during boot.
Because the default configuration that comes with the www/apache22 port is based on the default compile-configuration, we need to remove a few lines in order to make our server start. Once the compile is finished, we open /usr/local/etc/apache22/httpd.conf with our favorite editor and remove all the Order, Allow and Deny lines, as they require mod_authz_host, which we disabled at compile-time. An example of such lines is:
    Order allow,deny
    Allow from all
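Instead of deleting these lines by hand, they can also be commented out with sed. The following is a minimal sketch, not part of the original setup: comment_out_authz is a hypothetical helper name, the sample file is made up for the demonstration, and the pattern assumes the directives are written in their usual capitalization. It prints the modified configuration to standard output and leaves the input file untouched.

```shell
#!/bin/sh
# Prefix Order/Allow/Deny directives with '#', keeping their indentation.
# The trailing [[:space:]] keeps directives such as AllowOverride untouched.
comment_out_authz() {
    sed -E 's/^([[:space:]]*)((Order|Allow|Deny)[[:space:]])/\1#\2/' "$1"
}

# Hypothetical sample configuration, used only for this demonstration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<Directory "/usr/local/www/apache22/data">
    Options FollowSymLinks
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
EOF

comment_out_authz "$sample"
```

On a real system, the output would be reviewed and then written back to httpd.conf after taking a backup of the original file.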
To make our server start during boot, we need to add apache22_enable="YES" to /etc/rc.conf. The single quotes keep the shell from stripping the double quotes:
    # echo 'apache22_enable="YES"' >> /etc/rc.conf
Once we're done, we can let Apache test our configuration:
    # /usr/local/etc/rc.d/apache22 configtest
If Apache finds any errors in our configuration, it will also tell us on which line the error was found. If that's the case, you will most likely get an error saying that a command is defined by a module that's not loaded (since we disabled it at compile-time), and we can safely remove this line.
If we get an error saying that Apache can't determine the server's fully qualified domain name, we need to make sure that our /etc/hosts is complete. Replace the default entries:
    ::1 localhost localhost.my.domain
    127.0.0.1 localhost localhost.my.domain
With the following:
    ::1 localhost localhost.unixblog.org
    127.0.0.1 localhost localhost.unixblog.org
    18.104.22.168 server.unixblog.org server
Where 18.104.22.168 is the IP address of your machine, unixblog.org is your domain and server is the hostname.
Including more configuration files
When we're sure our configuration works, we move to the very bottom of httpd.conf. Since the www/apache22 port has a very structured configuration setup, it requires us to uncomment the lines below, as they hold a few configuration options mentioned later on.
    Include etc/apache22/extra/httpd-mpm.conf
    Include etc/apache22/extra/httpd-default.conf
The accf_http accept filter takes advantage of FreeBSD's optimizations for a listening socket. It makes sure the server doesn't have to context switch several times before it performs the initial parsing of the HTTP request. By default, www/apache22 will not use this, but you can easily enable it in /etc/rc.conf:
    # echo 'apache22_http_accept_enable="YES"' >> /etc/rc.conf
Also, make sure the required kernel module is loaded successfully:
    # kldload accf_http
For more information about this, please read man accf_http, which also provides information on how to load the module during boot if you don't have it compiled into your kernel.
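Because echo with >> appends another copy of the line every time it is run, a small helper that only appends a missing line can be handy for files such as /etc/rc.conf and /boot/loader.conf. The sketch below is an illustration only: ensure_line is a hypothetical helper name, and it is demonstrated on a temporary file rather than a real system file. The accf_http_load="YES" entry is the loader.conf setting described in man accf_http.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
# $1 = line to add, $2 = target file
ensure_line() {
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Demonstration on a temporary file instead of /boot/loader.conf.
conf=$(mktemp)
ensure_line 'accf_http_load="YES"' "$conf"
ensure_line 'accf_http_load="YES"' "$conf"   # second call changes nothing

cat "$conf"
```

On a real system the second argument would be /boot/loader.conf (or /etc/rc.conf for the rc variables), and the command would be run as root.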
EnableMMAP and EnableSendfile
These two options will enable system calls which are used to deliver files. This usually improves server performance as it avoids separate read and send operations, and buffer allocations. Enable these in /usr/local/etc/apache22/httpd.conf:
    EnableMMAP On
    EnableSendfile On
However, these two options should not be enabled if the server is serving pages from a network-mounted filesystem.
Reduce expensive system calls
Don't resolve hostnames
When a user requests a page from your server, the server automatically logs this. Apache can be configured to resolve the hostname of each client, which causes a lot of unnecessary load and indirectly slows the server down. In /usr/local/etc/apache22/extra/httpd-default.conf you can find the HostnameLookups option. At the time of writing, this option is disabled by default for obvious reasons.
    HostnameLookups Off
If our server receives a lot of requests, the logging itself can be quite heavy, especially if we have slow disks. Instead of writing each log entry directly to disk, we can use BufferedLogs, which lets us collect several entries in memory before writing them out together. This makes the logging nicer to our disks. This option is considered experimental and is disabled by default, so use it with caution.
    BufferedLogs On
FollowSymLinks and SymLinksIfOwnerMatch
If FollowSymLinks is not set, or if SymLinksIfOwnerMatch is set, as an Option for the directory where you host your files, Apache will issue extra system calls to check for symbolic links on every directory in the path to the requested file. These checks are never cached and will occur on every single request. It may not seem like a big deal, but imagine you have 20 requests a second and the files are stored under /usr/local/www/apache22/data/; then it adds up pretty fast.
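To get a feel for the numbers, the sketch below counts one symlink check per path component and multiplies it by the request rate. It is a rough illustration under stated assumptions: index.html is a hypothetical file name, the 20 requests per second come from the example above, and one check per component is a simplification of what Apache actually does.

```shell
#!/bin/sh
# Count the non-empty components of an absolute path.
count_components() {
    printf '%s\n' "$1" |
        awk -F/ '{ n = 0; for (i = 1; i <= NF; i++) if ($i != "") n++; print n }'
}

path="/usr/local/www/apache22/data/index.html"   # hypothetical requested file
requests_per_second=20                            # figure used in the text

components=$(count_components "$path")
checks_per_second=$((components * requests_per_second))

echo "$components checks per request, $checks_per_second checks per second"
```

With these inputs the script reports 6 checks per request and 120 per second, all of which are avoided when FollowSymLinks is set.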
The recommended option is FollowSymLinks, as it requires the fewest system calls per request. This is the default setting, but we can set it explicitly in /usr/local/etc/apache22/httpd.conf:
    <Directory "/usr/local/www/apache22/data">
        Options FollowSymLinks
    </Directory>
If AllowOverride is set to anything other than None, Apache will try to access .htaccess in every directory in the path to the requested file, much like the symbolic link checks described for FollowSymLinks above. Since we've disabled all the auth-modules, .htaccess won't be used, and we can safely set this to None:
    <Directory "/usr/local/www/apache22/data">
        AllowOverride None
    </Directory>
The Keep-Alive feature allows multiple requests to be sent over the same TCP connection. The idea is to avoid the overhead of doing the initial TCP handshake for every request. This saves both bandwidth and time, and performance increases of up to 50% have been reported for HTML documents with a lot of objects. These settings are specified in /usr/local/etc/apache22/extra/httpd-default.conf:
    KeepAlive On
When Keep-Alive is used, child-processes will be kept busy doing nothing but waiting for more requests. KeepAliveTimeout specifies how long a child-process should wait for a new request before closing the connection. The default is 5 seconds, which usually is enough time for most requests. The trade-off here is between bandwidth and memory, as Apache will be spawning new child-processes in order to handle new connections while the existing ones are waiting.
    KeepAliveTimeout 5
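The memory side of this trade-off can be estimated with simple arithmetic. The sketch below is a worst-case illustration, assuming every connection sits idle for the full timeout; the arrival rate and the per-process memory figure are assumptions chosen for the example, not measurements.

```shell
#!/bin/sh
# Rough upper bound on children parked in keep-alive wait (prefork MPM).
# All three inputs are assumptions for illustration.
new_connections_per_second=20   # assumed arrival rate
keepalive_timeout=5             # seconds, matches KeepAliveTimeout 5
child_rss_mb=7                  # assumed memory per child-process

idle_children=$((new_connections_per_second * keepalive_timeout))
idle_memory_mb=$((idle_children * child_rss_mb))

echo "up to ${idle_children} children waiting, about ${idle_memory_mb} MB"
```

With these inputs, up to 100 child-processes (about 700 MB) could be tied up waiting, which is why a long timeout gets expensive on a busy server.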
MaxKeepAliveRequests limits the number of requests per connection when KeepAlive is enabled. The default value is 100 and it's recommended to keep this fairly high for maximum server performance. To allow an unlimited number of requests per connection, set it to 0.
    MaxKeepAliveRequests 100
MinSpareServers and MaxSpareServers tell Apache how many child-processes to keep running while waiting for requests. If MaxSpareServers is set too high, Apache will have a lot of child-processes running, doing nothing but consuming memory. If MinSpareServers is set too low, Apache will have to spawn additional child-processes upon incoming requests, which causes really poor server performance. StartServers specifies the number of child-processes that should be created on startup.
When the prefork MPM is used, MaxClients sets the maximum number of child-processes that can be created, and ServerLimit is the upper bound on the value MaxClients may take. The more child-processes running, the more memory is used. If our server starts swapping, it will perform really badly. To avoid this, we simply check how much memory each process is using:
    # /usr/local/etc/rc.d/apache22 start
    Performing sanity check on apache22 configuration:
    Syntax OK
    Starting apache22.
    # ps aux -U www
    USER   PID %CPU %MEM    VSZ  RSS TT STAT STARTED    TIME COMMAND
    www  99372  0.0  0.3 183128 6856 ??  I   10:06PM 0:00.04 /usr/local/sbin/ht
    www  99373  0.0  0.3 183128 7040 ??  I   10:06PM 0:00.03 /usr/local/sbin/ht
    www  99374  0.0  0.3 183128 6560 ??  I   10:06PM 0:00.02 /usr/local/sbin/ht
    www  99375  0.0  0.3 183128 6952 ??  I   10:06PM 0:00.04 /usr/local/sbin/ht
    www  99376  0.0  0.3 183128 6576 ??  I   10:06PM 0:00.02 /usr/local/sbin/ht
    www  99378  0.0  0.3 183128 6800 ??  I   10:08PM 0:00.03 /usr/local/sbin/ht
    [...]
    # /usr/local/etc/rc.d/apache22 stop
    Stopping apache22.
    Waiting for PIDS: 99371.
    #
The RSS column tells us that each process uses ~7 MB of memory. Assuming we have a total of 1 GB of memory, we quickly see that we can run a maximum of ~145 processes (1024 MB divided by 7 MB) before our system commits harakiri.
Remember that the calculation above assumes that we use all memory for Apache. If you run something else (e.g. MySQL or PostgreSQL), you will have to take that into consideration as well. It is also recommended to benchmark/stress-test the server in order to get a more accurate value of the actual memory usage.
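The estimate above can be sketched as a short script. It is a minimal illustration, assuming the six RSS values from the ps output are representative and that the full 1 GB is available to Apache. It uses the largest observed process rather than a rounded 7 MB, so the result lands in the same ballpark as the rough figure above rather than matching it exactly.

```shell
#!/bin/sh
# RSS values in KB, copied from the ps output shown earlier.
rss_kb="6856 7040 6560 6952 6576 6800"
total_mem_kb=$((1024 * 1024))   # assumes 1 GB of RAM, all of it for Apache

# Find the largest child-process to stay on the safe side.
max_rss=0
for rss in $rss_kb; do
    if [ "$rss" -gt "$max_rss" ]; then
        max_rss=$rss
    fi
done

max_procs=$((total_mem_kb / max_rss))
echo "largest child: ${max_rss} KB, upper bound: ${max_procs} processes"
```

To leave room for other services, total_mem_kb would be reduced by whatever memory those services are expected to need before the division.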
The MaxRequestsPerChild option sets the limit on the number of requests that an individual child-process will handle. Once the number of requests has been reached, the child-process is killed and, if needed, a new one is created. The default value is 10000 and you can turn this off by setting it to 0. However, it is recommended to keep this enabled, as it limits the amount of memory a leaking process can consume.
The following values are specified in /usr/local/etc/apache22/extra/httpd-mpm.conf:
    <IfModule mpm_prefork_module>
        StartServers 10
        MinSpareServers 10
        MaxSpareServers 15
        ServerLimit 128
        MaxClients 128
        MaxRequestsPerChild 10000
    </IfModule>
To save some memory for the system itself, we set MaxClients (and its upper bound, ServerLimit) to 128 instead of 145.
With compression enabled, the server will compress the data before sending it to the client. This means there is less data to send over the network, which both saves bandwidth and gives a faster response time. All major browsers support compression and it's fully covered by the HTTP/1.1 protocol.
AddOutputFilterByType (quick and dirty)
To enable compression by type, we add the following to /usr/local/etc/apache22/httpd.conf:
    AddOutputFilterByType DEFLATE text/html text/plain text/xml
This will compress all plaintext, HTML and XML documents in every directory. Even though it works, it may not play well with caching proxies.
SetOutputFilter (the proper way)
A more complicated, yet better way of doing this is:
    <Location />
        # Insert filter
        SetOutputFilter DEFLATE
        # Netscape 4.x has some problems...
        BrowserMatch ^Mozilla/4 gzip-only-text/html
        # Netscape 4.06-4.08 have some more problems
        BrowserMatch ^Mozilla/4\.0[678] no-gzip
        # MSIE masquerades as Netscape, but it is fine
        BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
        # Don't compress images
        SetEnvIfNoCase Request_URI \
            \.(?:gif|jpe?g|png)$ no-gzip dont-vary
        # Make sure proxies don't deliver the wrong content
        Header append Vary User-Agent env=!dont-vary
    </Location>
This will compress everything other than images and it plays well with various old browsers and caching proxies.
No matter which method you use, compression comes at the cost of CPU, but in return saves bandwidth. To modify the level of compression, we can set DeflateCompressionLevel to a value between 1 (less compression) and 9 (more compression). For example:
    DeflateCompressionLevel 7
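To see how much the level matters for a given kind of content, the gzip command can serve as a stand-in, since it uses the same DEFLATE algorithm as mod_deflate. The sketch below is an approximation only: it does not measure Apache itself, and the generated HTML-like sample is hypothetical. Real pages from the site would give more meaningful numbers.

```shell
#!/bin/sh
# Build a throwaway HTML-like sample file (hypothetical content).
sample=$(mktemp)
i=0
while [ "$i" -lt 500 ]; do
    printf '<tr><td>row %d</td><td>value %d</td></tr>\n' "$i" "$((i * 7))"
    i=$((i + 1))
done > "$sample"

# Compare the uncompressed size with the lowest and highest gzip levels.
original=$(wc -c < "$sample")
level1=$(gzip -1 -c "$sample" | wc -c)
level9=$(gzip -9 -c "$sample" | wc -c)

echo "original: $((original)) bytes"
echo "gzip -1:  $((level1)) bytes"
echo "gzip -9:  $((level9)) bytes"
rm -f "$sample"
```

Prefixing each gzip call with time shows the CPU side of the trade-off. If the size difference between the two levels is small, a lower DeflateCompressionLevel is likely the better choice.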
Making sure our documents expire (mod_expires)
By specifying when a document expires, we explicitly tell the clients that "hey, this file won't be modified until tomorrow". This will prevent the clients from asking our server if the file has been modified and therefore we reduce the number of requests and save both bandwidth and system resources.
The negative effect this might cause is that we lose control of our content. Even if we change the content of our webpage, clients will still show the cached version until it expires or the user manually clears their cache.
This is a simple sample configuration for the impatient:
    ExpiresActive On
    ExpiresDefault "access plus 1 hour"
ExpiresDefault sets the default expiry time for all files on our server. Since some files are naturally updated more frequently than others, this may not be the best way to do this. Instead, we use separate expiry times for different file types:
    ExpiresActive On
    ExpiresByType text/html "access plus 1 hour"
    ExpiresByType text/css "access plus 1 day"
    ExpiresByType image/png "access plus 1 day"
    ExpiresByType image/jpeg "access plus 1 day"
    ExpiresByType image/gif "access plus 1 day"
Remember that we don't want to set these values too high, since we can never make the users clear their cache. Doing so will cause a lot of extra work for any developer working on the pages located on our server. However, if a cached file is renamed on the server, it will be considered a new file and therefore downloaded by the client.
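After changing the expiry settings, it is worth checking which caching headers the server actually sends. The sketch below is an illustration only: cache_headers is a hypothetical helper name, and the sample response is made up to show the kind of output to expect for a file with "access plus 1 day" (86400 seconds).

```shell
#!/bin/sh
# Print only the caching-related headers from an HTTP response on stdin.
cache_headers() {
    grep -i -E '^(Expires|Cache-Control|Last-Modified):'
}

# Hypothetical response headers, for demonstration only.
sample_headers='HTTP/1.1 200 OK
Content-Type: text/css
Cache-Control: max-age=86400
Expires: Thu, 02 Jan 2025 10:00:00 GMT'

printf '%s\n' "$sample_headers" | cache_headers
```

Against a running server, the same filter can be fed from curl, for example curl -sI http://localhost/style.css | cache_headers, where the URL is a placeholder for a file on your own site.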
Caching Dynamic Content
Apache comes with mod_cache (and its child-modules mod_mem_cache and mod_disk_cache), which offers intelligent, HTTP-aware caching. This allows us to cache the content locally on the server. It will usually improve server performance for dynamic pages generated by PHP, Ruby or Perl, since such pages are otherwise generated every time they're requested, which can cost a lot of CPU on a busy server.
Caching to Disk (mod_disk_cache)
By using mod_disk_cache, when a request is made, our server will store a static copy of the dynamic page on disk for future requests. The next time someone requests the same file, it will be served from the cache instead.
    CacheRoot /usr/local/www/cache
    CacheEnable disk /
    CacheDirLevels 2
    CacheDirLength 1
    CacheMinFileSize 32
    CacheMaxFileSize 1048576
Remember to create /usr/local/www/cache/ so that Apache can use it:
    # mkdir /usr/local/www/cache
    # chown www:www /usr/local/www/cache
    # chmod 750 /usr/local/www/cache
Caching to Memory (mod_mem_cache)
As for mod_mem_cache, it can be configured to either cache open file descriptors or store cached objects in memory. Even though this might seem faster compared to mod_disk_cache, most modern operating systems cache file data at the kernel level, which causes files that are frequently requested to be served from memory anyway. The kernel also knows when files are modified or deleted and can automatically remove file contents from the cache when necessary.
    CacheEnable mem /
    MCacheSize 4096
    MCacheMaxObjectCount 100
    MCacheMinObjectSize 1
    MCacheMaxObjectSize 2048
The default expiry period for a cached file is one hour, but this can easily be changed with the CacheDefaultExpire option. This default is only used when no expiry date was included in the response. We may want to use mod_expires to make sure our documents have a proper Expires header. In contrast to CacheDefaultExpire, CacheMaxExpire specifies the maximum time a cached document is considered valid.
If we decide to use mod_disk_cache, we also want to add htcacheclean to crontab in order to clear out any stale cache objects.
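A crontab entry for htcacheclean could look like the one assembled below. This is a sketch under stated assumptions: the hourly schedule and the 100M size limit are arbitrary example values, and the binary is assumed to live in /usr/local/sbin next to httpd. The flags used are -n (be nice to other processes), -t (delete empty directories), -p (cache path) and -l (size limit).

```shell
#!/bin/sh
# Assemble a crontab line that trims the disk cache once an hour.
cache_root="/usr/local/www/cache"           # CacheRoot used in this guide
cache_limit="100M"                          # assumed size limit
htcacheclean="/usr/local/sbin/htcacheclean" # assumed install path

cron_line="0 * * * * ${htcacheclean} -n -t -p ${cache_root} -l ${cache_limit}"

# Print the entry so it can be pasted into the crontab of the www user.
echo "$cron_line"
```

The printed line can then be added with crontab -e -u www, or adjusted first if the cache lives elsewhere or a different limit is wanted.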
Exclude a file or directory from the cache
Since mod_cache is intelligent and HTTP-aware, sending a header like Cache-Control: no-cache or Pragma: no-cache will prevent the file from being cached. An alternative is to use the CacheDisable option:
    CacheDisable /private
Even though Apache offers a wide set of options and tools to help us optimize our server, there's no universal way of doing this. Every server has a different hardware setup with different bottlenecks. Experiment with different configurations, test your server's performance using a benchmarking tool (ab, for example) and make sure you understand how the server works.
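A simple way to compare two configurations is to run ab with and without keep-alive and look at the requests-per-second figure. The sketch below is an illustration only: bench is a hypothetical helper name, BENCH_URL is a hypothetical environment variable, and the request count and concurrency are arbitrary example values. Nothing is run unless BENCH_URL is set.

```shell
#!/bin/sh
# Run ab against a URL and print only the throughput line.
# $1 = URL to test, $2 = optional extra ab flags (for example -k)
bench() {
    # $2 is left unquoted on purpose so that an empty value adds no argument.
    ab -n 1000 -c 10 $2 "$1" | grep 'Requests per second'
}

# Only run when a target has been chosen, e.g. BENCH_URL=http://localhost/
if [ -n "${BENCH_URL:-}" ]; then
    echo "without keep-alive:"
    bench "$BENCH_URL" ""
    echo "with keep-alive:"
    bench "$BENCH_URL" "-k"
fi
```

Running the same pair of tests before and after each configuration change gives a rough but repeatable picture of whether the change helped on this particular hardware.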