Pushing Apache Performance
You might know that pocoo.org has had an "interesting" availability, far below 99%. The problem is that all pocoo services (and there are quite a lot of them) are hosted on a XEN instance with less than 400MB RAM. With two different database servers (MySQL and postgres), hgweb, 7 trac instances, one pastebin, one dynamic project webpage, a blog and a bunch of static webpages, apache loved to eat more memory than was available and started swapping.
Although it would certainly be possible to get a better server or allocate more memory to the XEN instance, it's also possible to reconfigure things so that you can keep using what you have. Here is what we did one week ago. Since then we have had a load of around 0.1 - 1.5 and a memory usage of around 200MB. Before that our load was between 2 and 50, depending on the time of the last apache restart ;-)
Step one: switch to worker. Apache prefork is certainly the worst thing you can do if you have little memory in your server. The reason for this is that prefork forks: every connection is handled by a process of its own. While I agree that threads are problematic in many ways (especially with PHP), they are the best way to keep your memory usage low. But worker means that you cannot use mod_php4/5. Because TextPress is still not ready I needed PHP support, so I chose to host it as CGI; more on that later.
The worker MPM has quite a lot of configuration parameters, and to be fair, I have no idea how they work in detail. The basics are covered in the apache documentation. The configuration values that worked best for us were these:
<IfModule worker.c>
    ServerLimit          5
    StartServers         1
    MaxClients           30
    MinSpareThreads      5
    MaxSpareThreads      30
    ThreadsPerChild      15
    MaxRequestsPerChild  500
</IfModule>
With these settings apache starts one server process and spawns more as needed. ServerLimit allows up to 5 of them, but because MaxClients is 30 and each process runs 15 threads, usually no more than two are serving requests at once. After handling 500 requests a server process exits and is replaced by a fresh one. The latter is a very good idea because nearly every application leaks memory.
Now, because I lost my blog a second after installing the worker module, that was the next thing I changed. From version 5 onwards PHP is usually compiled as a CGI/FastCGI binary, so it is ready to rumble if you have mod_fastcgi. We do, but I have disabled it together with all other unused modules. Although FastCGI is a lot faster than CGI and a really good idea if you expect many hits, I chose CGI for the blog so that no PHP interpreter stays in memory permanently. Sadly this blog has the fewest hits of all our sites, so until more visitors come here CGI is the way to go. If you are too lazy to look up the configuration, here is ours:
AddType application/x-httpd-php .php
Action application/x-httpd-php /php5-cgi
It's that simple. If you want to switch to FastCGI, all you have to do is enable the FastCGI module and enable it for /php5-cgi.
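For the CGI variant to work, the /php5-cgi URL used by the Action directive has to map to the PHP CGI binary somewhere in the configuration. A minimal sketch of how the pieces might fit together is shown here; the binary path is an assumption (it is where Debian-style packages tend to put it) and is not taken from the pocoo.org setup:

```apache
# Assumed location of the PHP CGI binary; check where your distribution installs it.
ScriptAlias /php5-cgi /usr/lib/cgi-bin/php5

# Hand every .php file to the CGI binary mapped above.
AddType application/x-httpd-php .php
Action application/x-httpd-php /php5-cgi
```

The Action directive comes from mod_actions, and on a threaded MPM such as worker the CGI processes are normally started through mod_cgid, so both modules need to stay enabled when cleaning out unused ones.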
But PHP will disappear from that server sooner or later anyway, so let's continue with speeding up the snakes. There the answer is simple: if you run apache, use mod_wsgi. The rest depends on your applications. If you have different applications that do not interfere with each other and are thread-safe, put them into the same process group and enable threading. Otherwise put them into different groups and use threading for those that are thread-safe. And let mod_wsgi kill processes after a thousand requests. The mod_wsgi wiki is full of examples; you will find everything there.
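As a rough illustration of that advice, a mod_wsgi daemon mode setup could look something like the sketch below. The group names, URL prefixes, script paths and process/thread counts are all made up for the example and are not our actual configuration:

```apache
# Hypothetical names, paths and numbers; adjust them to your own applications.

# Thread-safe applications that get along share one daemon process group.
WSGIDaemonProcess shared processes=1 threads=15 maximum-requests=1000

# An application that is not thread-safe gets its own group, one thread per process.
WSGIDaemonProcess legacy processes=2 threads=1 maximum-requests=1000

WSGIScriptAlias /paste /srv/wsgi/pastebin.wsgi
<Location /paste>
    WSGIProcessGroup shared
</Location>

WSGIScriptAlias /projects /srv/wsgi/projects.wsgi
<Location /projects>
    WSGIProcessGroup shared
</Location>

WSGIScriptAlias /old /srv/wsgi/oldapp.wsgi
<Location /old>
    WSGIProcessGroup legacy
</Location>
```

The maximum-requests option is what makes mod_wsgi recycle a daemon process after a thousand requests, which serves the same purpose as MaxRequestsPerChild does for the apache processes themselves. The directories holding the .wsgi files still need the usual access permissions in the apache config.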
The last step is making apache itself faster. Remove all .htaccess files and put their contents into the apache configs. Then set AllowOverride to None so that apache doesn't look for them any more. Lower your connection timeout (Timeout) to, say, 150 seconds, MaxKeepAliveRequests to something like 100 and KeepAliveTimeout to 3. That might be annoying for people with slow network connections, but it gives apache the opportunity to mark worker threads as available sooner. Enable log rotation, log less, and do not log hostnames (check that HostnameLookups is set to Off).
That being said, the only thing left is playing around with some of the configuration values of your database servers and other running processes that might consume memory.
Here is the result as a fancy graph:
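Put together, those settings might look roughly like this in the main apache config. The values are the ones mentioned above; where exactly they go depends on how your distribution splits up its config files, so treat this as a sketch:

```apache
# Stop apache from looking for .htaccess files anywhere on the filesystem.
<Directory />
    AllowOverride None
</Directory>

# Give up on stalled connections sooner.
Timeout 150

# Keep-alive stays on, but idle connections release their thread after 3 seconds.
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 3

# Log IP addresses instead of doing a DNS lookup for every request.
HostnameLookups Off
```

Keep in mind that a more specific Directory block with its own AllowOverride setting would switch the .htaccess lookups back on for that directory, so it is worth checking the virtual host configs as well.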

A further way of reducing what Apache is doing is to turn off logging of requests into the access log altogether. This can be done by not defining a CustomLog directive, and as far as raw performance goes it has more impact than one would expect. Of course, since the bottleneck of web applications is generally elsewhere, e.g. database access, in the greater scheme of things the lower overhead may not be noticeable. Although this can help in increasing raw throughput for simple requests and static files, I'm not sure how memory use is affected, so if you do try this I would be interested in any feedback.
Now if you like having the access log present so you can gather your own site statistics, one alternative would be to set up all your pages to include tags for Google Analytics and use it to track site usage instead. This way the overhead of tracking site usage is pushed onto the client browser and Google, and you aren't wasting your own system's limited resources.
BTW, this only disables logging of requests, any details of exceptional events would still be logged to the error log. Thus, if your Python web applications throw a wobbly, you will still be able to see why. :-)
— Graham Dumpleton on Monday, October 1, 2007 4:15 #
IMHO a better way to go is to install nginx as a frontend and serve static images through it. After that you can disable KeepAlive in apache completely.
— Alexander Solovyov on Monday, October 1, 2007 7:58 #
@Graham: I'll come back to that once we run out of resources again :-)
@Alexander: nginx is certainly a good server for static files and I may think about it once the load increases again, but for the moment every additional process requires more time to maintain and update.
— Armin Ronacher on Monday, October 1, 2007 9:10 #
Hey this may seem like a weird request, but when you switch over to Textpress can you remember to redirect your lucumr.pocoo.org/cogitations/feed/atom/ feed as well?
— Paul on Monday, October 1, 2007 10:01 #
mod_php most certainly does support threads, but only if it's compiled to do so. Obviously a canned mod_php for prefork won't run with worker, but worker isn't fundamentally incompatible with mod_php. This is what TSRM is all about! (And although TSRM can't protect you if you try to use non-threadsafe C libraries from worker, that's not technically PHP's fault.)
— sapphirecat on Monday, October 1, 2007 17:36 #
Well, mod_php might be thread-safe to some extent if compiled with support for that, but before I compile mod_php by hand I'd rather use a binary package. That has the advantage that apt-get keeps it up to date automatically.
@Paul: I hope I don't forget ;-)
— Armin Ronacher on Monday, October 1, 2007 19:31 #