FastCGI, CGI, and mod_python

January 20, 2009

As I began looking at deployment options for Django and Python in general, I was faced with some choices about what method the web server should use to process the requests. I'm relatively new to the server side of things, and I didn't find a lot of good resources outlining my options.

The Django book recommends using Apache with mod_python and PostgreSQL for the database. It didn't go into much detail about these choices, so I began looking for information about what they were and why their performance was to be preferred.

As I started looking at web server software, I came to the conclusion that Apache was not the best option. Lighttpd and Nginx both heavily outperform it in several benchmarks that I read, and they both use much less memory. This is important to me because I am hosting on a machine with fairly limited RAM.

I decided to use Lighttpd because it performs well and is a little more developed than Nginx seemed to be. From here, I needed to figure out how to get Lighttpd to run my Python app. Using Apache and mod_python makes this relatively simple. The Python interpreter is loaded into Apache via the module, so Apache can execute the Python code internally and return the response to the user.

Lighttpd has no mod_python. I've read that there are some good reasons for this, but I won't go into them here. Basically, the way that dynamic content is generally served by Lighttpd, and by nearly all web servers, is through CGI.

CGI stands for Common Gateway Interface, and it allows the web server to hand a request off to another application, which then returns an appropriate response. FastCGI works in a similar, yet critically different way, which I will explain in a moment. The basic process is shown below:

[Figure: HTTP request/response example]

Here are the basic steps for CGI:

  1. HTTP request is sent from the browser to the server.
  2. The web server processes the request and determines that it should be processed by a particular CGI script or application.
  3. The CGI process is spawned, processes the request, and returns a response to the web server.
  4. The web server receives the response and passes it on to the browser.
  5. The browser receives the response and renders the page.
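
To make step 3 concrete, here is a minimal sketch of what a CGI program can look like in Python. It assumes Python 3 and a web server that follows the usual CGI convention of passing request details in environment variables such as REQUEST_METHOD, PATH_INFO, and QUERY_STRING; the build_response function name is my own, made up for illustration.

```python
#!/usr/bin/env python3
"""Minimal CGI program: one fresh process handles exactly one request."""
import os
import sys


def build_response(environ):
    """Build a complete CGI response (headers, blank line, body) as text."""
    method = environ.get("REQUEST_METHOD", "GET")
    path = environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")
    body = "Hello from CGI\nmethod=%s path=%s query=%s\n" % (method, path, query)
    # A CGI response is header lines, an empty line, then the body.
    return "Content-Type: text/plain\r\n\r\n" + body


if __name__ == "__main__":
    # The web server sets these variables before starting the process
    # and reads the response from the process's standard output.
    sys.stdout.write(build_response(os.environ))
```

The process exits as soon as the response is written, so nothing is kept around for the next request.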

Hopefully that makes sense. The web server handles the requests themselves, but it then spawns a CGI process to handle most of the work. This is brilliant in its simplicity, but it creates some problems. Initially, CGI apps were either small, written in compiled code, or both. Most web applications no longer fit this bill. Current web applications are nearly always large and interpreted.

This creates a large performance problem. For every request the web server receives, it must create a new process to generate the response. This isn't always such a big deal if the processes are small and can execute natively, but this isn't the case with modern languages and web frameworks. If my Django application were called via CGI, the entire Python interpreter would need to be loaded for every single request, then the code itself would need to be loaded and parsed, and finally the code could actually execute and return a response. That is a huge waste.
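
One rough way to get a feel for that per-request cost is to time how long it takes just to start a Python interpreter that does nothing. This is only a sketch, assuming Python 3; the average_startup_seconds name is hypothetical, and the result will vary from machine to machine. It also leaves out the time to import a framework like Django, which a real CGI request would pay on top.

```python
import subprocess
import sys
import time


def average_startup_seconds(runs=5):
    """Average wall-clock time to start a Python process that does nothing."""
    start = time.perf_counter()
    for _ in range(runs):
        # "-c pass" runs an empty program, so this measures only the cost
        # of launching the interpreter, not of importing any application.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    print("average interpreter startup: %.1f ms" % (average_startup_seconds() * 1000))
```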

From what I have read, that is essentially the reason that FastCGI was created. It uses a very similar API to standard CGI, but it does not require a new process to be spawned for each request. Instead of taking in parameters via stdin and environment variables as CGI does, a FastCGI process starts once, listens on a TCP or UNIX socket, and waits to handle requests. The web server forwards requests to this process and gets responses back in much the same way as before. The process looks like this for FastCGI:

  1. HTTP request is sent from the browser to the server.
  2. The web server processes the request and determines that it should be processed by a particular FastCGI daemon.
  3. The FastCGI process, which is already running with the interpreter and application loaded, receives the request over its socket, processes it, and returns a response to the web server.
  4. The web server receives the response and passes it on to the browser.
  5. The browser receives the response and renders the page.
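
The key difference is that the application is loaded once and then reused. Below is a small sketch of that idea in Python 3. To be clear about the assumptions: this is NOT the real FastCGI wire protocol (which is a binary record format); it uses a made-up protocol where the client sends a path and the worker sends back text. The names load_application, serve, request, and demo are all hypothetical, and the worker runs in a thread here only so the example fits in one file. In a real setup it would be a separate long-lived process.

```python
import socket
import threading


def load_application():
    """Stand-in for the expensive startup work (interpreter, framework)."""
    served = []  # state that survives between requests

    def app(path):
        served.append(path)
        return "reply %d for %s" % (len(served), path)

    return app


def read_all(conn):
    """Read from the socket until the other side stops sending."""
    chunks = []
    while True:
        data = conn.recv(4096)
        if not data:
            return b"".join(chunks)
        chunks.append(data)


def serve(listener, max_requests):
    """Handle several requests with one application instance."""
    app = load_application()  # paid once, before any request arrives
    for _ in range(max_requests):
        conn, _addr = listener.accept()
        with conn:
            path = read_all(conn).decode("utf-8").strip()
            conn.sendall(app(path).encode("utf-8"))


def request(address, path):
    """Play the web server's role: forward one request, read the reply."""
    with socket.create_connection(address) as conn:
        conn.sendall(path.encode("utf-8"))
        conn.shutdown(socket.SHUT_WR)  # signal that the request is complete
        return read_all(conn).decode("utf-8")


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(5)
    address = listener.getsockname()
    worker = threading.Thread(target=serve, args=(listener, 3))
    worker.start()
    replies = [request(address, p) for p in ("/a", "/b", "/c")]
    worker.join()
    listener.close()
    return replies


if __name__ == "__main__":
    for line in demo():
        print(line)
```

The reply counter keeps climbing across requests, which shows that the same loaded application handled all of them instead of starting from scratch each time.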

This is clearly a far superior option to CGI, and it has some strengths and weaknesses vs mod_python.

Hopefully this information will be helpful to someone. I'm just learning this stuff myself, so if I've got anything wrong feel free to let me know.
