Is Django too slow?

Fri 24 July 2020, by Matthew Segal
Category: Django

Does Django have "bad performance"? The framework is now 15 years old. Is it out of date? Mostly, no. I think that Django's performance is perfectly fine for most use-cases. In this post I'll review different aspects of Django's "performance" as a web framework and discuss how you can decide whether it's a good fit for your web app.

Benchmarks

Let's start by digging into the ad-hoc web app performance benchmarks that you'll see pop up on Medium from time to time. To produce a graph like the one below, the author of this article sets up a server for each of the frameworks tested and sends them a bunch of HTTP requests. The benchmarking tool counts the number of requests served per second by each framework.

benchmark

I think these kinds of measurements are mostly irrelevant to practical web development. There are a few factors to consider:

  • Is the metric being measured actually of interest? What's a good baseline? Is 100 requests per second good, or pathetic? Is 3000 requests/s practically better than 600 requests/s?
  • Is the test representative of an actual web app workload? In this case, how often do we just send a static "hello world" JSON to users?
  • Are we comparing apples to apples? For example, ExpressJS has 3 layers of relatively simple middleware enabled by default, whereas Django provides a larger stack of middleware features "out of the box"
  • Has each technology been set up correctly? Was Gunicorn, for example, run with an optimal number of workers?
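On that last point, the Gunicorn docs suggest `(2 x num_cores) + 1` as a starting point for the worker count. A benchmark run with the default single worker would understate what a tuned deployment can do. A minimal sketch of that rule of thumb:

```python
# Gunicorn's documented rule of thumb for an initial worker count.
# It's a starting point to tune under load, not a final answer.
import multiprocessing

def suggested_workers() -> int:
    return multiprocessing.cpu_count() * 2 + 1
```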

This kind of naive comparison is a little misleading and it's hard to use it to make practical decisions. So, what kind of performance metrics should you pay attention to when working on your Django app?

What do you mean by "performance"?

When you ask whether a framework or language is "slow", you should also ask "slow at what?" and "why do you care?". Fundamentally I think there are really only two performance goals: a good user experience and low hosting cost. How much money does running this website cost me, and do people enjoy using my website? For user experience I'm going to talk about two factors:

  • Response time: how long people need to wait before their requests are fulfilled
  • Concurrency: how many people can use your website at the same time

Cost, on the other hand, is typically proportional to compute resources: how many CPU cores and GB of RAM you will need to run your web app.

Response time in Django

Users don't like waiting for their page to load, so the less time they have to wait, the better. There are a few different metrics that you could use to measure page load speed, such as time to first byte or first contentful paint, both of which you can check with PageSpeed Insights. Faster responses don't benefit your users linearly, though: not every 5x improvement in response time is equally beneficial. A user getting a response in:

  • 5s compared to 25s transforms the app from "broken" to "barely usable"
  • 1s compared to 5s is a huge improvement
  • 200ms instead of 1s is good
  • 50ms instead of 200ms is nice, I guess, but many people wouldn't notice
  • 10ms instead of 50ms is imperceptible; no one can tell the difference

So if someone says "this framework is 5x faster than that framework blah blah blah" it really doesn't mean anything without more context. The important question is: will your users notice? Will they care?

So, what makes a page load slowly in Django? The most common beginner mistakes are using too many database queries or making slow API calls to external services. I've written previously on how to find and fix slow database queries with Django Debug Toolbar and how to push slow API calls into offline tasks. There are many other ways to make your Django web pages or API endpoints load slowly, but if you avoid these two major pitfalls then you should be able to serve users with a time to first byte (TTFB) of 1000ms or less and provide a reasonable user experience.

When is Django's response time not fast enough?

Django isn't perfect for every use case, and sometimes it can't respond to queries fast enough. There are some aspects of Django that are hard to optimise without giving up much of the convenience that makes the framework attractive in the first place. You will always have to wait for Django when it is:

  • running requests through middleware (on the way in and out)
  • serializing and deserializing JSON strings
  • building HTML strings from templates
  • converting database queries into Python objects
  • running garbage collection
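To put one of those items in perspective, here's a quick back-of-the-envelope measurement of JSON serialization for a modest, hypothetical payload (the exact numbers will vary by machine):

```python
import json
import timeit

# A modest hypothetical API payload: 100 small records
payload = {"items": [{"id": i, "name": f"item {i}"} for i in range(100)]}

runs = 1_000
seconds_per_call = timeit.timeit(lambda: json.dumps(payload), number=runs) / runs
print(f"{seconds_per_call * 1000:.3f} ms per serialization")
```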

All this stuff runs really fast on modern computers, but it is still overhead. Most humans don't mind waiting roughly a second for their web page to load, but machines can be more impatient. If you are using Django to serve an API, where it is primarily computer programs talking to other computer programs, then it may not be fast enough for very high performance workloads. Some applications where you would consider ditching Django to shave off some latency are:

  • a stock trading marketplace
  • a global online advertisement serving network
  • a low level infrastructure control API

If you find yourself sweating about an extra 100ms here or there, then maybe it's time to look at alternative web frameworks or languages. If the difference between a 600ms and 500ms TTFB doesn't mean much to you, then Django is totally fine.

Concurrency in Django

As we saw in the benchmark above, Django web apps can handle multiple requests at the same time. This is important if your application has multiple users. If too many people try to use your site at the same time, then it will eventually become overwhelmed, and they will be served errors or timeouts. In Australia, our government's household census website was famously overwhelmed when the entire country tried to access an online form in 2016. This effect is often called the "hug of death" and associated with small sites becoming popular on Reddit or Hacker News.

A Django app's WSGI server is the thing that handles multiple concurrent requests. I'm going to use Gunicorn, the WSGI server I know best, as a reference. Gunicorn can provide two kinds of concurrency: multiple child worker processes and multiple green threads per worker. If you don't know what a "process" or a "green thread" is then, whatever, suffice it to say that you can set Gunicorn up to handle multiple requests at the same time.
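A sketch of what that looks like as a Gunicorn config file; the setting names are real Gunicorn settings, but the values are illustrative starting points to tune, and `gevent` must be installed separately for green-thread workers:

```python
# gunicorn.conf.py — a sketch of the concurrency knobs discussed above
workers = 4                # number of child worker processes
worker_class = "gevent"    # green threads (requires the gevent package)
worker_connections = 100   # max simultaneous green threads per worker
preload_app = True         # load the app before forking, to share memory
```

You'd start the server with something like `gunicorn myproject.wsgi -c gunicorn.conf.py`, where `myproject` is a placeholder for your project name.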

What happens if a new request comes in and all the workers/threads are busy? I'm a little fuzzy on this, but I believe these extra requests get put in a queue, which is managed by Gunicorn. It appears that the default length of this queue is 2048 requests. So if the workers get overwhelmed, the extra requests get put on the queue so that the workers can (hopefully) process them later. Typically NGINX will time out any connection that hasn't received a response within 60s, so if a request sits in the queue for longer than that, the user will get an HTTP 504 "Gateway Timeout" error. If the queue fills up, then Gunicorn will start sending back errors for any overflowing requests.

It's interesting to note the relationship between request throughput and response time. If your WSGI server has 10 workers and each request takes 1000ms to complete, then you can only serve ~10 requests per second. If you optimise your Django code so that each request only takes 100ms to complete, then you can serve ~100 requests per second. Given this relationship, it's sometimes good to improve your app's response time even if users won't notice, because it will also improve the number of requests/second that you can serve.
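That arithmetic is worth making explicit. A rough model for a fully synchronous setup, ignoring queueing effects:

```python
def max_throughput(workers: int, avg_response_seconds: float) -> float:
    """Rough ceiling on requests/second when every worker is busy."""
    return workers / avg_response_seconds

print(max_throughput(10, 1.0))  # 10 workers at 1000ms each: ~10 req/s
print(max_throughput(10, 0.1))  # same workers at 100ms each: ~100 req/s
```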

There are some limitations to adding more Gunicorn workers, of course:

  • Each additional worker eats up some RAM (which can be reduced if you use preload)
  • Each additional worker/thread will eat some CPU when processing requests
  • Each additional worker/thread will eat some extra CPU when listening for new requests, i.e. the "thundering herd problem", which is described in great detail here

So, really, the question of "how much concurrency can Django handle" is actually a question of "how much cloud compute can you afford":

  • if you need to handle more requests, add more workers
  • if you need more RAM, rent a virtual machine with more RAM
  • if you have too many workers on one server and are seeing "thundering herd" problems, then scale out your web servers (more here)

This situation is, admittedly, not ideal, and it would be better if Gunicorn were more resource efficient. To be fair, though, this problem of scaling Django's concurrency doesn't really come up for most developers. If you're working at Instagram or Eventbrite, then sure, this is costing your company some serious money, but most developers don't run apps that operate at a scale where this is an issue.

How do you know if you can support enough concurrency with your current infrastructure? I recommend using Locust to load test your app with dozens, hundreds, or thousands of simultaneous users - whatever you think a realistic "bad case" scenario would look like. Ideally you would do this on a staging server that has a similar architecture and compute resources to your production environment. If your server becomes overwhelmed with requests and starts returning errors or timeouts, then you know you have concurrency issues. If all requests are gracefully served, then you're OK!

What if the traffic to your site is very "bursty" though, with large transient peaks, or you're afraid that you'll get the dreaded "hug of death"? In that case I recommend looking into "autoscaling" your servers, based on a metric like CPU usage.

If you're interested, you can read more on Gunicorn worker selection and how to configure Gunicorn to use more workers/threads. There's also this interesting case study on optimising Gunicorn for arxiv-vanity.com.

When is Django's concurrency not enough?

You'll have hit the wall when you run out of money, or when you can no longer move your app to a bigger server or distribute it across more servers. If you've twiddled all the available settings and still can't get your app to handle all the incoming requests without sending back errors or burning through giant piles of cash, then maybe Django isn't the right backend framework for your application.

The other kind of "performance"

There's one more aspect of performance to consider: your performance as a developer. Call it your takt time, if you like metrics. Your ability to quickly and easily fix bugs and ship new features is valuable to both you and your users. Improvements to the speed or throughput of your web app that make your code harder to work with may not be worth it. Cost savings on infrastructure can be a false economy if the change makes you less productive and costs you your time.

Choosing languages, frameworks and optimisations is an engineering decision, and like all engineering decisions it involves weighing competing tradeoffs.

If raw performance were all we cared about, then we'd just write all our web apps in assembly.

web development in assembly

Next steps

If you liked reading about running Django in production, then you might also enjoy another post I wrote, which gives you a tour of some common Django production architectures. If you've written a Django app and you're looking to deploy it to production, then you might enjoy my guide on Django deployment.

If you have any feedback or questions email me at [email protected]