Friday, October 14, 2011

Does Python Scale?

I wonder how many times I've been asked that question over the years.  Often, it's not even in the form of a question (Sorry, Mr. Trebek) but rather stated emphatically: "Python doesn't scale".  This can be the start of long, heated discussions involving Global Interpreter Locks, interpreters vs. compilers, dynamic vs. static typing, etc.  These discussions rarely end satisfactorily for any of the parties involved.  And rarely are any opinions changed as a result.

So, does Python scale?

Well, YouTube is written mostly in Python.  Dropbox is written almost entirely in Python.  Reddit.  Quora.  Disqus.  FriendFeed.  These are huge sites, handling gazillions of hits a day.  They are written in Python.  Therefore, Python scales.

Yeah, but what about that web app I wrote that one time?  Hosted on a cheapo, oversubscribed VPS, running straight CGI talking to a remote MySQL database running in a virtual machine on my MacBook Air.  That thing fell over like a drunken sailor when I invited a few of my friends to go check it out.  So, yeah.  Forget what I said before.  Obviously Python doesn't scale.

The truth is, it's the wrong question.  The stuff that allows Dropbox to store a million files every 15 minutes has little to do with Python, just as the things that caused my feeble web app to fail had little to do with Python.  It has to do with the overall architecture of the application.  How databases are sharded, how loosely or tightly components have been coupled, how you monitor, and how you react to the data your monitoring is providing you.  And lots of other stuff.  But you have to deal with those issues no matter what language you write the system in.
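To make "how databases are sharded" a little more concrete, here is a minimal sketch of hash-based shard routing.  It is only an illustration, not how any of the sites above actually do it: the shard names, the shard count, and the helper functions shard_for and database_for are all made up for this example.

```python
import hashlib

# Hypothetical shard names; a real deployment would hold connection details here.
SHARDS = ["db0", "db1", "db2", "db3"]


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a hash that is stable across processes."""
    # Python's built-in hash() is randomized per process for strings,
    # so a cryptographic digest is used to get a repeatable value.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def database_for(user_id: str) -> str:
    """Pick the (hypothetical) database that owns this user's rows."""
    return SHARDS[shard_for(user_id, len(SHARDS))]


print(database_for("alice"))
```

The routing decision is a few lines of arithmetic that would look about the same in any language.  The hard parts (choosing the shard key, rebalancing when a shard is added, handling queries that span shards) are architectural questions, which is the point.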

No reasonable choice of computer language is going to guarantee your success or your failure.  So pick the one you are most productive in and focus on properly architecting your app.  That scales.


  1. Well stated. This is the same argument (and answer) when people try to compare raw performance of one language vs. another. Raw performance, raw complexity, raw anything doesn't work when it comes to comparisons. End results are the key, and the whole IS greater than the sum of the parts.

  2. My general answer to that question is:

    No. Python doesn't scale.

    But the good news is, it doesn't have to, since it plays a very small role in scalability. In the end, the architecture of your app will make it scale or not.

    Node is here to prove it. Who would have thought 5 years ago that JavaScript would provide such a great base to build scalable web apps on?

  3. Interesting that you mention "huge sites" which all have huge user bases. Does that mean that they are big projects with huge code bases? No. Sure, you can get the execution time to scale in Python because, as Eduardo puts it, it's all about the architecture. But does it scale in development time? I think not, because in the end you rely on test cases to verify that you don't break things, whilst in statically typed languages, like Java, you can utilize the compiler more (javac is my best friend). Sure, you still need tests to verify things, but since you can rely on the compiler to give you errors when something doesn't build, you don't have to run those tests as often. In big projects you often end up with a lot of test cases, and the time to run all of them to verify that you haven't broken anything often increases incrementally (this may not be true if you only employ super humans that refactor often). This is why I often say: use Python for small code bases, because when prototyping a product you want to be fast and Python is faster in the beginning, but as your code base grows you'll see that your development efficiency will stagger, and this is when you should consider a statically typed language instead. A feature that is critical when working in huge code bases is to see how/where code is used. When developing Java in Eclipse you press ctrl+shift+g. I've tried the same feature for a Python project and it doesn't work, because the types in Python aren't declared, so the search can't reliably give you the correct answer. Sure, it can guess, but who likes guessing? I think this is the killer feature of dynamically typed languages.

  4. Scalability is not a matter of languages, but a matter of architecture.

    Of course, there are some languages which make it easy to set up a scalable architecture (Erlang comes to mind), but it's really more complex than "Language A doesn't scale/Language B does" (or "SQL doesn't scale/NoSQL does").

    Great post.