Friday, July 6, 2012

So Pythonistas, you want to get rid of the GIL...

There is no shortage of hate for the GIL. There is a slight problem, though. The GIL might be the cause of Python's single-core-only utilization, but it's not the root reason.

The origin of the GIL is to keep the interpreter internals sane when running with multiple kernel threads. When it comes to problems involving parallelism, the easy solution is simple: serialize everything. So you get the GIL. Now, if that were the end of it, getting rid of the GIL would probably not be that challenging. But, for better or for worse, the GIL also made a number of operations atomic that would not be in other languages. The Python FAQ lists examples, such as L.append(x) and x = L.pop() on a list. Python programmers made use of these benefits in CPython, regardless of whether the language designers actually guaranteed them. But at this point, it doesn't matter. The amount of code that depends on this behavior is large.
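To make the kind of reliance described above concrete, here is a minimal sketch, assuming CPython. Several threads append to one shared list with no explicit lock, and the code counts on every append landing. The helper names (append_many, run_appenders) are made up for illustration.

```python
import threading

def append_many(shared, count):
    # No explicit lock: this relies on list.append behaving atomically,
    # which is a property of CPython rather than a promise made here.
    for i in range(count):
        shared.append(i)

def run_appenders(n_threads=4, count=10_000):
    shared = []
    threads = [
        threading.Thread(target=append_many, args=(shared, count))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared

result = run_appenders()
print(len(result))
```

By contrast, the same FAQ entry lists compound operations such as i = i + 1 as not atomic, so a shared counter updated this way can still lose increments even with the GIL.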

Greg Stein attempted to remove the GIL in Python 1.5, but programs ran about 2x slower than with the GIL. The reason: in order to give people the guarantees described in the previous paragraph, which they have grown accustomed to, you need to do the locking on those operations for them. Where there was once a single lock, you now have a lock per object. And it is difficult to determine whether an object will be accessed by multiple threads, so the naive solution is to lock the object every time it's accessed in a way that needs to be atomic. This kind of fine-grained locking is expensive.
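As a rough illustration of the lock-per-object idea, written in Python rather than inside the interpreter, here is a sketch of a list wrapper that takes its own lock on every operation that must stay atomic. LockedList is a hypothetical class for this post, not how any real GIL-removal patch was implemented.

```python
import threading

class LockedList:
    """Hypothetical wrapper: one lock per object instead of one global lock."""

    def __init__(self):
        self._items = []
        self._lock = threading.Lock()  # every instance carries its own lock

    def append(self, item):
        # The lock is taken on every call, even when only one thread
        # ever touches this object. That is where the overhead comes from.
        with self._lock:
            self._items.append(item)

    def pop(self):
        with self._lock:
            return self._items.pop()

    def __len__(self):
        with self._lock:
            return len(self._items)

ll = LockedList()
ll.append("a")
ll.append("b")
last = ll.pop()
print(last, len(ll))
```

The cost is paid on every access whether or not a second thread exists, which is the basic reason a naive scheme like this slows down single-threaded programs.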

So this attempt didn't work. And it wasn't really a big deal. Multicore CPUs weren't that ubiquitous and people weren't doing things that would benefit that much from multiple cores. But now multicore is the rage and people believe that their Python programs will benefit from it. The common advice is simply to use multiprocessing, but people tend to find this inadequate.

PyPy is trying to solve this using STM (software transactional memory), and the team blogs about its progress. PyPy doesn't seem to be a solution for a lot of people yet, and it's unclear how successful the STM approach will be.

If you really think multicore support is important to Python, then you don't want to pitch a fit about the GIL. What you want to do is convince the Python designers that you are OK with giving up those guarantees you have been taking advantage of over the years. You will rewrite your code to not make use of the guarantees. Then they can get rid of the GIL without having to add the fine-grained locking.

But... before you say "sure", make sure you know what you're getting into. If L1.pop() is no longer thread-safe, then what does it mean if two threads access L1 in parallel? I don't know much about threaded memory models, but it could get pretty complicated. You might not be able to define all states of the program at that point.
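Here is a small sketch of code that leans on L1.pop() being thread-safe today, assuming CPython. Several threads drain the same list and each item is expected to come out exactly once. The function names (drain, run_drain) are invented for the example.

```python
import threading

def drain(shared, out):
    # Pop until the list is empty. This relies on pop() either returning
    # one item to exactly one thread or raising IndexError, never both.
    while True:
        try:
            item = shared.pop()
        except IndexError:
            return
        out.append(item)

def run_drain(n_items=20_000, n_threads=4):
    L1 = list(range(n_items))
    buckets = [[] for _ in range(n_threads)]  # one private list per thread
    threads = [threading.Thread(target=drain, args=(L1, b)) for b in buckets]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buckets

buckets = run_drain()
print(sum(len(b) for b in buckets))
```

Without the guarantee, two threads could receive the same item, or the list could be left in an inconsistent state, and this code would need an explicit lock around the pop.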

In the future, before you pour too much hate on the GIL, remember: it's really just a symptom. The actual problem is that, for simplicity, Python makes a number of guarantees that make executing performant code harder without the GIL than with it. Also, not everyone hates the GIL; some people are fond of the guarantees. Like this guy.


  1. The rabbit hole goes deeper than that. C and C++ take the simple way out: any problematic concurrent access is simply declared Undefined Behaviour, leading to anything from a segmentation fault, through a memory leak or an internal assertion deep in the standard library, to making demons fly out of your nose. But a high-level virtual machine like Python is not supposed to ever do any of those, even if your program is wrong at its level. Now, since CPython uses reference counting, even a simple assignment becomes quite complicated, because the reference counts have to stay in sync with the value of the variable (IIRC PyPy uses garbage collection instead, which makes things easier).

    On another note, there are many languages, and Python not being suitable for heavily parallel computation might really not be such a big loss. If you need raw performance, Python is not a very good choice anyway.

  2. This could have been part of the Py3k push. Too late now!