Ryan Kelly

I am a freelance software developer based in Melbourne, Australia. Most of my days are spent coding in Python and JavaScript, for a variety of open-source projects as well as some commercial endeavours. I also maintain a strong interest in logic programming, mainly as a result of my doctoral thesis. I am available for contract development and consulting work; please read more about me and check out my curriculum vitae if you're interested.

Feb. 5, 2010

A GIL Adventure (with a happy ending)

I just halved the running time of one of my test suites.

The tests in question are multi-threaded, and while they perform a lot of IO they still push the CPU pretty hard. For some time now, nose has been reporting a happy little message along these lines:

Ran 35 tests in 24.893s

I wouldn't have thought anything of it, but every so often this number would drop dramatically – often down to as little as 15 seconds. After a lot of puzzling, I realised that the tests would run faster whenever I had another test suite running at the same time. Making my computer work harder made these tests run almost twice as fast!

Could it be? Yes, I was finally seeing a manifestation of Python's dreaded Global Interpreter Lock - a.k.a. the "GIL of Doom". Because I'm running on a dual-core system, the different threads in this test suite were spreading themselves over both processors and engaging in an epic GIL Battle that bogged down the whole process.
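To make the effect concrete, here is a minimal sketch (not my actual test suite) that times the same CPU-bound work run back to back and then run in threads. The names burn_cpu, time_sequential and time_threaded are made up for illustration, and the numbers you see will depend heavily on your interpreter version and hardware.

```python
import threading
import time


def burn_cpu(n):
    """Pure-Python busy work; the thread holds the GIL while it runs."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_sequential(n, workers):
    """Run the work `workers` times, one after another, in this thread."""
    start = time.perf_counter()
    for _ in range(workers):
        burn_cpu(n)
    return time.perf_counter() - start


def time_threaded(n, workers):
    """Run the same work in `workers` threads that compete for the GIL."""
    threads = [
        threading.Thread(target=burn_cpu, args=(n,)) for _ in range(workers)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n, workers = 2_000_000, 2
    print("sequential: %.3fs" % time_sequential(n, workers))
    print("threaded:   %.3fs" % time_threaded(n, workers))
```

If the threaded run is no faster than the sequential one (or is slower), the threads are spending their time fighting over the lock rather than doing work in parallel.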

The typical response to this awful multi-core behaviour is "just use multiprocessing". That's not an option here, not least because these tests are supposed to be checking the thread safety of my code!

Continue reading...

Sept. 9, 2009
[Python]

Mimetypes and Threading don't mix

I've just spent weeks (yes, weeks) battling a bug that turns out to have been caused by everyone's favourite broken stdlib module, mimetypes. I'm far from the first to be bitten by this module's strangeness – Jacob Rus has compiled a long list of reasons why the mimetypes module is pathologically broken, while Armin Ronacher recently got a 1000% speedup just by changing the way he imported things from the module (yes, 1000%).

So consider this another little heads-up about the mimetypes module: it doesn't play nice with threads. If two threads call mimetypes.guess_type at the same time, and the module happens to need to initialise its internal database, then one of the threads will go into an infinite recursive loop and blow your stack. What fun!

To be fair, the mimetypes module is slowly being converted into a healthy state, and this particular bug will be fixed in the next release. But in the meantime, if you need to do mimetype guesswork in Python, make sure you do it very carefully.
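One way to be careful, sketched here under a couple of assumptions: initialise the database explicitly with mimetypes.init() before any threads start, so no thread ever triggers the lazy initialisation, and funnel every lookup through a lock. The names safe_guess_type and _mimetypes_lock are hypothetical helpers for this example, not part of the stdlib, and this is not necessarily the fix that went into the module itself.

```python
import mimetypes
import threading

# Hypothetical module-level lock that serialises all lookups.
_mimetypes_lock = threading.Lock()

# Populate the internal database eagerly, at import time, while only
# one thread is running, so guess_type never has to initialise lazily.
mimetypes.init()


def safe_guess_type(url, strict=True):
    """Call mimetypes.guess_type with the lock held."""
    with _mimetypes_lock:
        return mimetypes.guess_type(url, strict)


def guess_in_threads(url, count):
    """Look up the same URL from `count` threads and collect the results."""
    results = []

    def worker():
        # list.append is safe to call from several threads in CPython.
        results.append(safe_guess_type(url))

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The lock is belt and braces: the up-front mimetypes.init() call alone should sidestep the initialisation race, but holding a lock costs very little for a lookup this cheap.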

Continue reading...