Just a quick rant about threading. After working with Java for so long, I’d gotten used to the idea that a thread is an independent entity, which can work for you without slowing down the main body of your program. In Python, that’s really not the case.
In Python, a thread shares memory with the main “thread” of your code, and the interpreter’s Global Interpreter Lock (the GIL) means only one thread can execute Python bytecode at any given moment – so the sharing doesn’t buy you parallelism. In effect, you’re stuck with every line of code running on the same core, one at a time, with all your threads taking turns passing lines of code to the CPU to be run. (Or, as a mental image, that’s how it’s working; the reality is a bit more subtle.)
Unfortunately, that means Python threads don’t really speed up CPU-bound code, and if there are enough of them, the overhead of switching between them can slow it down significantly.
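A tiny sketch of what I mean (the busy-loop workload and function names here are made up for illustration): the threads below all finish correctly, but because of the GIL they take turns on one core, so four of them take roughly as long as running the same work serially.

```python
# Illustration: CPU-bound work split across Python threads does not run in
# parallel, because only one thread can hold the GIL at a time.
import threading

def count_down(n, results, idx):
    """Pure-CPU busy loop; holds the GIL while it runs."""
    while n > 0:
        n -= 1
    results[idx] = "done"

def run_threads(n_threads=4, work=500_000):
    results = [None] * n_threads
    threads = [
        threading.Thread(target=count_down, args=(work, results, i))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Every thread completes, but the total wall time is about the same as
    # doing all the work in a single thread.
    return results
```

If you time `run_threads(4)` against one thread doing four times the work, the numbers come out nearly identical – which is exactly the disappointment described above.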
The solution turned out to be Python’s “multiprocessing” module, which lets you spawn processes instead of threads. Each process can run on a different CPU (if you have enough cores…), but shares no memory with the main process or thread of your code. So you have to work out a system of thread-safe (process-safe?) queues: each process dumps its results into a queue, and other threads or processes pick that information up and handle it independently. The workers consume the work in parallel, giving you a speedup in the running time (wall time) of your code.
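The queue pattern described above can be sketched like this – a minimal example using `multiprocessing.Process` and a process-safe `Queue` (the worker function and the squaring workload are stand-ins, not my actual code):

```python
# Sketch of the producer/worker pattern with multiprocessing: workers pull
# items from an input queue and push results to an output queue.
from multiprocessing import Process, Queue

def worker(in_q, out_q):
    """Pull items until a None sentinel arrives, square them, push results."""
    while True:
        item = in_q.get()
        if item is None:  # sentinel: no more work for this worker
            break
        out_q.put(item * item)

def run(n_workers=4, n_items=20):
    in_q, out_q = Queue(), Queue()
    procs = [Process(target=worker, args=(in_q, out_q)) for _ in range(n_workers)]
    for p in procs:
        p.start()
    for i in range(n_items):          # dump the work into the buffer
        in_q.put(i)
    for _ in procs:                   # one sentinel per worker
        in_q.put(None)
    results = [out_q.get() for _ in range(n_items)]
    for p in procs:
        p.join()
    return sorted(results)            # workers finish in arbitrary order

if __name__ == "__main__":
    print(run())
```

Because the processes share no memory, everything passed through the queues gets pickled; that’s the price of getting each worker onto its own core.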
All in all, it’s actually a relatively elegant system, not much different from many other languages, with the exception of the terminology – processes vs. threads. Python got the terms right; it just took me a while to figure out that I was the one using them incorrectly.
At any rate, without any other optimization, multiprocessing with 30 processes brings the wall time of my code down from about 3 hours to about 15 minutes. It’s almost time to start looking into C code optimization… Almost. (-;