However, Python (more precisely CPython, the standard implementation) has an added issue: There's a Global Interpreter Lock that prevents two threads in the same process from running Python code at the same time. This means that if you have 8 cores, and change your code to use 8 threads, it won't be able to use 800% CPU and run 8x faster; it'll use the same 100% CPU and run at the same speed. (In reality, it'll run a little slower, because there's extra overhead from threading, even if you don't have any shared data, but ignore that for now.)
There are exceptions to this. If your code's heavy computation doesn't actually happen in Python, but in some library with custom C code that does proper GIL handling, like a numpy app, you will get the expected performance benefit from threading. The same is true if the heavy computation is done by some subprocess that you run and wait on.
More importantly, there are cases where this doesn't matter. For example, a network server spends most of its time reading packets off the network, and a GUI app spends most of its time waiting for user events. One reason to use threads in a network server or GUI app is to allow you to do long-running "background tasks" without stopping the main thread from continuing to service network packets or GUI events. And that works just fine with Python threads. (In technical terms, this means Python threads give you concurrency, even though they don't give you core-parallelism.)
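As a minimal sketch of that "background task" pattern (the names slow_background_task and serve_while_working are made up for illustration, and time.sleep stands in for real blocking work), the main thread below keeps handling pretend events while a worker thread is busy:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_background_task(seconds):
    """Hypothetical long-running task; sleep stands in for blocking I/O."""
    time.sleep(seconds)  # a blocking wait releases the GIL
    return "done"


def serve_while_working(task_seconds=0.3, tick=0.05):
    """Start one background task, then keep 'handling events' until it finishes."""
    handled = 0
    with ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(slow_background_task, task_seconds)
        while not future.done():
            handled += 1      # pretend to service one packet or GUI event
            time.sleep(tick)  # pretend to wait for the next one
        result = future.result()
    return handled, result


if __name__ == "__main__":
    handled, result = serve_while_working()
    print(f"handled {handled} events while the background task ran: {result}")
```

The main loop never blocks on the task itself, which is the concurrency (not parallelism) that Python threads do give you.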
But if you're writing a CPU-bound program in pure Python, using more threads is generally not helpful.
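If you want to see this for yourself, here is a rough sketch, assuming a standard CPython build with the GIL. The function count_primes is just a made-up CPU-bound job in pure Python; the script runs it four times serially and then on four threads, and prints both timings. The exact numbers depend on your machine, so none are claimed here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound pure-Python work: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_serial(limits):
    """Run every job one after another in the calling thread."""
    return [count_primes(limit) for limit in limits]


def run_threaded(limits):
    """Run one job per thread; the GIL still lets only one run Python code at a time."""
    with ThreadPoolExecutor(max_workers=len(limits)) as executor:
        return list(executor.map(count_primes, limits))


def timed(func):
    """Return (result, elapsed_seconds) for a zero-argument callable."""
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start


if __name__ == "__main__":
    limits = [50_000] * 4
    serial, t_serial = timed(lambda: run_serial(limits))
    threaded, t_threaded = timed(lambda: run_threaded(limits))
    assert serial == threaded  # same answers either way
    print(f"serial:   {t_serial:.2f} s")
    print(f"threaded: {t_threaded:.2f} s")
```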
Using separate processes has no such problems with the GIL, because each process has its own separate GIL. Of course you still have all the same tradeoffs between threads and processes as in any other language: it's more difficult and more expensive to share data between processes than between threads, it can be costly to run a huge number of processes or to create and destroy them frequently, etc. But the GIL weighs heavily on the balance toward processes, in a way that isn't true for, say, C or Java. So, you will find yourself using multiprocessing a lot more often in Python than you would in C or Java.
Meanwhile, Python's "batteries included" philosophy brings some good news: It's very easy to write code that can be switched back and forth between threads and processes with a one-liner change.
If you design your code in terms of self-contained "jobs" that don't share anything with other jobs (or the main program) except input and output, you can use the concurrent.futures library to write your code around a thread pool like this:
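with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    executor.submit(job, argument)  # job and argument are placeholders for your own function and its input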
You can even get the results of those jobs and pass them on to further jobs, wait for things in order of execution or in order of completion, etc.; read the section on Future objects for details.
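A slightly fuller sketch of what that can look like, with a made-up job function standing in for your real work: map gives results in input order, submit plus as_completed gives them in completion order, and one job's output can be fed into another.

```python
import concurrent.futures


def job(n):
    """Hypothetical self-contained job: takes an input, returns an output."""
    return n * n


def run_jobs(numbers):
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
        # map() yields results in the same order as the inputs.
        in_order = list(executor.map(job, numbers))

        # submit() returns one Future per job...
        futures = [executor.submit(job, n) for n in numbers]
        # ...and as_completed() yields each Future as soon as it finishes.
        as_done = [f.result() for f in concurrent.futures.as_completed(futures)]

        # Results of earlier jobs can be passed on to further jobs.
        total = executor.submit(sum, in_order).result()
    return in_order, as_done, total


if __name__ == "__main__":
    in_order, as_done, total = run_jobs([1, 2, 3])
    print(in_order, sorted(as_done), total)
```

The order of as_done is not guaranteed, which is why the demo sorts it before printing.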
Now, if it turns out that your program is constantly using 100% CPU, and adding more threads just makes it slower, then you're running into the GIL problem, so you need to switch to processes. All you have to do is change that first line:
with concurrent.futures.ProcessPoolExecutor(max_workers=4) as executor:
The only real caveat is that your jobs' arguments and return values have to be pickleable (and not take too much time or memory to pickle) to be usable cross-process. Usually this isn't a problem, but sometimes it is.
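To make both points concrete, here is a sketch (run_jobs, job, and is_picklable are illustrative names, not part of any library) where the executor class is just a parameter, plus a quick way to check whether a value would survive the trip between processes:

```python
import concurrent.futures
import pickle
import threading


def job(n):
    """Hypothetical job; defined at module level so worker processes can find it."""
    return n * n


def run_jobs(executor_class, numbers):
    """Run job() over numbers with whichever executor class is passed in."""
    with executor_class(max_workers=4) as executor:
        return list(executor.map(job, numbers))


def is_picklable(obj):
    """Return True if obj can be pickled, i.e. sent to another process."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


if __name__ == "__main__":
    numbers = [1, 2, 3]
    # Same code, two executors: only the class changes.
    print(run_jobs(concurrent.futures.ThreadPoolExecutor, numbers))
    print(run_jobs(concurrent.futures.ProcessPoolExecutor, numbers))
    print(is_picklable(numbers))           # plain data
    print(is_picklable(threading.Lock()))  # a lock cannot be pickled
    print(is_picklable(lambda x: x))       # neither can a lambda
```

The `if __name__ == "__main__":` guard matters for the process version: on platforms that start workers by re-importing your script, unguarded pool creation would run again in every worker.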
But what if your jobs can't be self-contained? If you can design your code in terms of jobs that pass messages from one to another, it's still pretty easy. You may have to use threading.Thread or multiprocessing.Process instead of relying on pools. And you will have to create queue.Queue or multiprocessing.Queue objects explicitly. (There are plenty of other options, such as pipes, sockets, and files with flocks, but the point is, you have to do something manually if the automatic magic of an Executor is insufficient.)
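A minimal sketch of the message-passing style with threads, assuming a simple two-stage pipeline (producer, consumer, and the _STOP sentinel are all invented for this example):

```python
import queue
import threading

_STOP = object()  # sentinel meaning "no more messages"


def producer(out_queue, items):
    """Send each item downstream, then signal the end of the stream."""
    for item in items:
        out_queue.put(item)
    out_queue.put(_STOP)


def consumer(in_queue, out_queue):
    """Double each incoming item and pass it on until the sentinel arrives."""
    while True:
        item = in_queue.get()
        if item is _STOP:
            out_queue.put(_STOP)
            break
        out_queue.put(item * 2)


def run_pipeline(items):
    to_consumer = queue.Queue()
    to_main = queue.Queue()
    threads = [
        threading.Thread(target=producer, args=(to_consumer, items)),
        threading.Thread(target=consumer, args=(to_consumer, to_main)),
    ]
    for t in threads:
        t.start()

    # The main thread only ever sees data that arrives on a queue.
    results = []
    while True:
        item = to_main.get()
        if item is _STOP:
            break
        results.append(item)

    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3]))
```

The same shape should carry over to multiprocessing.Process and multiprocessing.Queue, with one change: the sentinel has to be something that survives pickling, such as None, because an `object()` instance comes out of the queue as a different object in the other process.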
But what if you can't even rely on message passing? What if you need two jobs to both mutate the same structure, and see each other's changes? In that case, you will need to do manual synchronization (locks, semaphores, conditions, etc.) and, if you want to use processes, explicit shared-memory objects to boot. This is when multithreading (or multiprocessing) gets difficult. If you can avoid it, great; if you can't, you will need to read more than someone can put into an SO answer.
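Just to show the simplest possible case of manual synchronization (this barely scratches the surface), here is a sketch of several threads mutating one shared object under a threading.Lock. The Counter class and hammer function are hypothetical names for illustration:

```python
import threading


class Counter:
    """Shared mutable state; the lock makes each increment atomic."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:  # only one thread at a time may run this block
            self.value += 1


def hammer(counter, n_threads=4, increments=10_000):
    """Have n_threads threads each increment the same counter many times."""

    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(hammer(Counter()))
```

Without the lock, `self.value += 1` is a read-modify-write sequence that two threads can interleave, so updates may be lost.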
From a comment, you wanted to know what's different between threads and processes in Python. Really, if you read Giulio Franco's answer and mine and all of our links, that should cover everything, but a summary would definitely be useful, so here goes:
1. Threads share memory by default; processes do not.
2. As a consequence of (1), sending data between processes generally requires pickling and unpickling it.
3. As another consequence of (1), directly sharing data between processes generally requires putting it into low-level formats like Value, Array, and ctypes types.
4. Processes are not subject to the GIL.
5. There are some extra restrictions on processes, some of which are different on different platforms. See Programming guidelines for details.
6. The threading module doesn't have some of the features of the multiprocessing module. (You can use multiprocessing.dummy to get most of the missing API on top of threads, or you can use higher-level modules like concurrent.futures and not worry about it.)
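As a rough illustration of the point about Value, here is a sketch of several processes updating one shared integer. The function names add_many and count_across_processes are made up; multiprocessing.Value and its get_lock() method are the real API.

```python
import multiprocessing


def add_many(counter, times):
    """Increment a shared multiprocessing.Value, holding its lock each time."""
    for _ in range(times):
        with counter.get_lock():
            counter.value += 1


def count_across_processes(n_processes, times):
    """Start n_processes workers that all update the same shared integer."""
    counter = multiprocessing.Value("i", 0)  # "i" = C int living in shared memory
    workers = [
        multiprocessing.Process(target=add_many, args=(counter, times))
        for _ in range(n_processes)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter.value


if __name__ == "__main__":
    print(count_across_processes(4, 1000))
```

An ordinary Python int passed to each process would simply be copied, and the parent would never see the children's updates.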
thanks, but I am not sure I understood everything. Anyway, I am trying to do it partly for learning purposes, and partly because a naive use of threads halved the running time of my code (starting more than 1000 threads at the same time, each calling an external app; this saturates the CPU, yet there is a 2x increase in speed). I think managing the threads smartly might really improve the speed of my code.
@LucaCerone: Ah, if your code spends most of its time waiting on external programs, then yes, it will benefit from threading. Good point. Let me edit the answer to explain that.
@LucaCerone: Meanwhile, what parts do you not understand? Without knowing the level of knowledge you're starting with, it's hard to write a good answer, but with some feedback, maybe we can come up with something that's helpful to you and to future readers as well.
@LucaCerone You should read the PEP for multiprocessing (PEP 371). It gives timings and examples of threads vs multiprocessing.
@abarnert I have never studied or implemented multi-threading or multi-processing code, and while I am experienced with Python, there is much room for improvement :) I tried to use multiprocessing.Pool and the Pool.map method, but I ran into the non-picklable issue (the function I want to apply to a list is a bound method; I have tried several variations and read several discussions here on SO, but couldn't entirely understand how to make it work).