Parallelize I/O-bound tasks (HTTP requests, file reads, DB queries) without writing thread management code. The default thread count is min(32, os.cpu_count() + 4) — usually the right call.
Use processes (not threads) for CPU-bound work to sidestep the GIL. Functions must be picklable — define them at module level, not as locals or lambdas.