And for this you cannot use Python's multiprocessing because ... ? Sure, moving data between processes is slow because of pickling [0]. However, I'm using parallel processing for the things you suggested, and for these it works great.
If I really had the use case and needed threads, I'd much rather use C++ bindings in a Python package than rebuilding the whole thing. Guess it depends on the scale we are talking about.
[0] https://pythonspeed.com/articles/faster-multiprocessing-pick...