If you read the article, you can see that some code paths can invoke malloc, with all the follow-on effects that implies (kernel boundary crossings, for example), so the cost is quite random.
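To make the "some code paths allocate" point concrete, here's a rough sketch. It assumes CPython and its small-integer cache, which is an implementation detail and not a language guarantee, and the helper name `add_returns_cached_object` is made up for illustration. The same `a + b` either hands back a preallocated object or has to build a fresh one, and building one means asking the allocator for memory, which may or may not end up in the system malloc.

```python
def add_returns_cached_object(a, b):
    # Hypothetical helper: evaluate a + b twice and check whether both
    # results are the very same object. On CPython, small results come
    # from a preallocated cache; larger ones are built fresh each time.
    first = a + b
    second = a + b
    return first is second

# int("...") keeps the operands out of reach of compile-time constant folding.
small = add_returns_cached_object(int("1"), int("2"))        # sum is 3
large = add_returns_cached_object(int("1000000"), int("1"))  # sum is 1000001
print(small, large)
```

Same source line, same bytecode, but which path runs depends on the values, and that's the kind of thing you can't see by counting the lines of the fast path.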
Depends. I look at it from a performance standpoint when I start counting lines/instructions: not just the directly executed code, but also how feasible it would be to translate the thing to a JIT, for example. The amount is large enough that going to a JIT would yield little before major architectural fixes are made (this is why so many Python JITs have failed to gain enough performance, and hence traction).
Not only are there branches to a ton of special cases, but there are also macros that hide even more lines (IncRef/DecRef probably have a lot of magic behind them).
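You can't see the refcount macros from Python, but you can watch their effect. A small sketch, assuming CPython (`sys.getrefcount` is CPython-specific, and the function name `refcount_trace` is just for illustration): binding a second name bumps the count, and dropping it brings the count back down. On the C side the decrement also carries a zero check and, when the count hits zero, a call into the type's deallocator, so there's more behind it than a single subtraction.

```python
import sys

def refcount_trace():
    obj = object()
    before = sys.getrefcount(obj)
    alias = obj                      # second reference: count goes up by one
    during = sys.getrefcount(obj)
    del alias                        # reference dropped: count goes back down
    after = sys.getrefcount(obj)
    return before, during, after

before, during, after = refcount_trace()
# Compare deltas only; the absolute numbers include temporary references.
print(during - before, after - before)
```

Only the differences mean anything here, since the absolute values depend on how the interpreter holds temporaries while calling `getrefcount`.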