If you really wanted full tail-call behavior, you would have to either compile every function into a single mega-function whose body is a loop around an if/else tree that dispatches to whichever function is currently running, or use trampolines. Trampolines would additionally require either storing parameters in memory somewhere or giving every function the same type signature, since function calls are typed.
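To make the two options concrete, here is a minimal sketch in Python rather than in the compiler's actual target. It uses a mutually recursive `is_even`/`is_odd` pair as the example. All names (`run_merged`, `trampoline`, `tail_call`, `done`) are hypothetical and chosen for illustration; this is not the code my compiler emits. The trampoline version gives every function the same signature (one tuple of arguments in, one tagged tuple out), which is the "same type signature" workaround mentioned above.

```python
# Option 1: merge every function into one loop with an if/else dispatch tree.
IS_EVEN, IS_ODD = 0, 1  # hypothetical function ids

def run_merged(current, n):
    while True:
        if current == IS_EVEN:
            if n == 0:
                return True
            # tail call to is_odd(n - 1): switch function id and loop again
            current, n = IS_ODD, n - 1
        elif current == IS_ODD:
            if n == 0:
                return False
            # tail call to is_even(n - 1)
            current, n = IS_EVEN, n - 1
        else:
            raise ValueError("unknown function id")

# Option 2: a trampoline. Every function takes one argument tuple and
# returns either ("done", value) or ("call", next_function, next_args).
def done(value):
    return ("done", value)

def tail_call(fn, *args):
    return ("call", fn, args)

def is_even(args):
    (n,) = args
    return done(True) if n == 0 else tail_call(is_odd, n - 1)

def is_odd(args):
    (n,) = args
    return done(False) if n == 0 else tail_call(is_even, n - 1)

def trampoline(fn, *args):
    # The driver loop performs the calls, so the stack never grows.
    step = tail_call(fn, *args)
    while step[0] == "call":
        _, fn, args = step
        step = fn(args)
    return step[1]
```

Both `run_merged(IS_EVEN, 100000)` and `trampoline(is_even, 100000)` finish without deepening the call stack, whereas a direct mutually recursive version would exceed Python's default recursion limit at that depth. The cost shows up in the trampoline: each step allocates a tuple and goes through an indirect call, which is why an optimized trampoline would take real work.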
Overall, it's not a great situation for true tail-call elimination. For now, I've implemented limited tail-call elimination for single-function recursive calls, transforming them into an in-function loop. That has patched things up enough for me to keep working until I either need to come up with an optimized trampoline or the tail_call instruction finally gets standardized.
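As an illustration of that limited transformation, here is a minimal before/after sketch in Python. The function names are hypothetical and the real rewrite happens on the compiler's own representation, not on source code like this. The idea is simply that a call to the same function in tail position becomes "rebind the parameters and jump back to the top".

```python
# Before: self-recursive call in tail position.
def sum_to_recursive(n, acc=0):
    if n == 0:
        return acc
    return sum_to_recursive(n - 1, acc + n)

# After: the body is wrapped in a loop, and the tail call is replaced
# by reassigning the parameters and continuing.
def sum_to_loop(n, acc=0):
    while True:
        if n == 0:
            return acc
        # All new argument values are computed before any parameter is
        # overwritten, matching the evaluation order of a real call.
        n, acc = n - 1, acc + n
```

The loop version runs in constant stack space, so `sum_to_loop(1_000_000)` completes where the recursive version would hit Python's recursion limit. Calls to any other function are left as ordinary calls, which is why this only covers single-function recursion.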