As a brief demonstration I can write:
foo(a, b, c) = (a + b) * c
When I call it on integers, the compiler emits only the necessary integer instructions; when I call it on floats, only the necessary floating-point instructions; and when I broadcast it across vectors, it emits SSE (SIMD) instructions. Only when the compiler can't prove the incoming types does it emit any sort of dynamic type code. The calling function can be ignorant of the types too, and so on up the call chain, until a user decides to pass in an integer or a float, at which point all of the code is specialized to be as fast as possible.
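One way to see this specialization for yourself is to ask Julia for the LLVM IR it generates per argument type. The sketch below reuses foo from above; the variable names are only illustrative, and it inspects LLVM IR rather than the final assembly (code_native shows the latter).

```julia
using InteractiveUtils  # provides code_llvm; loaded automatically in the REPL

foo(a, b, c) = (a + b) * c

# Same source, three different call patterns.
int_result   = foo(1, 2, 3)
float_result = foo(1.0, 2.0, 3.0)
vec_result   = foo.([1, 2], [3, 4], [5, 6])  # dot syntax broadcasts elementwise

# Capture the LLVM IR for each specialization as a string.
int_ir   = sprint(code_llvm, foo, Tuple{Int64, Int64, Int64})
float_ir = sprint(code_llvm, foo, Tuple{Float64, Float64, Float64})

println(int_ir)    # integer add/mul only
println(float_ir)  # floating-point fadd/fmul only
```

Asking for the IR with Tuple{Any, Any, Any} instead shows the fallback case, where the compiler cannot prove the types and has to emit dynamic calls.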
As I’ve learned the language it’s become pretty easy to avoid those pitfalls even on initial implementations. That said, providing types in function signatures is still very useful for multiple dispatch and providing a more usable API in libraries.
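As a small illustration of type annotations used for dispatch rather than speed, here is a sketch with a hypothetical function describe (not from any library). Julia picks the most specific method that matches the argument's type.

```julia
# Hypothetical example: one function name, three methods selected by argument type.
describe(x::Integer)       = "integer"
describe(x::AbstractFloat) = "float"
describe(x)                = "something else"  # untyped fallback

println(describe(1))     # dispatches to the Integer method
println(describe(1.5))   # dispatches to the AbstractFloat method
println(describe("hi"))  # falls through to the generic method
```

The annotations here do not make any individual method faster; they only decide which method runs and document what the API accepts.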
However, as soon as you try to do something a bit more complicated, you’ll notice the speed and flexibility differences.
I prototyped a quick Julia implementation of a simple GLM (almost identical code in Julia and R), and the Julia code was approximately 10-20 times faster, depending on the model.
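For a sense of what such a prototype might look like, here is a minimal sketch of a logistic-regression GLM fitted by iteratively reweighted least squares. This is not the code that was benchmarked; the function name fit_logistic, its keyword arguments, and the toy data are all assumptions made for illustration.

```julia
using LinearAlgebra

# Hypothetical sketch: logistic regression fitted by IRLS.
# X is an n-by-p design matrix, y is a vector of 0/1 responses.
function fit_logistic(X, y; iters=25, tol=1e-8)
    β = zeros(size(X, 2))
    for _ in 1:iters
        η = X * β                        # linear predictor
        μ = 1 ./ (1 .+ exp.(-η))         # inverse logit link
        w = μ .* (1 .- μ)                # working weights
        z = η .+ (y .- μ) ./ w           # working response
        β_new = (X' * (w .* X)) \ (X' * (w .* z))  # weighted least squares step
        if norm(β_new - β) < tol
            return β_new
        end
        β = β_new
    end
    return β
end

# Toy data (not separable, so the estimate is finite).
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [0.0, 0.0, 1.0, 0.0, 1.0]
X = hcat(ones(length(x)), x)             # intercept column plus one predictor
β̂ = fit_logistic(X, y)
println(β̂)
```

The loop reads much like the equivalent R code would, which is consistent with the "almost identical code" remark above; any speed difference would come from how each language compiles and runs it.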
This is definitely worth looking at (mind you, the cost of redeveloping our code in Julia is probably prohibitive). That being said, this would encourage me to call out to Julia from R for some of my more computationally heavy workloads.
For example, a straightforward Python-to-LLVM compiler would generate code in which every variable is a PyObject (https://docs.python.org/3/c-api/structures.html) instance, with “switch(obj.ob_type)” equivalents that would require a “sufficiently advanced compiler” to reach speed equivalent to, say, C.
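To make the cost concrete, the sketch below models that boxed-object style in Julia. The Boxed struct and boxed_add function are hypothetical stand-ins, not CPython's actual layout or API: every value carries a runtime tag, and every operation has to branch on it before doing any arithmetic.

```julia
# Hypothetical stand-in for a boxed object: a type tag plus an untyped payload.
struct Boxed
    tag::Symbol
    payload::Any
end

# Every operation must inspect the tags at run time before it can do real work.
function boxed_add(a::Boxed, b::Boxed)
    if a.tag === :int && b.tag === :int
        return Boxed(:int, (a.payload::Int) + (b.payload::Int))
    elseif a.tag === :float && b.tag === :float
        return Boxed(:float, (a.payload::Float64) + (b.payload::Float64))
    else
        error("unsupported operand tags: $(a.tag), $(b.tag)")
    end
end

r = boxed_add(Boxed(:int, 1), Boxed(:int, 2))
println(r.payload)
```

A compiler that can prove both operands are integers can drop the tags and branches entirely and emit a single add, which is the gap the “sufficiently advanced compiler” would have to close.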