There is an argument that today's compilers are finally good enough for VLIW to go mainstream, but good luck convincing anyone in today's market to go for it.
------
A big problem with VLIW is that it's impossible to predict L1, L2, L3 or DRAM access. Meaning all loads/stores are impossible to schedule by the compiler.
NVidia has interesting barriers that get compiled into its SASS (a level lower than PTX assembly). These barriers seem to allow the compiler to assist in the dependency management process but ultimately still require a decoder in the NVidia core final level before execution.