Go doesn't do SIMD at all (see note 1). Personally I leverage Go coupled with the Intel Compiler (Go happily links with and uses very high performance C-built libraries, where I'm rocking out with SSE3 / AVX / AVX2).
To respond to something that Ptacek said above, many of us do expect Go to achieve C-level performance eventually. There is nothing stopping the Go compiler from using SIMD and automatic vectorization; it just doesn't yet. Nothing about the language prohibits a very high level of optimization, and indeed the language is generally sparse in a manner that allows for those optimizations.
*1 - For performance-critical code you are supposed to use gccgo, which uses the same intermediate representation as the C compiler, allowing it to do all of the vectorization and the like. Unfortunately, for this specific code gccgo generates terrible code, yielding a runtime that is orders of magnitude slower (albeit absolutely tiny). I haven't looked into why that is.