Maybe it matters less now. You used to be able to assume a bootstrap from C, but that more or less died when the major compilers moved to C++, although you can still do a multi-stage bootstrap starting from the last GCC release written in C.
Basically, you start from the last C version, and every version is supposed to be able to compile the next one.
What's the bootstrap process for the C compiler part of the compiler?
When you set out to build a programming language, what is your objective? To create a sweet new optimizer? To create a sweet new assembler? A sweet new intermediate representation? AST? Of course not. You set out to change the way programmers tell computers what to do.
So why do they insist on duplicating: (1) An intermediate representation. (2) An optimizer. (3) An assembler. (4) A linker.
And they didn't innovate in any of those areas. All those problems were already solved by LLVM (and, to a more-painful-to-interact-with extent, GCC). So why solve them again?
It's like saying you want to build a new car to get from SF to LA and starting by building your own roads. Why would you not focus on what you bring to the table: A cool new [compiler] front-end language. Leave turning that into bits to someone who brings innovation to that space.
This is more of a genuine question.
To quote rsc from https://news.ycombinator.com/item?id=8817990:
"It's a small toolchain that we can keep in our heads and make arbitrary changes to, quickly and easily. Honestly, if we'd built on GCC or LLVM, we'd be moving so slowly I'd probably have left the project years ago."
"For example, no standard ABIs and toolchains supported segmented stacks; we had to build that, so it was going to be incompatible from day one. If step one had been "learn the GCC or LLVM toolchains well enough to add segmented stacks", I'm not sure we'd have gotten to step two."
Their own explanation for wasting hundreds of thousands of man-hours on a "quirky and flawed" separate compiler, linker, assembler, runtime, and tools is that they absolutely needed an implementation detail that is completely invisible to programs and which they are now replacing because it wasn't a good idea in the first place (segmented stacks). And it's worth writing out a 1000-word rationalization that doesn't even bother to mention the reason that implementation was necessary in the first place: to better run on 32-bit machines. In 2010.
Or they say that they had to reinvent the entire wheel, axle, cart, and horse so that five years later they could start working on a decent garbage collector. Never mind that five years later other people did the 'too hard and too slow' work on LLVM that a decent garbage collector needs. What foresight, that.
That's not sense, that's people rationalizing away wasting years of their time doing something foolish and unnecessary.
If you're building a new language, you need a new AST. You can't represent Go source code in a C++ AST.
There are alternate compilers for Go, in the form of gccgo and llgo. But those are both very slow to build (compared to the Go tree that takes ~30s to build the compiler, linker, assembler and standard library). And the "gc" Go compiler runs a lot faster than gccgo (though it doesn't produce code that's as good), and compilation speed is a big part of Go's value proposition.
For any non-Gophers reading this: I write Go as my primary language, and have for the past two and a half years. I just timed the respective compilation speeds on a handful of my larger projects using both gc and gccgo (and tested on a separate computer as well just for kicks).
gccgo was marginally slower, though not enough to be appreciable. In the case of two projects, gccgo was actually slightly faster. The Go compiler/linker/assembler/stdlib are probably larger and more complex than the most complex project on my local machine at the moment, but I think my projects are a reasonable barometer of what a typical Go programmer might expect to work with (as opposed to someone working on the Go language itself).
The more pressing issue as far as I'm concerned is that gccgo is on a different release schedule than gc (because it ships with the rest of the gcc collection). That's not to say it's not worth optimizing either compiler further when it comes to compilation speed, but it's important for people considering the two compilers to understand the scale we're talking about: less than a second for most of my projects. Literally, the time it takes you to type 'go build' is probably more significant.
I don't doubt for one second that llgo takes longer to compile. And in exchange for slower compile times you benefit from many PhDs' worth of optimizations in LLVM. And every single target architecture they support.
It's easy to build something faster when it does less. I'll admit there's no blanket right answer to that tradeoff.
For gcc you have to deal with MinGW. Isn't LLVM just now getting to the point where it can build native Windows applications?
This is one area where I hope Rust makes progress. MinGW/Msys2 is just kind of gross stuff to deal with.
This is a recent addition and did not exist at the time Go was created, however.
I don't think working within the capabilities of the VCS you're using should ever be a priority for a software development effort; rather, the VCS's priority should be to support most contexts of software development (the other way around).
(If you want to compile with a different compiler as a check, there's an LLVM-based compiler for Go.)
Could someone, kindly, explain how future versions would be built? Thanks!
So to answer your question, this new Go-written-in-Go compiler will initially be compiled by the Go-written-in-C compiler. The output from that will be an executable Go-written-in-Go compiler, and _that_ will be used to compile itself in the future. I.e., Go compiler version 1.4 will be used to compile Go 1.5, which will be used to compile Go 1.6, and so on.
Keep in mind that this is not at all unusual. The C compiler GCC has been compiled using older versions of GCC for a long time. Having a compiler compile itself is a sort of milestone that many languages aspire to as a way of showing that the language is "ready."
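To make the self-hosting milestone concrete, here is a toy model of the convergence check that multi-stage bootstraps (GCC's three-stage build, or an old Go compiler building a new one) rely on. The "compilers" below are just file copies and every path and stage name is invented; the point is only the shape of the check: if the compiler is deterministic, the binary built by the old compiler and the binary that binary then builds of itself should be byte-identical.

```shell
# Toy bootstrap: a "compiler" here is a deterministic transform of its
# source, modeled as a plain file copy. All names are stand-ins.
set -e
dir=$(mktemp -d)
echo 'new-compiler-source-v2' > "$dir/source"

compile() {  # $1 = compiler doing the compiling, $2 = output binary
  # Toy rule: a correct compiler's output depends only on the source text.
  cp "$dir/source" "$2"
}

echo 'old-binary-v1' > "$dir/stage1"   # pre-existing binary (think: Go 1.4)
compile "$dir/stage1" "$dir/stage2"    # old compiler builds the new one
compile "$dir/stage2" "$dir/stage3"    # the new compiler rebuilds itself

# GCC's bootstrap performs this comparison for real between its stage2 and
# stage3 object files; a mismatch fails the build.
cmp "$dir/stage2" "$dir/stage3" && echo "bootstrap converged"
```

A real bootstrap differs in the obvious way (the transform is actual compilation, and stage2 vs. stage3 can legitimately differ if the new compiler optimizes differently than the old one built it), but the fixed-point idea is the same.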
[1] https://en.wikipedia.org/wiki/Self-hosting
[2] http://blog.llvm.org/2010/02/clang-successfully-self-hosts.h...
Ok, we figured out what happened. A background process that is upgrading old stories to a new data format went rogue and made multiple copies of a few stories in memory. Apparently it agrees with some of you that HN could use more stories about Go.
Sorry for the error.
Edit: Nope. All comments show on both.
[1] http://dtrace.org/blogs/wesolows/2014/12/29/golang-is-trash/
I'm a little surprised you brought that post up to begin with. It completely misses the point, as I explained in my comment here at the time (https://news.ycombinator.com/item?id=8817990). When I wrote that response I also submitted a comment on the blog itself with a link to the HN comment. That blog comment has not yet been published. If you're going to keep sending around links to such an inflammatory blog post, could you also try to get my comment there approved?
Thanks.
SP, PC, and FP are virtual registers from the assembler's point of view. On _some_ architectures those names correspond to real registers, like RSP on x86-64, but on others they are just conventions.
I don't think Keith's rage quit has had a measurable impact on the direction of Go or its toolchain.
It can be bootstrapped from source - it just needs to be bootstrapped either using gccgo[0], or using the 1.4 compiler (which is guaranteed to work for all 1.x compilers, not just 1.5)
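For the curious, the 1.4-based bootstrap boils down to a couple of commands. This is a sketch, not a transcript: the Go 1.4 location below is a hypothetical path, but GOROOT_BOOTSTRAP is the actual environment variable the 1.5 build scripts consult.

```shell
# Assumes a pre-built Go 1.4 toolchain at $HOME/go1.4 (hypothetical path).
export GOROOT_BOOTSTRAP=$HOME/go1.4

cd go/src        # a checkout of the Go 1.5 source tree
./make.bash      # 1.4 compiles the 1.5 toolchain, which then rebuilds itself
```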
> Another magic binary to trust not to have a Thompson virus.
"Reflections on Trusting Trust" gets posted on HN regularly, and it's an interesting exercise, but you are far more likely to have an exploit hiding in plain sight in a compiler compiled from source once than you are to have one that only appears after multiple iterated compilations.
It's a good concept for security experts and compiler developers to be aware of, but the likelihood is incredibly small.
Also, for what it's worth, "Trusting Trust" is over three decades old, and there have been numerous responses to it in the interim, with lots of study. It's like saying "Your problem reduces to 3-SAT, and satisfiability is NP-hard, so you can't solve it", throwing your hands up, and leaving it at that. In reality, solving 3-SAT in the general case is NP-hard, but it is well-studied enough that, in practice, solving SAT/3-SAT instances is actually pretty easy most of the time. Some of these responses have even been posted elsewhere in this thread, though they're also pretty easy to find online.
[0] which is written in C++ - frankly, I'd be much more concerned about a single-compilation bug in any C++ code than I'd be about a multiple-compilation bug in Go.
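The best-known of those responses is David A. Wheeler's "diverse double-compiling" (DDC): build the suspect compiler's source with two unrelated toolchains and compare what comes out. Below is a toy sketch of just the comparison step. The "compilers" are shell functions I made up; a real DDC run uses real toolchains and a functional-equivalence argument, not file copies.

```shell
# Toy DDC: two independent "toolchains" compile the same suspect source.
set -e
dir=$(mktemp -d)
echo 'suspect-compiler source v2' > "$dir/src"

# Toolchain A is honest: its output is a pure function of the source.
compile_A() { cp "$dir/src" "$1"; }
# Toolchain B carries a self-propagating trojan: it smuggles a backdoor
# into anything that looks like compiler source.
compile_B() { { cat "$dir/src"; echo 'BACKDOOR'; } > "$1"; }

compile_A "$dir/gen2_A"   # second-generation binary via lineage A
compile_B "$dir/gen2_B"   # second-generation binary via lineage B

# DDC's verdict: honest deterministic toolchains must produce byte-identical
# second-generation binaries; any divergence means at least one of them is
# not compiling what the source says.
if cmp -s "$dir/gen2_A" "$dir/gen2_B"; then
  echo "binaries agree"
else
  echo "binaries diverge: at least one toolchain is lying"
fi
```

The detection power comes entirely from the diversity of the two toolchains, which is why the point below about limited compiler diversity for Go matters.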
Though the compiler diversity available for a Go compiler written in Go isn't exactly tremendous.