This is just wrong, though! LLVM provides stack slots for a good reason: stack-allocated variables can escape. You need a place in memory to stick them. If mem2reg can prove that a stack-allocated variable doesn't escape, it gets promoted into an SSA value.
The difference is that LLVM forces you to use them even when you know your locals do not escape (e.g. your source language is Pascal); QBE doesn't.
So, LLVM makes you use stack slots for two independent problems: 1. compiling languages like C where locals can escape, and 2. avoiding constructing SSA form in the frontend. In QBE, you use stack slots (alloc4, alloc8) to solve 1, but to solve 2 you can simply emit non-SSA form and QBE will fix things up for you.
That's exactly what LLVM does, though, except the "non-SSA form" involves loads/stores to alloca'd values. The difference is that in LLVM the language is always in SSA form; it just has some reads/writes to memory that can be pruned, while QBE alternates between being an SSA language and not being one.
LLVM also doesn't "make" you use stack slots. If you wanted to, you could emit fully pruned programs as LLVM IR. Using allocas for variables is a choice that clang makes.
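To make that concrete, here's a minimal LLVM IR sketch (function names are mine, typed-pointer syntax): the first function is the alloca/load/store form a frontend like clang emits, the second is what mem2reg promotes it to, and what a frontend could emit directly:

    ; Frontend-style form: the local lives in a stack slot.
    define i32 @square(i32 %x) {
    entry:
      %x.addr = alloca i32              ; stack slot for the local
      store i32 %x, i32* %x.addr
      %a = load i32, i32* %x.addr
      %b = load i32, i32* %x.addr
      %mul = mul i32 %a, %b
      ret i32 %mul
    }

    ; After mem2reg (or emitted directly): the local is an SSA value.
    define i32 @square_ssa(i32 %x) {
    entry:
      %mul = mul i32 %x, %x
      ret i32 %mul
    }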
> LLVM IL is more cluttered with type annotations and casts.
With LLVM moving to typeless pointers[1], this will be less true in the future than it is now.
[1] https://groups.google.com/forum/#!topic/llvm-dev/vphBegBWyyE
Nothing is stopping you from writing LLVM IR that doesn't use types, though! That's entirely a frontend decision. If you wanted to, in your frontend, you could never emit record or array types and do the bookkeeping on cells in memory entirely on your own using pointer arithmetic and ptrtoint/inttoptr. Doing so is entirely at the expense of the speed of the generated code, though!
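As a hedged illustration (the function name and field layout here are hypothetical), such a frontend might model a two-field record as a raw byte cell and compute field addresses itself:

    ; Hypothetical sketch: a record of two i32 fields modeled as raw bytes.
    ; The frontend does its own layout; field 1 sits at byte offset 4.
    define i32 @get_second_field(i8* %rec) {
    entry:
      %addr  = getelementptr i8, i8* %rec, i64 4   ; manual pointer arithmetic
      %field = bitcast i8* %addr to i32*
      %val   = load i32, i32* %field
      ret i32 %val
    }

The cost is real: the optimizer loses the aliasing and layout information that typed IR would have given it.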
> QBE aims to be a pure C embeddable backend that provides 70% of the performance of advanced compilers in 10% of the code.
This philosophy is very similar to that of vis [1], a vim-like editor that aims to have 80% of vim's functionality in 1% of the code. I hope that more such projects see the light of day; I'm very interested in simpler programs and simpler ecosystems, even if that comes at the cost of features.
In the second version we build with symmetry and regularity.
We need both versions, and in the second we drop some corner cases to make things crystalline. It is important to recognize the things we drop and why; sometimes they are important, but the purity of the construction takes precedence. And sometimes the corner cases are the thing; without them the chandelier would cast no light.
Anyway, I don't know where I am going with this. Small, playful, understandable systems are the most informative. We need playful, useful models that can be used for experimentation in a momentum-free way. And it is important not to suffocate the future with exhaustive engineering. Recognize the is.
It's always a compromise, subject to the design goals of the project or just personal preferences. There's a lot to read on the Internet about the specific compromises chosen for a technology, but contrasting choices between projects or versions of a project are difficult to find.
Also, I think large systems are just as informative as smaller ones but require more investment.
http://c9x.me/compile/doc/llvm.html
Especially the optimizations that get you the first 70%. I've been asking optimization people, as I see them, to do a survey of various methods to find the smallest collection that provides the biggest benefit. This would give people writing new compilers and the formal verification community a head start.
Now, back to correctness. Your project aims to be small, simple, and rigorous. That's the perfect time to use methods for higher-assurance software, or to design yours to work with them easily. So, I suggest trying Design-by-Contract w/ interface checks, some static analysis tools, or even something like Softbound+CETS that makes C safer automatically. I also usually recommend writing compilers in something like Ocaml, as it prevents lots of bugs and is easier to do.
Also, if you do Ocaml or safe imperative, you can always do two implementations side-by-side: one in the safe language and one in portable C. You run the tests, analyses, etc. available for each one. Problems in one might be problems in the other. Regardless, you get the benefits of the safer language by default, with a C implementation if you absolutely need it. Many of us have used this approach, with Sandia Labs even doing it for hardware, pairing ML with the hardware language of a Java processor that came out great.
So, I challenge you to push the envelope by using some of the available techniques and tools to boost the assurance of your code. Preferably getting it off C, too, given all the errors it (esp. pointers) introduces into compilers.
To be honest, C is not well suited to places where there is adversarial input, such as servers, but when it comes to logical correctness of code I do not see amazing benefits from languages like Ocaml.
I'll try to address this, though:
"but when it comes to logical correctness of code I do not see amazing benefits from languages like ocaml."
First, why Ocaml is good for compilers. Most of this is still true, and it's why the Rust team used it:
http://flint.cs.yale.edu/cs421/case-for-ml.html
What I can't overstate, already in that link, is how much easier it is to verify the properties of ML languages. They were originally designed for writing a theorem prover, IIRC. SML had a formal semantics for easier modeling. It allowed a functional programming style that avoids state-management issues while (esp. Ocaml) still letting you get your hands dirty where needed. Modifications for tracking information flow, dependent types, concurrency... all sorts of things were pretty straightforward. It had a certifying compiler early. Relevant to logical correctness, it's easier to map functional specifications to an ML than to a mess like C. This was proven when people doused Leroy et al with praise over the first verified compiler for C, because it really was that hard. Likewise, matching seL4's functional specs to its C kernel code took man-years of effort.
So, languages like C are really hard to get logically correct. Languages like SML/Ocaml make it pretty easy to get done right if you understand the problem; it's rarely the language causing your issues. A hybrid like a modified C or Modula that has ML-like advantages without C's disadvantages brings you closer to that ideal while preserving performance & control.
Notes: He sprinkles references to the method in various papers; Design Principles is one, I think. He says work like the COBOL compiler and RC 4000 was coded in assembler. Yet he and the compiler group also wrote it by hand in ALGOL to better analyze, structure, and extend these systems. So: an ALGOL form for understanding, and hand-written assembler for optimal execution.
Gypsy Verification Environment for high-assurance security - Overview and Example https://www.cs.rice.edu/~dwallach/courses/comp527_s99/gypsy-... ftp://ftp.cs.utexas.edu/pub/AI-Lab/tech-reports/UT-AI-TR-78-11.pdf
Notes: Gypsy was one of the more developed methods for high-assurance security back in the day. Used in the A1-class SCOMP system and more. The usage in A1 was to do two versions of the system in parallel: one in Gypsy for use with provers, info-flow analysis, and so on; one in the actual code that would run on the system. Problems in one could often tell you something about the other.
4GL combining BASIC, LISP, and C http://lost-in-triple-HD-failure :(
Notes: This was mine cuz I hated C and C++ but liked the industrial BASICs I started with. Made a dumb, but effective, translator from one BASIC to C and/or C++ (memory fuzzy there). It let me iterate rapidly & safely, then have portable C + safety checks automatically included when done. Started adding custom commands 4GL-style to eliminate boilerplate. Upon finding LISP, but not really learning it, I ported the tools so I basically coded a 4GL BASIC in LISP w/ per-function compilation, easier parsing, macros for easy 4GL stuff, and extraction to C or C++. Always two forms of the code in place: one for development/analysis, one for production. Crazy RAD pace w/ usually safe, efficient results. I really miss my tool, as I'm too brain-damaged to recreate it now. Nim has many of its attributes, though, so there's hope. :)
Adam Chlipala's Certified Programming book/work and Bedrock http://adam.chlipala.net/
Notes: This method is quite loaded. You can use a theorem prover and dependently-typed language to express your program in a way that catches all types of problems. It usually extracts to ML that you might hand-optimize. Alternatively, reminiscent of Hansen, you can do low-level coding in Coq via Bedrock, which is basically a safer assembly language. Microsoft's Coq ASM and another group's TALC typed ASM do similar stuff.
seL4 verification method http://concrete-semantics.org/
Notes: Someone told me the above book teaches the skills they used. Google AutoCorres for the C-to-HOL extraction part. Anyway, their method is to develop the separation kernel in Haskell and C simultaneously. The Haskell forms the abstract specification that easily feeds into HOL for analysis of basic function and security. The C is obviously the running code. They go further than Gypsy by using AutoCorres to extract HOL from the C, then proving the Haskell and C HOL specs equivalent. They might look for C-level problems too, but I don't recall.
Bootsafe - Develop Forth firmware in Java for safety http://slideshowes.com/doc/437272/efficient-code-certificati...
Notes: One of the military's funded projects to improve security, Open Firmware specifically. Much firmware, including OF, is in the flexible and fast but unsafe language Forth. The Java language is safer, easy to analyze, easy for the experienced to compile, and has no business in firmware w/ its big-ass runtime. :) BootSafe lets security engineers code and analyze firmware in Java, then extract it in a verified way to executable Forth. The online presentations are all partial and crappy, with this being the most detailed I found. It's a finished product from atcorp.com, though. Notice one could do this manually, too, w/ side-by-side development in C, C++, Java, or SPARK. Further evidence: many embedded MCU developers code in C or C++ with mockups of MCU assembler or interfaces, and claim much better results in maintenance & defect reduction.
Another old one of mine - safe or fast? choose two (no website link)
Notes: The idea was that many development or analysis tools should be implemented in as safe and sound a fashion as possible. However, that usually meant sloooow. Development, to maximize flow, needs to give the programmer immediate results if possible. So, for each tool or app, build two versions of it with logical equivalence: safe and fast. The fast one is used for day-to-day development. The safe one is used overnight, or if problems show up in the fast one that resist debugging (e.g. might be in the compiler/lib). Example combos included MLton with FLINT or CakeML compilers, a Racket dialect with PreScheme/VLISP, and CompCert C and/or Softbound+CETS with GCC/LLVM. Batches run overnight to create safe versions of any changes or commits. Batches also re-run the tests from the fast versions to ensure they match the safe versions.
Another of my proposals - C, Java, Ada/SPARK, and ML/Haskell (no website link)
Notes: There are many static, dynamic, and formal analysis tools for C that catch specific types of stuff. Same for Java, including basically all the great analyzers for concurrency errors. The best pushbutton analysis is in SPARK for Ada, which also has design-by-contract for interface checks. There are MLs with dependent types, easy covert-channel analysis, easier formal verification, and so on, with Haskell having similar tools. One of my concepts... for where it matters most... was implementing an algorithm or system in at least two of these simultaneously, with a max of four. In a lowest-common-denominator and consistent way, of course. Then apply at least two of the best tools for each language to the system. Any flaws found in one are checked for in the others. The result, from eyeballing to static analysis to auto-testing, will probably be the most thorough assurance you can get without being a formal verification expert. You can do that too, though. Not sure how practical it is; I think it would be easy with a 4-5 person team of specialists w/ good communication, and would probably yield interesting results.
Can you expand on your point about "LLVM does NOT provide full C compatibility for you"? I have specifically been hacking on calling conventions/ABI stuff recently in LLVM and this "well known" problem is news to me.
That code is not so trivial to do yourself actually.
Furthermore, the output does match the System V ABI: https://ghostbin.com/paste/4a5ms
Notice how the struct is placed on the stack and the pointer to it is placed in %rdi for call/return. If you reduce the number of i32's in the struct to 3, the struct's fields are passed via registers, since the type then occupies fewer than four eightbytes.
My older clang does indeed produce wonky LLVM IR that seems to try to do this classification in the frontend, so ABI compatibility may have been a problem in the past, but I'm not convinced it's a problem in current versions of LLVM.
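For anyone who can't load the paste, a hedged sketch (type and function names are mine, not the actual paste) of the shapes this IR takes under the SysV rules: a small aggregate coerced into integer eightbytes, versus a large one passed and returned through hidden stack copies:

    ; Hypothetical declarations illustrating the classification.
    %small = type { i32, i32, i32 }   ; 12 bytes, fits in eightbytes
    %big   = type { [8 x i32] }       ; 32 bytes, classified MEMORY

    ; Small aggregate: fields coerced into register-sized integers.
    declare void @takes_small(i64, i32)

    ; Large aggregate: passed as a hidden copy on the caller's stack.
    declare void @takes_big(%big* byval align 8)

    ; Large aggregate returned via hidden pointer (what lands in %rdi).
    declare void @returns_big(%big* sret)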
But the major difference is the IL: I use a human-readable and easily-printable text IL. This means that you don't need a graph-viewing tool to read the IL (it's just text) and that you can modify the IL between two passes super easily. This simple IL is a blessing when debugging a compiler.
I think QBE also has better support for the x64 ABI.
Finally, it is much less advanced (fewer optimizations, less tested) than libfirm and supports only x64 as a target.
This is a sketchy comparison.
Which to me reads as QBE being entirely differently scoped, and therefore not an alternative.
Assuming the upstream developers fail to deliver, it's up to us to fix
or work around compiler problems as we encounter them; sometimes it's as
easy as finding out which patch has been committed upstream but not
backported to the version we use; and sometimes it's a genuine issue
which may or may not have been reported in the latest compiler version,
and we are on our own. When this happens, we can only rely upon our
developer skills and intimacy with the compiler.
A few of our developers have, over the years, become unafraid of gcc,
and able to investigate issues, backport fixes, and fix or work around
bugs: I'll only mention niklas@, espie@, etoh@ and otto@, and hope the
few others will forgive me for not listing their names. This has not
been an easy road, to say the least. Now, another few of our developers
are working on building a similar knowledge of llvm. I wish them a lot
of luck, and I will try to join them in the near future.
In the meantime I am not sure they feel confident enough to support
switching the most popular OpenBSD platforms from gcc to llvm.
In a few months or years from now, things will be different...
...but there is something I wish would happen first.
An LTS release of an open source compiler.
Because all compilers nowadays are full of subtle bugs, so many of
them that you can't avoid them as soon as you compile any nontrivial
piece of code, and because we can't afford to go back to assembly, we
need a compiler we can trust.
GCC and LLVM both have Fortune 500 companies backing them, paying
smart developers to work full-time on these projects.
Yet none of them dares to provide a long-term support version. Bugs in
version N are fixed in version N+1, but new bugs are introduced. And
no one cares about trying to settle things down and produce a compiler
one can trust (because version N+1 runs 3.14% faster in the loonystones
benchmark, which doesn't match any real-life use case). Who cares?
Tomorrow's compiler will generate code which will complete an infinite
loop in less than 5 seconds; stay tuned for more accomplishments!
The free software world needs an LTS compiler. The last de facto LTS
compiler we had was gcc 2.7.2.1, and it is too old to compile
modern C and C++ code.
Should a free software LTS compiler appear (be it a gcc fork, or an llvm
fork, or something else), then OpenBSD would consider using it, very
seriously. And we probably wouldn't be the only free software project
doing so.
-- http://marc.info/?l=openbsd-misc&m=137530560232232

(Previous discussion of this post on HN: https://news.ycombinator.com/item?id=9322259)
I like that this project is in-line with these same goals.
How much harder would it be to call C++?
To link against code using things like the STL, your compiler would basically need to understand how to compile C++.
Same problem as macros in C, but macros are usually less pervasive in C code, meaning that often you get a good enough binding by ignoring them or filling in wrapper functions by hand.
(Compiling it with Cython would let you provide a C API and library with only libpython as a dep.)
No, a compiler (hell, any software) written in C is not correct, period. If you want a C compiler that has a slight chance of being correct, you use CompCert.
The small size is an awesome property for education (both for the writer and the reader). For this purpose, it's absolutely great, so kudos for that. :)
And also, we are seeing more and more certified C programs: see the DeepSpec NSF expedition grant, the Verified Software Toolchain, and the CertiKOS project for examples. I work with these guys.
Btw, how's progress coming on those projects? Specifically, are any of the tools (a) useful for non-experts with a little training via tutorials, etc., and (b) available for download in open-source or binary form yet? Thanks ahead of time.