
wahern 3y ago

> that compilers do not do well with this pattern

As compared to hand-written assembly or the tailcall technique you describe. But (for the benefit of onlookers) a threaded switch, especially using (switch-like) computed gotos, is still more performant than a traditional function dispatch table.
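For onlookers who haven't seen the pattern, here is a minimal sketch of threaded dispatch with computed gotos. The toy stack machine, its opcode names, and `run_threaded` are made up for illustration; the only real feature involved is the GCC/Clang labels-as-values extension (`&&label` and `goto *ptr`).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy stack machine. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Runs `code` and returns the top of the stack when OP_HALT is reached. */
long run_threaded(const unsigned char *code)
{
    /* GCC/Clang extension: &&label yields the address of a label. */
    static const void *const dispatch[] = {
        [OP_PUSH] = &&do_push,
        [OP_ADD]  = &&do_add,
        [OP_MUL]  = &&do_mul,
        [OP_HALT] = &&do_halt,
    };
    long stack[64];
    size_t sp = 0;
    const unsigned char *ip = code;

/* Every handler ends with its own indirect jump instead of
 * returning to one shared dispatch point. */
#define NEXT() goto *dispatch[*ip++]

    NEXT();

do_push:
    assert(sp < 64);
    stack[sp++] = *ip++;          /* operand is the byte after the opcode */
    NEXT();
do_add:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    NEXT();
do_mul:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] *= stack[sp];
    NEXT();
do_halt:
    assert(sp >= 1);
    return stack[sp - 1];
#undef NEXT
}
```

Feeding it the bytes `OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT` computes (2 + 3) * 4. The usual argument for this layout is that each handler gets its own indirect branch; how much that helps depends on the compiler and CPU.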

Has there been any movement in GCC wrt the tailcalls feature?

One of the limitations of computed gotos is the inability to derive the address of a label from outside the function. You always end up with some amount of superfluous conditional code for selecting the address inside the function, or for indexing through a table. Several years ago, when exploring this space, I discovered a hack, though it only works with GCC (IIRC), or at least it did as of ~10 years ago. GCC supports inline function definitions; inline functions have visibility to goto labels (notwithstanding that you're not supposed to make use of them); and, most surprisingly, GCC also supports attaching __attribute__((constructor)) to inline function definitions. This means you can export a map of goto labels that can be used to initialize VM data structures, permitting (in theory) more efficient direct threading.
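To make the limitation concrete, here is a sketch of the conventional workaround, not the constructor hack described above: the interpreter function doubles as a label exporter when called with a NULL program, which is exactly the kind of extra branch being complained about. All names (`dt_vm`, `dt_run_example`, the `DT_*` opcodes) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { DT_PUSH, DT_ADD, DT_HALT, DT_COUNT };

/* With code == NULL, returns the table of label addresses.
 * Otherwise executes direct-threaded code and stores the result.
 * GCC's docs suggest noinline (and noclone) when label addresses
 * must stay stable across calls. */
__attribute__((noinline))
const void *const *dt_vm(const void *const *code, long *result)
{
    static const void *const labels[DT_COUNT] = {
        [DT_PUSH] = &&do_push,
        [DT_ADD]  = &&do_add,
        [DT_HALT] = &&do_halt,
    };
    long stack[64];
    size_t sp = 0;
    const void *const *ip = code;

    if (code == NULL)             /* the "superfluous" selection branch */
        return labels;

    goto **ip++;                  /* each cell is already a label address */

do_push:
    assert(sp < 64);
    stack[sp++] = (long)(intptr_t)*ip++;   /* operand stored in the next cell */
    goto **ip++;
do_add:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    goto **ip++;
do_halt:
    assert(sp >= 1);
    *result = stack[sp - 1];
    return NULL;
}

/* Builds a tiny program out of exported label addresses and runs it. */
long dt_run_example(void)
{
    const void *const *labels = dt_vm(NULL, NULL);
    const void *code[] = {
        labels[DT_PUSH], (void *)(intptr_t)40,
        labels[DT_PUSH], (void *)(intptr_t)2,
        labels[DT_ADD],
        labels[DT_HALT],
    };
    long result = 0;
    dt_vm(code, &result);
    return result;
}
```

Once the program has been translated into label addresses there is no table lookup per instruction, but the translation step still has to call into the function to learn the addresses.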

The tailcall technique is a much more sane and profitable approach, of course.
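For comparison, a minimal sketch of the tailcall style, assuming a compiler with the `musttail` statement attribute (Clang has it; the macro below falls back to a plain call elsewhere, in which case the tail call is no longer guaranteed). The VM layout and the `tc_*` names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Use musttail where the compiler advertises it; otherwise rely on
 * ordinary (unguaranteed) tail-call optimization. */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

struct tc_vm { long stack[64]; size_t sp; };

/* Every handler shares one signature so the calls can be tail calls. */
typedef long tc_handler(struct tc_vm *vm, const unsigned char *ip);

enum { TC_PUSH, TC_ADD, TC_HALT };

static tc_handler tc_push, tc_add, tc_halt;

static tc_handler *const tc_table[] = {
    [TC_PUSH] = tc_push, [TC_ADD] = tc_add, [TC_HALT] = tc_halt,
};

/* Dispatch to the next opcode's handler as a tail call. */
#define TC_NEXT(vm, ip) MUSTTAIL return tc_table[*(ip)]((vm), (ip) + 1)

static long tc_push(struct tc_vm *vm, const unsigned char *ip)
{
    assert(vm->sp < 64);
    vm->stack[vm->sp++] = *ip++;  /* operand is the byte after the opcode */
    TC_NEXT(vm, ip);
}

static long tc_add(struct tc_vm *vm, const unsigned char *ip)
{
    assert(vm->sp >= 2);
    vm->sp--;
    vm->stack[vm->sp - 1] += vm->stack[vm->sp];
    TC_NEXT(vm, ip);
}

static long tc_halt(struct tc_vm *vm, const unsigned char *ip)
{
    (void)ip;
    assert(vm->sp >= 1);
    return vm->stack[vm->sp - 1];
}

/* Entry point: an ordinary call into the first handler. */
long tc_run(const unsigned char *code)
{
    struct tc_vm vm = { .sp = 0 };
    return tc_table[code[0]](&vm, code + 1);
}
```

Because handlers are ordinary functions, their addresses are visible from anywhere, so the label-export problem does not arise.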


JonChesterfield 3y ago

The goto labels can be exported much more directly using inline asm. Further, inline asm can now represent control flow, so you can define the labels in inline asm and put the computed jump at the end of an opcode. That's pretty robust to compiler transforms. Just looked up an interpreter in that style:

  #define LABEL_START(TAG) ns_##TAG : __asm__(".p2align 3\n.Lstart_" #TAG ":" :::)
  #define LABEL_END(TAG) __asm__(".Lend_" #TAG ":\n")
  #define PROLOGUE(TAG) LABEL_START(TAG); ip++
  #define EPILOGUE(TAG) __asm__ goto("\tjmpq *%0\n" "\t.Lend_" #TAG ":\n"::"r"((void*)decode(ip))::ALL_LABELS())

Followed by opcodes implemented in this fashion:

  {
    PROLOGUE(add);
    {
      apply_opcode_ADD(&s->data_stack);
    }
    EPILOGUE(add);
  }

Because the labels are defined in assembly, not in C, accessing them from outside the function is straightforward. I wrote a whole load of these at some point; there's probably a version of those macros somewhere that compiles to jumps through a C computed goto as well.
