
crawshaw 8y ago

The goroutine implementation scales, while other thread implementations (by default) do not. That's a semantic difference. A Go server can have millions of active goroutines with moderate resource use.

You can achieve the same on Linux or Solaris using kernel threads, but you have to work at it. With Go you don't have to work at it, and it works on macOS and Windows and a few other OSs too.

This is all comparisons between O(1) things, but the constant factor matters.


pcwalton 8y ago

> You can achieve the same on Linux or Solaris using kernel threads, but you have to work at it.

By setting the thread stack size to a reasonable value. That's it. And, in fact, on 64-bit you often don't even need to do that.

The difference you're describing is a difference in default thread stack sizes, which is hardly a paradigm shift. We're talking about one call to pthread_attr_setstacksize().

crawshaw (OP) 8y ago

It's not nearly as simple as you claim.

First: if you have an epoll loop it is also the cost of the thread context switch, which has definitely hurt us in RPC systems using kernel threads. By contrast the goroutine gets scheduled onto the kernel thread that answered the poll, saving the switch.

Second: as I alluded to earlier, Linux and Solaris can scale their kernel thread implementations, not all OSs can. My experiences with large numbers of threads on the BSDs and Windows (in years past admittedly) suggest other kernels don't have thread implementations designed to scale to such high numbers. Solving the problem in userspace means Go programs written in this style are portable across operating systems.

Third: you can only adjust stack sizes down if you know your program always keeps its stacks small. If you depend on libraries you don't own in C/C++, that's a difficult assumption. Go grows the stacks, so if you hit some corner case where a small number of goroutines need some significant amount of stack, your program uses more memory, but typically keeps working. No need for careful (manual!) stack accounting.

If all this were as easy as you say, we would still write nearly all our C/C++ servers using threads. We don't because it's not.

pcwalton 8y ago

> First: if you have an epoll loop it is also the cost of the thread context switch, which has definitely hurt us in RPC systems using kernel threads. By contrast the goroutine gets scheduled onto the kernel thread that answered the poll, saving the switch.

I'm not comparing M:N to a 1:1 system where all I/O is proxied out to another thread sitting in an epoll loop. I'm comparing M:N to 1:1 with blocking I/O. In this scenario, the kernel switches directly onto the appropriate thread.

> Second: as I alluded to earlier, Linux and Solaris can scale their kernel thread implementations, not all OSs can.

The vast majority of Go users are running Linux. And on Windows, UMS is 1:1 and is the preferred way to do high-performance servers; it avoids a lot of the problems that Go has (for instance, playing nicely with third-party code).

> Third: you can only adjust stack sizes down if you know your program always keeps its stacks small.

You could do 1:1 with stack growth just as Go does. As I've said before, small stacks are a property of the relocatable GC, not a property of the thread implementation.

> If all this were as easy as you say, we would still write nearly all our C/C++ servers using threads.

We don't write C/C++ servers using threads because (1) stackless use of epoll is faster than both 1:1 threading and M:N threading, as this project shows; (2) C/C++ can't do relocatable stacks, as the language is hostile to precise moving GC.
