
arghwhat 8y ago

This is a terrible microbenchmark.

First of all, you're only benchmarking the time it takes for fork(2) to return in the parent subshell, nothing else. The new processes haven't started running at this point, and certainly haven't exec'd (which tends to be why you're forking).

Second, you're not measuring the cost at all. The forked children will, at some point, start executing on other CPUs, which includes finishing configuration and running exec, which takes time. The cost is the total cycles it takes before the child is executing the intended code.
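To make the distinction concrete, here is a rough sketch that times the same batch of spawns twice: once up to the point where the parent has merely launched them, and once up to the point where every child has actually finished. It assumes GNU date (for the %N nanosecond field) and a shell that spawns with plain fork(2); the count n is arbitrary.

```shell
#!/bin/sh
# Sketch only. Assumes GNU date (%N = nanoseconds); n is an arbitrary count.
n=200

start=$(date +%s%N)
i=0
while [ "$i" -lt "$n" ]; do
  sleep 0 &            # '&' hands control back once fork(2) returns in the parent
  i=$((i + 1))
done
launched=$(date +%s%N)

wait                   # returns once every child has exec'd, run, and exited
finished=$(date +%s%N)

launch_us=$(( (launched - start) / 1000 ))
total_us=$(( (finished - start) / 1000 ))
echo "launch only: ${launch_us} us; launch + completion: ${total_us} us"
```

Both numbers are wall-clock time, so even the second one is not the full cost in cycles when the children run in parallel on other CPUs. The user and sys figures reported by time come closer to that.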

Fork is damn expensive, but whether it's too expensive depends on the use case and on the cost of adding hardware.

Fork time scales with the virtual memory of the forking process, and you're forking from a fresh subshell that hardly has anything allocated. It's even mentioned in the linked post that their issue stemmed from this (specifically fork lock contention spiking as fork time increased).
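A rough way to see the effect from a shell is to time the same spawn loop before and after making the shell itself hold more memory. This is only a sketch: it assumes GNU date (for %N), the 64 MiB ballast size is arbitrary and far smaller than a real application, and a shell that spawns via vfork or posix_spawn may show no difference at all.

```shell
#!/bin/sh
# Sketch only. Assumes GNU date (%N) and a shell that spawns with fork(2).
n=200                    # spawns per measurement (arbitrary)
ballast_bytes=67108864   # 64 MiB of ballast (arbitrary; raise it to amplify the effect)

# Run n spawn-and-wait cycles of `sleep 0`; store the average in avg_us.
measure_spawns() {
  start=$(date +%s%N)
  i=0
  while [ "$i" -lt "$n" ]; do
    sleep 0
    i=$((i + 1))
  done
  end=$(date +%s%N)
  avg_us=$(( (end - start) / n / 1000 ))
}

measure_spawns
small_us=$avg_us

# Inflate the shell: keep a large string in an unexported variable, so it
# occupies the shell's memory without being passed to the children's environment.
ballast=$(head -c "$ballast_bytes" /dev/zero | tr '\0' 'x')

measure_spawns
large_us=$avg_us

echo "small parent: ${small_us} us/spawn; inflated parent: ${large_us} us/spawn"
```

The averages are noisy wall-clock figures, so treat any gap as an indication rather than a measurement.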


paulddraper 8y ago

(1) The benchmark measured the point of discussion.

(2) Even without using asynchronicity (which Go is heralded for), processes take <2ms to start and stop. Not nothing, but certainly something you could do hundreds of times a second.

    $ time seq 1000 | while read; do sleep 0; done

    real        0m1.644s
    user        0m1.065s
    sys         0m0.672s
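For reference, the per-iteration arithmetic on those figures can be sketched as below. The variable names are made up for illustration, and the totals cover the whole pipeline (seq, read, and the loop itself), so the result is an upper bound on the cost of one sleep spawn.

```shell
#!/bin/sh
# Figures copied from the `time` output above, converted to milliseconds.
iterations=1000
real_ms=1644
user_ms=1065
sys_ms=672

cpu_ms=$((user_ms + sys_ms))                       # total CPU time, user + sys
wall_us_per_iter=$((real_ms * 1000 / iterations))  # wall-clock per iteration
cpu_us_per_iter=$((cpu_ms * 1000 / iterations))    # CPU time per iteration
echo "wall: ${wall_us_per_iter} us/iteration, CPU: ${cpu_us_per_iter} us/iteration"
```

That works out to about 1.64 ms of wall-clock time and about 1.74 ms of CPU time per iteration. CPU time exceeding wall-clock time is consistent with some of the work running on a second CPU.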

arghwhat (OP) 8y ago

1. No, the point was that fork(2)+exec(3)/spawning processes is an expensive way to run code, not how long it takes for the parent to be able to do something else.

2. Your new benchmark is better. However, it is still a useless microbenchmark, as it is an unrealistic best-case scenario. Your spawn of sleep is happening within a fresh subshell started by the pipe you made. fork(2) depends on things like the virtual memory size and open file descriptors of the parent process, and your subshell has basically nothing at all. A real application likely holds at least a few gigabytes of virtual memory, more likely tens of gigabytes (note that virtual memory isn't the same as resident memory), which will make fork(2) take much longer, with the time split between parent and child.

I suspect you might be confusing asynchronicity with concurrency or parallelism. Go is heralded for concurrency, sometimes in the form of parallelism, but not asynchronicity. Concurrency does not have any positive effect on execution time or cost. Parallelism can reduce execution time, but it does not decrease execution cost; it simply throws more hardware at the problem.

In fact, Go is a worse-than-average language to call fork(2) in, due to it running fork(2) under a global lock. This is mentioned in the linked article. The lock contention caused by fork(2) execution time as memory consumption increased was what made the process unresponsive.

However, as I also said, whether fork is too expensive depends on the use-case.

paulddraper 8y ago

> Each Gitaly server instance was fork/exec'ing Git processes about 20 times per second

> What's really wrong here is that they're apparently spawning processes like crazy.

Sounds like it depends on the use-case, rather than a blanket "two dozen processes per second is clearly absurd".
