undefined | Better HN

0 pointsls655363y ago0 comments

In general this makes sense, but I think you need to be careful in some cases where the lowest latency between two logical "cores" is likely to be between those which are SMT siblings on the same physical core (assuming you have an SMT-enabled system). These logical "cores" will be sharing much of the same physical core's resources (such as the low-latency L1/L2 and micro-op caches), so depending on the particular workload, pinning two threads to these two logical "cores" could very well result in worse performance overall.

0 comments

slabity3y ago

SMT is usually disabled in these situations to prevent it from being a concern.

nextaccountic3y ago

Doesn't this leave some performance on the table? Each core has more ports than a single thread could reasonably use, exactly because two threads can run on a single core

slabity3y ago

In terms of throughput, technically yes, you are leaving performance on the table. However, in HFT the throughput is greatly limited by IO anyways, so you don't get much benefit with it enabled.

What you want is to minimize latency, which means you don't want to be waiting for anything before you start processing whatever information you need. To do this, you need to ensure that the correct things are cached where they need to be, and SMT means that you have multiple threads fighting each other for that precious cache space.

In non-FPGA systems I've worked with, I've seen dozens of microseconds of latency added with SMT enabled vs disabled.

jnordwick3y ago

Maybe 10 years ago that as the common things, but there are so many exta resources (esp registers) that is is now giving up almost half the chip. If you can be cache friendly enough, the extra cycles will make up for it.

slabity3y ago

No, this is not true at all. "The extra cycles" is the exact thing you want to avoid in HFT. It doesn't matter how much throughput of processing you can put through a single core if you enable SMT, because somewhere in the path (either broker, exchange, or some switch in between) you will eventually be limited in throughput that it becomes irrelevant.

The only thing that matters at that point is latency, and unless you are cache-friendly enough to store your entire program in a single core's cache twice over, you would be better off disabling SMT altogether. And even if you were able to do that, it would not matter as a single thread would be done processing a message by the time the next one comes in. At least at the currently standard 10-25Gbps that the exchanges can handle.

In HFT, we're fine giving up half the registers in a core if it means we get an extra few microseconds of latency back.

bitcharmer3y ago

No one in hft space runs with smt enabled

j / k navigate · click thread line to collapse

0 comments

slabity3y ago

SMT is usually disabled in these situations to prevent it from being a concern.

nextaccountic3y ago

Doesn't this leave some performance on the table? Each core has more ports than a single thread could reasonably use, exactly because two threads can run on a single core

slabity3y ago

In terms of throughput, technically yes, you are leaving performance on the table. However, in HFT the throughput is greatly limited by IO anyways, so you don't get much benefit with it enabled.

In non-FPGA systems I've worked with, I've seen dozens of microseconds of latency added with SMT enabled vs disabled.

jnordwick3y ago

slabity3y ago

In HFT, we're fine giving up half the registers in a core if it means we get an extra few microseconds of latency back.

bitcharmer3y ago

No one in hft space runs with smt enabled

j / k navigate · click thread line to collapse