This suggests a long-term compromise solution where threads within a process can use hyperthreading to share a core, but threads in different processes can't. Given that hyperthreads share L1 cache, this might also be better for performance.
Intuitively that may sound logical, but in practice it's often not the case. For many workloads, putting two threads of the same program on a core ends up being worse than co-locating them with threads from different programs. The reason is that two threads of the same program will often execute similar instruction streams and therefore contend for the same execution resources; a really good example is when both are using vector instructions, since the vector execution units are shared between the two hyperthreads.
SMT/hyperthreading is complicated. If you have a workload dominated by non-local DRAM fetches, it's a huge win because when the CPU pipeline is stalled on one thread it can still issue instructions from the other.
If you have a workload dominated by L1 cache bandwidth, the opposite is true because the threads compete for the same resource.
On balance, on typical workloads, it's a win. But there are real-world problems for which turning it off is a legitimate performance choice.
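That kind of A/B comparison is straightforward to set up on Linux, where sysfs exposes which logical CPUs are SMT siblings. A minimal sketch (Linux-only paths; the workload itself is left to you) that discovers sibling pairs so you can pin a program onto one physical core's pair versus two separate cores:

```python
import os

def sibling_pairs():
    """Map out SMT sibling groups via Linux sysfs (empty set on non-Linux)."""
    if not hasattr(os, "sched_getaffinity"):
        return set()  # no affinity API / sysfs topology on this platform
    pairs = set()
    base = "/sys/devices/system/cpu"
    for cpu in sorted(os.sched_getaffinity(0)):
        path = f"{base}/cpu{cpu}/topology/thread_siblings_list"
        try:
            with open(path) as f:
                text = f.read().strip()
        except OSError:
            continue
        # Kernel formats this as e.g. "0,4" or "0-1" depending on topology
        if "," in text:
            sibs = tuple(sorted(int(x) for x in text.split(",")))
        elif "-" in text:
            lo, hi = text.split("-")
            sibs = tuple(range(int(lo), int(hi) + 1))
        else:
            sibs = (int(text),)
        pairs.add(sibs)
    return pairs

if __name__ == "__main__":
    print(sibling_pairs())
    # To run a workload on one physical core's hyperthread pair:
    # os.sched_setaffinity(0, set(next(iter(sibling_pairs()))))
```

Pinning the same two-thread workload once onto a sibling pair and once onto CPUs from different pairs, then comparing throughput, makes the "same core vs. separate cores" tradeoff measurable for your particular program.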
Why is that bad?
Browsers are particularly problematic, and it would be nice to alert the scheduler that a particular process is untrusted and extra care should be taken to sanitize caches before and after its time slice.
If a userspace thread writes something into a buffer and then makes a syscall that initiates asynchronous work in the kernel, wouldn't it be better for the kernel thread to be located on the same core, instead of shuffling the data into another cache?
I'm not an OpenBSD user (and glad for it, if this is anything to go by), but I'm curious - is this really how they operate, or does this decision stand out?
I'm not an OpenBSD user either; I use FreeBSD whenever possible. However, from listening to OpenBSD devs via blogs, conferences, HN, etc., it seems that OpenBSD is an operating system built mainly for OpenBSD developers; their stated goals support this[1]. OpenBSD being useful for non-OpenBSD developers is more of a secondary goal, compared to how FreeBSD or Linux or any other OS handles it. OpenBSD is also much more of a research operating system than other large, successful OSes (Linux, Windows, macOS, FreeBSD, etc.), meaning it cares far more about developing features and novel security mitigations than about maintaining backwards compatibility the way other operating systems do.
> So... they "strongly suspect" (but don't know and haven't shown) there may be a Spectre-class bug enabled by current HT implementations and improving their scheduler is hard, so they'll pre-emptively disable HT outright on Intel CPUs now and others in the near future?
The OpenBSD devs strongly suspected another Intel hardware bug a week or two ago, implemented a mitigation and deployed it. Turns out they were right[2].
[1]: https://www.openbsd.org/goals.html
[2]: https://www.bleepingcomputer.com/news/security/new-lazy-fp-s...
This is not the feeling I get from OpenBSD at all. They don't act like a research project. They aren't keen on implementing new features just for the sake of it, or just to try them out. A better description would be that they put correctness, security and maintainability first, and simplicity often comes as a nice side effect. Deprecating old, unused features is just a consequence of striving to decrease complexity by trimming the code base. OpenBSD is one of the few OSes where the number of lines of code is not skyrocketing to unmanageable levels.
Honestly, I would say that this is true of many open source projects. It's one of the reasons that open source development tools are so good on Linux, but end user applications fall so far behind. It's also why documentation and usability tend to be much worse. When your system is based on volunteering, the work that gets done tends to be the stuff that interests the workers.
I don't see it that way at all. Whenever I have to work on a project where security is a top concern, I always look at OpenBSD as an option. In the Linux world, the equivalent would be the Openwall GNU/*/Linux project. Not something for an average user, but to say it's used mainly by its devs is off by an order of magnitude.
In fairness, my impression from the video of Theo's presentation was that they were tipped off by someone under embargo.
OpenBSD is a research operating system, and security is a core component of their research. Proactively mitigating security risks before exploits appear is one way to improve security that has worked for them in the past: vulnerabilities have been fixed before exploits appeared.
Because they give what they consider reasonable deadlines for companies to fix security bugs (~90 days), they are kept out of the loop by hardware vendors like Intel, who requested a year to fix Meltdown.
Being in the dark, if you see some suspicious behavior, either you protect yourself from it, or you might wake up the next day to a new "We are sorry" post from Intel, and your users would be screwed.
So this is pretty much how they operate, and if an OS is as security conscious as OpenBSD, there isn't really a different way to operate.
Note that disabling hyperthreading to mitigate CPU flaws isn't anything new either: this had to be done for AMD's Ryzen because of hardware bugs last year anyway - https://www.extremetech.com/computing/254750-amd-replaces-ry...
That's really what I was getting at with my question. There's no such thing as absolute security, it is a set of tradeoffs between usability, performance and specific security guarantees. Is there a point where the OpenBSD developers would say "okay, this is a (potential or confirmed) security bug, but the mitigation is just too costly in this case"?
In the post-Spectre world, it's not inconceivable that, in order to retain the security guarantees most people thought they had, one might have to give up a substantial subset of the benefits of speculative execution in out-of-order processors. For some workloads that might mean up to two orders of magnitude in performance. I know roughly where the common operating systems would draw the line, and I certainly know where I would for my own use cases. I'm just curious how OpenBSD works in this regard.
Is 90 days really a reasonable timeframe to fix something like Meltdown? I agree with your comment in general, but hardware/microcode issues at Intel's scale are a different beast from some buffer overflow.
...yet anyone interacting with Unix systems most likely relies on OpenSSH.
Why would relying on a feature from a vendor with known processor security issues (including undisclosed hidden application processors), when the feature offers only marginal performance improvement and in some cases degradation, be the preferable stance?
At best, ambivalence toward this decision would be the position to take, especially given the very recent "oh hey, FPU registers are also a problem" "discovery", which they were entirely correct about.
Some security professionals seem to insist on having a proven exploit before they act. Doesn't that seem like poor decision-making? Their job is to provide security, not merely to patch proven exploits - the latter is a means to an end. If there are threats from unknown exploits, and there certainly are, then they need techniques to address unknown exploits. One of those techniques is expert analysis of potential threats.
CPU bugs seem to be a rich vein to mine at the moment.
Spectre is about a) leaving side-effects of misspeculation in shared resources, and b) bandwidth contention (between a misspeculated instruction stream and an attacker) to shared resources.
It is trivially obvious that HT exacerbates Spectre-class bugs, as the entire raison d'être of HT is to share pipeline resources. How quickly information can be leaked is up for debate, but it's definitely non-zero.
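The only primitive an attacker needs to observe that sharing is a sufficiently precise timer. As a toy illustration of the measurement (not an exploit, and the function name here is my own invention), Spectre-class probes boil down to timing how long a given memory access pattern takes:

```python
import time

# Toy illustration: the building block of Spectre-class leaks is simply
# "time a memory access". If a resource is shared between two hyperthreads,
# one thread's activity shows up in the other's access timings.

def timed_sum(data, indices):
    """Return (elapsed_ns, checksum) for touching data at the given indices."""
    t0 = time.perf_counter_ns()
    s = 0
    for i in indices:
        s += data[i]
    t1 = time.perf_counter_ns()
    return t1 - t0, s
```

A real attack would evict or flush cache lines between measurements and run the probe on the sibling hyperthread; in Python the interpreter overhead swamps actual cache effects, so this only shows the shape of the measurement, not a working channel.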
That would describe it if they...disabled it outright.
But they made HT user-configurable (the hw.smt sysctl), just like any other performance tuning knob.
I've also seen Erlang workloads where you could get a bit of a throughput increase by having the VM scheduler run more threads than your physical cores (so starting to use HT), but latency would spike and become very unpredictable, which was a bad tradeoff for that use case.
When you have four, six, eight or more cores, there's less value in doubling that number; the gain is lower.
I don't have hard numbers to back this up, it's purely my personal experience/recollection. On my 2 socket P4 Xeon box, I disabled HT. On my current I7 6-core box, I have HT on.
"We really should not run different security domains on different processor threads of the same core. Unfortunately changing our scheduler to take this into account is far from trivial."
https://cvsweb.openbsd.org/cgi-bin/cvsweb/
https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/arch/amd64...
I've encountered some cases where SMT made performance worse, such as with very optimized HPC libraries, but in general SMT can really help. Compiling projects got a nice boost when enabling HT on Intel's recent architectures, for example. (All of this was on Linux, though; last time I checked OpenBSD, its SMP performance was abysmal.)
Many OpenBSD devs are security researchers in academia. If they hear whispers over beers that new Spectre attacks are coming that exploit this or that, they might not be able to reproduce the exploit without putting a lot of work in (it's research, after all), but they might be able to prevent it with a simple change, like disabling hyperthreading.
OpenBSD cares more about security than basically any other trade-off in OS design (performance, usability, ...), so it makes sense to me that they went this way. If you want a balance of security and performance, OpenBSD is not for you anyway.
For a system aiming at security, it's a completely valid choice to disable things that start to look questionable, even if it's not conclusively proven yet. Just like potential software vulnerabilities are patched even if nobody has demonstrated that they actually are exploitable yet.
OP (and environs) has names on it that I have seen before and respect as knowing what the hell they are on about.