I get that there are some use cases where performance really matters to the point where kernel network stack and drivers make a difference (high-throughput and/or low-latency services running on servers, high-performance routers...), but that should not be the default for everyone.
For over a decade our computers have gotten only marginally faster, but our software has gotten slower at a greater rate.
You can barely navigate the web now with a new low-end computer (that isn't a Chromebook). Most on this site won't care, though, because our machines cost $2,000+ and the web is Fine(tm); many folks aren't buying anything over $300.
2) These are memory bugs, so the introduction of Rust into the kernel could help us here potentially, no need for an architectural revolution.
We're talking about network stacks and network drivers, not web browsers. Migrating the network stack from the kernel to a user-land process is not going to measurably slow down web browsers, especially on modern systems with gigabytes of RAM, multiple cores, IOMMUs and whatnot.
> These are memory bugs, so the introduction of Rust into the kernel could help us here potentially, no need for an architectural revolution.
That would require rewriting the network stack and network drivers in Rust (driver code is much more likely to have bugs than the rest of the kernel) to be effective; otherwise you'll still have a lot of C code in the network path. I'd argue that this would be a bigger architectural revolution than porting the existing code and running it in user-land. MINIX3 went through such a change when drivers were removed from the kernel (can't find the publication about it right now), and they only required reasonably small changes when porting these to user-land; they were not rewritten from scratch.
But this is not just about memory safety: Rust code can still be vulnerable in many other ways (memory leaks, unsafe blocks, wrong assumptions, incorrect algorithm implementations, buggy/compromised toolchains...). Code running inside the trusted computing base of a system is a liability; enforcing privilege separation and the principle of least authority reduces it.
The operators of websites that derive most of their revenue from advertising are going to run their sites at whatever level users will tolerate. Where network drivers are faster they'll either cram in more ad tracking, or won't bother optimizing their existing trackers. If users stop accessing sites because they're too slow to load, operators will either cut down on ad tracking, or more likely, put some effort into optimizing the performance of their ad trackers.
Rust is itself something of an architectural revolution. I believe network drivers in userspace is already a thing, and eBPF may also have a role here. All of this is worth exploring. This is what progress in Linux looks like.
You can barely navigate the web now with almost any computer. I have a high-end laptop and opening a few tabs from various sites on the Internet will cause the CPU usage and fan speed to spike. Just for a few tabs! Obviously not all sites do this, but the web has become a framework of advertising monstrosity and I can barely navigate and consume much of today's web content without enabling Reader Mode.
The short-term effect may be a slight improvement, but it's a treadmill and the next wave of web crapware will more than nullify it.
If a microkernel architecture with worse performance got established on all mainstream devices, the experience would be worse in the short term but in the medium term, the crapware would have to adapt so that it again becomes just barely usable as it is today.
The problem is that the short-term gains create an incentive for users to buy/use the marginally faster hardware or kernel, which then forces everyone else to follow suit.
Ok, we could make it optional, i.e. additional security for those who wish it (on top of other things we could do, like the things you mentioned, which aren't a panacea either).
See e.g. https://talawah.io/blog/linux-kernel-vs-dpdk-http-performanc...
No. I want the kernel to have as much functionality as possible. I have some zero dependency freestanding software that I boot Linux directly into. I really don't want to have to maintain additional user space in the form of C libraries. If I need to manage wifi connections, I should be able to make some system calls and be done with it without having to link to anything else.
Anything related to hardware belongs in the kernel so that all software can access it via Linux's amazing language agnostic system interface. If there are security problems, then that process should be improved without screwing up the interface by replacing it with user space C libraries. We have enough of that in the graphics stack.
And even worse: because the kernel still has to distribute data to other userland programs, you actually need another round trip, so the impact needs to be multiplied by two.
Do you have benchmarks that show the impact of switching to userspace on a typical, loaded desktop system with all kinds of workloads? Or are you just guessing?
This publication (http://www.minix3.org/docs/jorrit-herder/asci06.pdf) claims that MINIX3 could saturate a 1 Gb/s Ethernet link with a user-space network stack, with separate processes for the stack and the driver, on a rusty 32-bit micro-kernel that can't do SMP. In 2006.
Oh hell no! User-land drivers can barely handle WiFi 4 speeds (72 Mbps), with terrible CPU performance (thousands of context switches and interrupts per second).
For WiFi 5 (ac) and WiFi 6 (ax), you need heavy WiFi firmware involvement, multiple DMA queues and a kernel driver, and even with all that it requires special care to reach target performance. There is no chance in hell of reaching that kind of performance in user-land.
1. The year on the calendar has nothing to do with doing things correctly, ever.
2. Ironically, pulling it out of the kernel and running it in user-land will probably bring about more bugs and issues. I would much rather we just fix the problem where it is and leave it at that, instead of potentially introducing new problems, like backdoors and exploits in the software provided. Not saying it WOULD happen, but the potential for it alone is just not worth the risk in my honest opinion. Let's just fix it the right way, and be done with it.
I’m on board so long as there is a choice. Routers with crappy hardware need as much help as possible. Also, tangentially, this is why the current darling of VPN tech, WireGuard, is implemented in the kernel, not userspace.
"WEINBERG'S SECOND LAW: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization."
So someone else should have done it for you by now?
Be the change you want to see in the world.
I'm sure you have an excuse for not doing it personally. Just as I'm sure the person who you've mentally assigned responsibility has at least as good of an excuse too.
I have made dozens of commits to MINIX3, including a brand-new ISO 9660 file system implementation (https://github.com/Stichting-MINIX-Research-Foundation/minix...).
I have made more than a hundred commits to SerenityOS (https://github.com/SerenityOS/serenity/commits?author=boricj).
Just because I deplore the general state of security in mainstream operating systems doesn't mean that I demand that someone else does something about it for free.
I'm not paid to fix security bugs in the Linux kernel, so do you expect me to fix these myself for free just because you want me to? No one is entitled to my own free time spent hacking on random stuff.
[1] https://blog.cloudflare.com/how-to-achieve-low-latency/ - 30-45 microseconds on plain 10G ethernet - faster ethernet probably wouldn't improve much on this
[2] https://eli.thegreenplace.net/2018/measuring-context-switchi...
The problem is that drivers are exploitable in the first place, so the solution is that we should make them not exploitable (using Rust, or a better language than Rust that fixes some of its problems) and try to preserve our performance, which is rapidly being stolen away by bloated userspace software, rather than just shrug our shoulders and say "oh well, I guess that drivers are just intrinsically insecure".
As I understood the issues, this will probably be lots of "fun". You can broadcast the pcap files with any monitor-mode-capable WiFi router. Luckily the bugs were introduced in 5.1+, so most devices run very old vendor-patched kernels and are probably not affected, but at least for causing havoc this is really bad. Since one of the issues involves beacon frames, just a scan for networks should be enough to trigger a crash. So you can at least crash, and maybe exploit, any device running a recent Linux kernel that scans for WiFi networks.
I'm not sure how it's possible to do over the air remote code execution but I guess people are working on this.
DoSing is now "easy", as you say: just send those frames, and a Linux computer that is currently listening to the network (e.g. scanning for networks) and thus processing the beacon frames will at least crash. It might be the case that some WiFi chips will filter those invalid frames, or crash themselves; that depends on the actual hardware/firmware.
The victim does not even need to be connected to a malicious AP or similar, so there is no requirement to trick the user into anything.
RCE is not trivial at all, but due to the nature of the different faults, it might be possible. For reference, see e.g. Mathy Vanhoef, who discovered several impressive WiFi vulnerabilities in the past:
Beware that your WiFi password will be forgotten every VM reboot (but there is a workaround on the forums).
But it sure looks like it was a wise idea to spend the resources on isolating network hardware!
Seems like a pretty major vulnerability that affects tons of devices.
And some are obviously correct... But others would require a lot more understanding of the code to be sure they're correct.
Someone should go through this with a keen eye to check the fixes are actually correct, and aren't just making the fuzzer stop alerting while leaving a more subtle vulnerability open.
Edit: Looks like GCC removed the warning because it was unreliable. Clang and MSVC seem to be in better shape. https://gcc.gnu.org/legacy-ml/gcc-help/2011-05/msg00360.html
That is an odd position when -Wstringop-overflow also depends heavily on the optimizer (and will frequently generate false positives!) but not only remains in GCC but is enabled by default (even without any -Wall/-Wextra).
Things like this are why it pays to compile your project with as many compilers as possible (as well as static analysis tools).
ArchLinux seems to also have it patched in 6.0.1-arch2-1 though[0].
Updating my own comment (too late to edit), that release is now out with the fixes. The full set of stable releases with these fixes is: 6.0.2, 5.19.16, 5.15.74, 5.10.148, and 5.4.218 (source: https://lwn.net/Articles/911272/).
So is this for public/open Wifi networks only? Or is it for any wireless network where you do not control the gateway?
https://lwn.net/Articles/911071/
>> anybody who uses WiFi on untrusted networks
> It's actually worse than that - you just have to be scanning (though one of the issues requires P2P functionality to be enabled).
> So basically it's just
>> anybody who uses WiFi
> unfortunately.
And:
> Sorry, it took me longer than expected but I just posted PoCs + logs here: https://www.openwall.com/lists/oss-security/2022/10/13/5
> Most of the vulnerabilities were introduced in 5.1/5.2.
It's worse than that - Android kernels process beacon frames even if WiFi is disabled.
So you should be worried about this if you have an Android 11/12 phone, even if you don't use WiFi.
Linux desktop/laptop users should be worried if they have wifi enabled, even if not connected to a network.
Though that may not be a generally used name as yet.
Keep using unsafe langs.
What will it be next week? A CVE in Chromium?
At this point betting sites should add a category for this kind of game.
I do wonder what people of the future will think about this:
"So they had research indicating that a lot of issues were related to memory, had technology which significantly reduces this issue, but they still kept making a mess for years?"
https://msrc-blog.microsoft.com/2019/07/22/why-rust-for-safe...
https://microsoftedge.github.io/edgevr/posts/Super-Duper-Sec...
https://www.chromium.org/Home/chromium-security/memory-safet...
Memory issues and JIT (browsers) are two things that are responsible for a disgusting amount of security issues.
You cannot rewrite the entirety of the Linux kernel in another language overnight. You'd have years at least until it becomes production-ready. Not to mention the performance and memory use will be worse.
Certainly the situation can and should be better, but adopting this "it's so easy, how does nobody see it?" attitude helps no one.
The Linux kernel in particular is perhaps the single most important piece of software on the planet. And we see vulns like this all the time. Hundreds per year. And there are billions more lines of C and C++ out there handling all sorts of untrusted input.
The path off C and C++ is complicated as shit. Interop with Rust is messy and there aren't effective tools for automatic translation. Carbon is barely a language at this point (they don't even have a compiler) and doesn't yet provide safety. The story for the other alternative languages isn't any better. But I really wish the industry was throwing billions at this across dozens of major companies and open source organizations.
How much?
(If you don't have a subscription, the article will become freely available to everyone on Oct 27th.)
tl;dr, it doesn't do anything interesting yet, but the infrastructure is getting there, and it starts the process of evolving the kernel toward using a safe language.