Swift has (had?) the same issue and I had to write a program to illustrate that Swift is (was?) perfectly happy to segfault under shared access to data structures.
Go has never been memory-safe (in the Rust and Java sense) and it's wild to me that it got branded as such.
This is just two groups of people talking past each other.
It's not as if Go programmers are unaware of the distinction you're talking about. It's literally the premise of the language; it's the basis for "share by communicating, don't communicate by sharing". Obviously, that didn't work out, and modern Go does a lot of sharing and needs a lot of synchronization. But: everybody understands that.
> the issue here is that the "Rust and Java sense" of memory safety is not the actual meaning of the term
So what is the actual meaning? Is it simply "there are no cases of actual exploited bugs in the wild"?
Because in another comment you wrote:
> a term of art was created to describe something complicated; in this case, "memory safety", to describe the property of programming languages that don't admit to memory corruption vulnerabilities, such as stack and heap overflows, use-after-frees, and type confusions. Later, people uninvolved with the popularization of the term took the term and tried to define it from first principles, arriving at a place different than the term of art.
But type confusion is exactly what has been demonstrated in the post's example. So what kind of memory safety does Go actually provide, in the term of art sense?
It's been in usage for PLT for at least twenty years[1]. You are at least two decades late to the party.
Software is memory-safe if (a) it never references a memory location outside the address space allocated by or for that entity, and (b) it never executes instructions outside the code area created by the compiler and linker within that address space.
[1] https://llvm.org/pubs/2003-05-05-LCTES03-CodeSafety.pdf
Everybody does not understand that; otherwise there would be zero of these issues in shipping code.
This is the problem with the C++ crowd hoping to save their language. Maybe they'll finally figure out some --disallow-all-ub-and-be-memory-safe-and-thread-safe flag but at the moment it's still insanely trivial to make a mistake and return a reference to some value on the stack or any number of other issues.
The answer cannot be "just write flawless code and you'll never have these issues", but at the moment that's all that C++ (and, going by this article, Go) has.
I had to convince Go people that you can segfault with Go. Or do you mean the language designers when you say "everybody"?
Hence the focus on fearless concurrency or other small-scale idioms like match in an attempt to present Rust as an overall better language compared to other safe languages like Go, which is proving to be a solid competitor and is much easier to learn and understand.
~130k LoC Swift app was converted from 5 -> 6 for us in about 3 days.
Swift is starting to look more like old Java Beans (if you are old enough to remember them; most Swift developers are too young). It is making some of the same mistakes.
Anyways: https://forums.swift.org/t/has-swifts-concurrency-model-gone... Common problems all devs face: https://www.massicotte.org/problematic-patterns
Anyways, they are trying to reinvent 'safe concurrency' while almost throwing the baby out with the bathwater, and making Swift even more complex and harder to get into.
There is still a way to go. For simple apps, the new concurrency is easy to adopt. But for anything that is less than trivial, it becomes a lot of work, to the point that it might not be worth it.
Rust avoids all this entirely, by using its type system.
Rust on the other hand solves that. There is code you can't write easily in Rust, but just yesterday I took a rust iteration, changed 'iter()' to 'par_iter()', and given it compiled I had high confidence it was going to work (which it did).
Not synchronizing writes on most data structures does not create a segfault; you have to be in a very specific condition to create one, and those conditions are extremely rare and unusual (from the programmer's perspective).
In the OP's blog post, to trigger one he's hitting one of those conditions in an infinite loop.
Or put another way what is the likelihood that a go program is memory unsafe?
Not going down the same road is the only reason it didn't end up on the pile of obscure languages nobody uses.
That said in many years of using Go in production I don't think I've ever come across a situation where the exact requirements to cause this bug have occurred.
Uber has talked a lot about bugs in Go code. This article is useful for understanding what the practical problems facing Go developers actually wind up being, particularly the table at the bottom summarizing how common each issue is.
https://www.uber.com/en-US/blog/data-race-patterns-in-go/
They don't have a specific category that would cover this issue, because most of the time concurrent map or slice accesses are on the same slice and this needs you to exhibit a torn read.
So why doesn't it come up more in practice? I dunno. Honestly beats me. I guess people are paranoid enough to avoid this particular pitfall most of the time, kind of like the Technology Connections theory on Americans and extension cords/powerstrips[1]. Re-assigning variables that are known to be used concurrently is obvious enough to be a problem and the language has atomics, channels, mutex locks so I think most people just don't wind up doing that in a concurrent context (or at least certainly not on purpose.) The race detector will definitely find it.
For some performance hit, though, the torn reads problem could just be fixed. I think they should probably do it, but I'm not losing sweat over all of the Go code in production. It hasn't really been a big issue.
It ultimately resulted in a loop counter overflowing, which recomputed the same thing a billion times (but always the same!). So the visible effect was a request would randomly take 3 min instead of 100ms.
I ended up using perf in production, which indirectly led me to understand the data race.
I was called in to help the team because of my experience debugging the weirdest things as a platform dev.
Because of this I was exposed to so many races in Go, from my biased point of view, I want Rust everywhere instead.
But I guess I am putting myself out of a job? ;)
People talk a lot about the productivity gains of AI, but fixing problems like this at the language level could have an even bigger impact on productivity, even though it is far less sensational. Think about how much productivity is lost due to obscure but detectable bugs like this one. I don't think Rust is a good answer (it doesn't check integer overflow by default in release builds), but at least it points a little bit in the vaguely correct direction.
Go is really good at easy concurrency tasks, like things that have almost no shared memory at all, "shared-nothing" architectures, like a typical web server. Share some resources like database handles with a sync.Pool and call it a day. Go lets you write "async" code as if it were sync with no function coloring, making it decidedly nicer than basically anything in its performance class for this use case.
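For what it's worth, here is a rough sketch of the sync.Pool pattern. It pools `bytes.Buffer` values rather than database handles, because sync.Pool is documented as a cache of temporary objects that may be dropped at any time (database/sql already keeps its own connection pool behind `*sql.DB`). The names `bufPool` and `render` are made up for the illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers; New runs only when the pool is empty.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// render builds a greeting using a pooled buffer (hypothetical helper).
func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold old contents
	defer bufPool.Put(buf)

	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String() // String copies, so the result outlives the buffer
}

func main() {
	fmt.Println(render("gopher")) // prints "hello, gopher"
}
```

Each request gets its own buffer for the duration of the call, so nothing here is shared between goroutines while it is in use.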
Rust, on the other hand, has to contend with function coloring and a myriad of seriously hard engineering tasks to deal with async issues. Async Rust gets better every year, but personally I still (as of last month at least) think it's quite a mess. Rust is absolutely excellent for traditional concurrency, though. Anything where you would've used a mutex lock, Rust is just way better than everything else. It's beautiful.
But I struggle to be as productive in Rust as I am in Go, because Rust, the standard library, and its ecosystem give the programmer so much to worry about. It sometimes reminds me of C++ in that regard, though it's nowhere near as extremely bad (because at least there's a coherent build system and package manager.) And frankly, a lot of software I write is just boring, and Go does fine for a lot of that. I try Rust periodically for things, and romantically it feels like it's the closest language to "the future", but I think the future might still have a place for languages like Go.
This means that multiple goroutines were writing to the same local variable. I've never worked on a Go team where code that is structured in such a way would be considered normal or pass code review without good justification.
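For contrast, a minimal sketch of the structure that usually does pass review: each goroutine writes only to its own slot, and the caller reads the results only after `Wait`. The function name `squares` is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// squares computes i*i for i in [0, n) with one goroutine per element.
// Each goroutine writes only to its own slot, so no two goroutines touch
// the same memory location.
func squares(n int) []int {
	out := make([]int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) { // pass i explicitly so each goroutine has its own copy
			defer wg.Done()
			out[i] = i * i
		}(i)
	}
	wg.Wait() // all writes are ordered before the read below
	return out
}

func main() {
	fmt.Println(squares(5)) // prints [0 1 4 9 16]
}
```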
But I think terms like "memory safety" should have a reasonably strict meaning, and languages that go the extra mile of actually preventing memory corruption even in concurrent programs (which is basically everything typically considered "memory safe" except Go) should not be put into the same bucket as languages that decide not to go through this hassle.
We had a rule at my last gig: avoid anonymous functions and always recover from them.
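A sketch of what "always recover" could look like in practice, assuming the rule was about goroutine bodies. `safeGo` is a hypothetical helper, not something from the standard library. Note that recover only catches panics; it does not make a data race any less of a race, and runtime fatal errors such as "concurrent map writes" are not recoverable.

```go
package main

import "fmt"

// safeGo runs fn in a new goroutine and reports a panic as an error
// instead of letting it take down the whole process.
func safeGo(fn func()) <-chan error {
	done := make(chan error, 1) // buffered so the goroutine never blocks
	go func() {
		defer func() {
			if r := recover(); r != nil {
				done <- fmt.Errorf("recovered: %v", r)
				return
			}
			done <- nil
		}()
		fn()
	}()
	return done
}

func main() {
	err := <-safeGo(func() { panic("boom") })
	fmt.Println(err) // prints "recovered: boom"
}
```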
What's happening here, as happens so often in other situations, is that a term of art was created to describe something complicated; in this case, "memory safety", to describe the property of programming languages that don't admit to memory corruption vulnerabilities, such as stack and heap overflows, use-after-frees, and type confusions. Later, people uninvolved with the popularization of the term took the term and tried to define it from first principles, arriving at a place different than the term of art. We saw the same thing happen with "zero trust networking".
The fact is that Go doesn't admit memory corruption vulnerabilities, and the way you know that is the fact that there are practically zero exploits for memory corruption vulnerabilities targeting pure Go programs, despite the popularity of the language.
Another way to reach the same conclusion is to note that this post's argument proves far too much; by the definition used by this author, most other higher-level languages (the author exempts Java, but really only Java) also fail to be memory safe.
Is Rust "safer" in some senses than Go? Almost certainly. Pure functional languages are safer still. "Safety" as a general concept in programming languages is a spectrum. But "memory safety" isn't; it's a threshold test. If you want to claim that a language is memory-unsafe, POC || GTFO.
> The fact is that Go doesn't admit memory corruption vulnerabilities
Except it does. This is exactly the example in the article. Type confusion causes it to treat an integer as a pointer and dereference it. This then trivially can result in memory corruption depending on the value of the integer. In the example the value "42" is used so that it crashes with a nice segfault thanks to lower-page guarding, but that's just for ease of demonstration. There's nothing magical about the choice of 42 - it could just as easily have been any number in the valid address space.
And data races allow all of that. There cannot be memory-safe languages supporting multi-threading that admit data races that lead to UB. If Go does admit data races it is not memory-safe. If a program can end up in a state that the language specification does not recognize (such as termination by SIGSEGV), it’s not memory safe. This is the only reasonable definition of memory safety.
There's a POC right in the post, demonstrating type confusion due to a torn read of a fat pointer. I think it could have just as easily been an out-of-bounds write via a torn read of a slice. I don't see how you can seriously call this memory safe, even by a conservative definition.
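To make the "torn read of a fat pointer" concrete from the other direction, here is a sketch of one way to publish an interface value so that readers can never see half of an update. This is not the article's code: the `Shape`, `Square` and `Rect` types and the `publish`/`read`/`stress` helpers are invented for the illustration. The idea is to swap a single pointer with `atomic.Pointer` instead of overwriting the two-word interface value in place.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Shape is a hypothetical interface. An interface value is two words
// (type word + data word), which is what an unsynchronized write can tear.
type Shape interface{ Area() int }

type Square struct{ side int }
type Rect struct{ w, h int }

func (s Square) Area() int { return s.side * s.side }
func (r Rect) Area() int   { return r.w * r.h }

// current points at an interface value that is never modified after it is
// published. Readers see either the old pointer or the new one.
var current atomic.Pointer[Shape]

func publish(s Shape) { current.Store(&s) }

// read must only be called after at least one publish.
func read() int { return (*current.Load()).Area() }

// stress flips the published value in one goroutine while the caller reads.
// It returns false if a reader ever observes an impossible area.
func stress(rounds int) bool {
	publish(Square{side: 2}) // area 4
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 0; i < rounds; i++ {
			if i%2 == 0 {
				publish(Rect{w: 2, h: 3}) // area 6
			} else {
				publish(Square{side: 2})
			}
		}
	}()
	ok := true
	for i := 0; i < rounds; i++ {
		if a := read(); a != 4 && a != 6 {
			ok = false
		}
	}
	wg.Wait()
	return ok
}

func main() {
	fmt.Println(stress(100000)) // prints true
}
```

The cost is an extra allocation per update, which is the kind of trade-off the thread is arguing about.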
Did you mean POC against a real program? Is that your bar?
This is wrong.
I explicitly exempt Java, OCaml, C#, JavaScript, and WebAssembly. And I implicitly exempt everyone else when I say that Go is the only language I know of that has this problem.
(I won't reply to the rest since we're already discussing that at https://news.ycombinator.com/item?id=44678566 )
Happens all the time in math and physics but having centuries of experience with this issue we usually just slap the name of a person on the name of the concept. That is why we have Gaussian Curvature and Riemann Integrals. Maybe we should speak of Jung Memory Safety too.
Thinking about it, the opposite also happens. In the early 19th century "group" had a specific meaning, today it has a much broader meaning with the original meaning preserved under the term "Galois Group".
Or even simpler: For the longest time seconds were defined as fraction of a day and varied in length. Now we have a precise and constant definition and still call them seconds and not ISO seconds.
Yes I mean that was the whole reason they invented rust. If there were a bunch of performant memory safe languages already they wouldn't have needed to.
Haskell in general is much safer than Rust thanks to its more robust type system (which also forms the basis of its metaprogramming facilities), monads being much louder than unsafe blocks, etc. But data races and deadlocks are one of the few things Rust has over it. There are some pure functional languages that are dependently typed like Idris, and thus far safer than Rust, but they're in the minority and I've yet to find anybody using them industrially. Also Fortnite's Verse thing? I don't know how pure that language is though.
Rust absolutely does make it easier to write high-performance threaded code correctly, though. If your system depends on high amounts of concurrent mutation, Rust definitely makes it easier to write correct code.
On the other hand, a system like STM in Haskell can make it easier to write complex concurrency logic correctly in Haskell than Rust, but it can have very bad performance overhead and needs to be treated with extreme suspicion in performance-sensitive code. It's a huge win for simple expression of complex concurrency, but you have to pay for it somewhere. It can be used in ways where that overhead is acceptable, but you absolutely need to be suspicious in a way that's never a concern in Rust.
Another way to word it: If "Go is memory unsafe" is such a revelation after it's been around for 13 years, it's more likely that such a statement is somehow wrong than that nobody's picked up on such a supposedly impactful safety issue in all this time.
As such, the burden of proof that addresses why nobody's run into any serious safety issues in the last 13 years is on the OP. It's not enough to show some theoretical program that exhibits the issue; clearly that is not enough to cause real problems.
I really don't understand why people get so obsessed with their tools that it turns into a political battleground. It's a means to an end. Not the end itself.
This doesn’t prove a negative, but is probably a good hint that this risk is not something worth prioritizing for Go applications from a security point of view.
Compare this with C/C++ where 60-75% of real world vulnerabilities are memory safety vulnerabilities. Memory safety is definitely a spectrum, and I’d argue there are diminishing returns.
With maintenance being a "large" integer multiple of initial development, anything that brings that factor down is probably worth it, even if it comes at an incremental cost in getting your thing out the door.
Do you? Not every bug needs to be fixed. I've never seen a data race bug in documented behaviour make it past initial development.
I have seen data races in undocumented behaviour in production, but as it isn't documented, your program doesn't have to do that! It doesn't matter if it fails. It wasn't a concern of your program in the first place.
That is still a problem if an attacker uses undocumented behaviour to find an exploit, but when it is benign... Oh well. Who cares?
It’s a nice theoretical argument but doesn’t hold up in practice.
I agree with the sentiment that data races are generally harder to exploit, but it _is possible_ to do.
It can be as simple as changing the size of a vector from one thread while the other one accesses it. When executed sequentially, the operations are safe. With concurrency all bets are off. Even with Go. Hence the argument in TFA.
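A minimal sketch of the usual Go answer to the resize-while-reading case, assuming a plain `sync.RWMutex` is acceptable for the workload. `SafeInts` and `fill` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// SafeInts guards a slice with an RWMutex so that growing it and indexing
// into it never overlap.
type SafeInts struct {
	mu   sync.RWMutex
	vals []int
}

// Append grows the slice under the write lock.
func (s *SafeInts) Append(v int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.vals = append(s.vals, v)
}

// At returns element i, or false if i is out of range at the time of the call.
func (s *SafeInts) At(i int) (int, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	if i < 0 || i >= len(s.vals) {
		return 0, false
	}
	return s.vals[i], true
}

// Len reports the current length under the read lock.
func (s *SafeInts) Len() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.vals)
}

// fill appends n values from n goroutines and returns the container.
func fill(n int) *SafeInts {
	s := &SafeInts{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			s.Append(v)
		}(i)
	}
	wg.Wait()
	return s
}

func main() {
	fmt.Println(fill(100).Len()) // prints 100
}
```

Nothing in the language forces callers to go through these methods, which is the point TFA is making; the lock only helps if every access uses it.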
Show me the exploits based on Go parallelism. This issue has been discussed publicly for 10 years yet the exploits have not appeared. That’s why it's a nice theoretical argument but does not hold up in practice.
Nice strawman though
In the meantime, we thankfully have agency and are free to choose not to use global variables and shared memory even if the platform offers them to us.
Modern languages have the option of representing thread-safety in the type system, e.g. what Rust does, where working with threads is a dream (especially when you get to use structured concurrency via thread::scope).
People tend to forget that Rust's original goal was not "let's make a memory-safe systems language", it was "let's make a thread-safe systems language", and memory safety just came along for the ride.
The Rust we have from 1.0 onwards is not what Graydon wanted at all. Would Graydon's language have been broadly popular? Probably not, we'll never know.
Some more modern languages - eg. Swift – have "sendable" value types that are inherently thread safe. In my experience some developers tend to equate "sendable" / thread safe data structures with a silver bullet. But you still have to think about what you do in a broader sense… You still have to assemble your thread safe data structures in a way that makes sense, you have to identify what "transactions" you have in your mental model and you still have to think about data consistency.
Go can already ensure "consistency of multi-word values": use whatever synchronization you want. If you don't, and you put a race into your code, weird shit will happen because torn reads/writes are fuckin weird. You might say "Go shouldn't let you do that", but I appreciate that Go lets me make the tradeoff myself, with a factoring of my choosing. You might not, and that's fine.
But like, this effort to blow data races up to the level of C/C++ memory safety issues (this is what is intended by invoking "memory safety") is polemic. They're nowhere near the same problem or danger level. You can't walk 5 feet through a C/C++ codebase w/o seeing a memory safety issue. There are... zero Go CVEs resulting from this? QED.
EDIT:
I knew I remembered this blog. Here's a thing I read that I thought was perfectly reasonable: https://www.ralfj.de/blog/2021/11/18/ub-good-idea.html. Quote:
"To sum up: most of the time, ensuring Well-Defined Behavior is the responsibility of the type system, but as language designers we should not rule out the idea of sharing that responsibility with the programmer."
To be fair though, go has a big emphasis on using its communication primitives instead of directly sharing memory between goroutines [1].
For example, is the following program safe, or does it race?
    package main

    import (
        "bytes"
        "fmt"
    )

    func processData(lines <-chan []byte) {
        for line := range lines {
            fmt.Printf("processing line: %v\n", line)
        }
    }

    func main() {
        lines := make(chan []byte)
        go processData(lines)

        var buf bytes.Buffer
        for range 3 {
            buf.WriteString("mock data, assume this got read into the buffer from a file or something")
            lines <- buf.Bytes() // sends a slice that aliases buf's backing array
            buf.Reset()          // the next WriteString reuses that same memory
        }
    }
The answer is of course that it's a data race. Why? Because `buf.Bytes()` returns the underlying memory, and then `Reset` lets you re-use the same backing memory, and so "processData" is reading the same data that "main" is overwriting at the same time.
In Rust, this would not compile because it is two mutable references to the same data; you'd either have to send ownership across the channel, or send a copy.
In Go, it's confusing. If you use `bytes.Buffer.ReadBytes('\n')` you get a copy back, so you can send it. Same for `bytes.Buffer.String()`.
But if you use `bytes.Buffer.Bytes()` you get something you can't pass across a channel safely, unless you also never use that bytes.Buffer again.
Channels in rust solve this problem because rust understands "sending" and ownership. Go does not have those things, and so they just give you a new tool to shoot yourself in the foot that is slower than mutexes, and based on my experience with new gophers, also more difficult to use correctly.
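For reference, a sketch of the "send a copy" fix for the buffer example above. The `produce` helper is hypothetical and collects what the consumer saw so the result can be checked. On Go 1.20+ `bytes.Clone` does the same copy as the `append` idiom used here.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// produce writes n mock lines into one reused buffer and sends a private
// copy of each line, so the receiver never aliases the buffer's memory.
func produce(n int) []string {
	lines := make(chan []byte)
	var got []string

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for line := range lines {
			got = append(got, string(line))
		}
	}()

	var buf bytes.Buffer
	for i := 0; i < n; i++ {
		fmt.Fprintf(&buf, "mock line %d", i)
		// Copy before sending: buf.Bytes() aliases buf's backing array,
		// which Reset and the next write will reuse.
		line := append([]byte(nil), buf.Bytes()...)
		lines <- line
		buf.Reset()
	}
	close(lines)
	wg.Wait() // got is only read after the consumer has finished
	return got
}

func main() {
	fmt.Println(produce(3))
}
```

The compiler accepts the aliasing version just as happily as this one, so the copy is a convention you have to remember rather than something Go checks.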
>
> But if you use `bytes.Buffer.Bytes()`
If you're experienced, it's pretty obvious that a `bytes.Buffer` will simply return its underlying storage if you call `.Bytes()` on it, but will have to allocate and return a new object if you call say `.String()` on it.
> unless you also never use that bytes.Buffer again.
I'm afraid that's concurrency 101. It's exactly the same in Go as in any language before it, you must make sure to define object lifetimes once you start passing them around in concurrent fashion.
Channels are nice in that they model certain common concurrency patterns really well - pipelines of processing. You don't have to annotate everything with mutexes and you get backpressure for free. But they are not supposed to be the final solution to all things concurrency and they certainly aren't supposed to make data races impossible.
> Even if you use channels to send things between goroutines, go makes it very hard to do so safely
Really? Because it seems really easy to me. The consumer of the channel needs some data to operate on? Ok, is it only for reading? Then send a copy. For writing too? No problem, send a reference and never touch that reference on our side of the fence again until the consumer is done executing.
Seems about as hard to understand to me as the reason why my friend is upset when I ate the cake I gave to him as a gift. I gave it to him and subsequently treated it as my own!
Such issues only arise if you try to apply concurrency to a problem willy-nilly, without rethinking your data model to fit into a concurrent context.
Now, would the Rust approach be better here? Sure, but not if that means using Rust ;) Rust's fancy concurrency guarantees come with the whole package that is Rust, which as a language is usually wildly inappropriate for the problem at hand. But if I could opt into Rust-like protections for specific Go data structures, that'd be great.
"2. Shared buffer causes race/data reuse You're writing to buf, getting buf.Bytes(), and sending it to the channel. But buf.Bytes() returns a slice backed by the same memory, which you then Reset(). This causes line in processData to read the reset or reused buffer."
I mean, you're basically passing a pointer to another thread to processData() and then promptly trying to do stuff with the same pointer.
    select {
    case <-ctx.Done():
        return context.Cause(ctx)
    case msg := <-ch:
        ...
    }

This isn't anything special, if you want to start dealing with concurrency you're going to have to know about race conditions and such. There is no language that can ever address that because your program will always be interacting with the outside world.
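A self-contained version of the select fragment above, as a sketch. The `recv` and `demo` helpers are hypothetical; `context.WithCancelCause` and `context.Cause` need Go 1.20 or later.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// recv waits for one message, or returns the cancellation cause if ctx is
// done first.
func recv(ctx context.Context, ch <-chan string) (string, error) {
	select {
	case <-ctx.Done():
		return "", context.Cause(ctx)
	case msg := <-ch:
		return msg, nil
	}
}

// demo exercises both branches and returns what each one produced.
func demo() (string, error) {
	// Branch 1: a message is ready and the context is still live.
	ch := make(chan string, 1)
	ch <- "hello"
	msg, _ := recv(context.Background(), ch)

	// Branch 2: the channel is empty and the context was canceled with a cause.
	ctx, cancel := context.WithCancelCause(context.Background())
	cancel(errors.New("shutting down"))
	_, err := recv(ctx, make(chan string))
	return msg, err
}

func main() {
	msg, err := demo()
	fmt.Println(msg, err) // prints "hello shutting down"
}
```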
In contrast to the Go project itself, external users of Go frequently make strong claims about Go's memory safety. fly.io calls Go a "memory-safe programming language" in their security documentation (https://fly.io/docs/security/security-at-fly-io/#application...). They don't indicate what a "memory-safe programming language" is. The owners of "memorysafety.org" also list Go as a memory safe language (https://www.memorysafety.org/docs/memory-safety/). This latter link doesn't have a concrete definition of the meaning of memory safety, but is kind enough to provide a non-exhaustive list of example issues, one of which ("Out of Bounds Reads and Writes") is shown by the article from this post to be something Go does not protect us from, indicating memorysafety.org may wish to update their list.
It seems like at the very least Go and others could make it more clear what they mean by memory safety, and the existence of this kind of error in Go indicates that they likely should avoid calling Go memory safe without qualification.
Yeah... I was actually surprised by that when I did the research for the article. I had to go to Wikipedia to find a reference for "Go is considered memory-safe".
Maybe they didn't think much about it, or maybe they enjoy the ambiguity. IMO it'd be more honest to just clearly state this. I don't mind Go making different trade-offs than my favorite language, but I do mind them not being upfront about the consequences of their choices.
At the time Go was created, it met one common definition of "memory safety", which was essentially "have a garbage collector". And compared to C/C++, it is much safer.
This is the first time I hear that being suggested as ever having been the definition of memory safety. Do you have a source for this?
Given that except for Go every single language gets this right (to my knowledge), I am kind of doubtful that this is a consequence of the term changing its meaning.
How many exploits or security issues have there been related to data races on dual-word values? I have worked with Go for the last 10 years and have never heard of such issues. Not a single time.
For some examples, Rust (although this is not specific to it) uses stack guard pages to detect stack overflows by _forcing_ a segfault (as opposed to reading/writing arbitrary memory after the usual stack). Some JVMs also expect and handle segfaults when dereferencing null pointers, to avoid always paying the cost for checking them.
The violation occurs if the program keeps running after having violated a memory safety property. If the program terminates, then it can still be memory safe in the definition.
Segfaults have nothing to do with the properties. There are some languages or contexts in which segfaults are part of the discussion, but in general, the theory doesn't care about segfaults.
Memory safe languages make it harder to segfault but that's a consequence, not the primary goal. Segfaults are just another memory protection. If memory bugs only ever resulted in segfaults the instant constraints are violated, the hardware protections would be "good enough" and we wouldn't care the same way about language design.
Now the big question, as you mention, is "can it be exploited?" My assumption is that it can, but that there are much lower-hanging fruits. But it's just an assumption, and I don't even know how to check it.
Can you violate memory safety in C# without unsafe{} blocks (or GCHandle/Marshal/etc.)? (No.)
Can you write thread-unsafe code in C# without using unsafe{} blocks etc.? (Yes, just make your integers race.)
Doesn't that contradict the claim that you can't have memory safety without thread safety?
- The above is true
- If I'm writing something using a systems language, it's because I care about performance details that would include things like "I want to spawn and curate threads."
- Relative to the borrow-checker, the Rust thread lifecycle static typing is much more complicated. I think it is because it's reflecting some real complexity in the underlying problem domain, but the problem stands that the description of resource allocation across threads can get very hairy very fast.
The same memory corruption gotchas caused by threads exist, regardless of whether there is a borrow checker or not.
Rust makes it easier to work with non-trivial multi-threaded code thanks to giving robust guarantees at compile time, even across 3rd party dependencies, even if dynamic callbacks are used.
Appeasing the borrow checker is much easier than dealing with heisenbugs. Type system compile-time errors are a thing you can immediately see and fix before problems happen.
OTOH some racing use-after-free or memory corruption can be a massive pain to debug, especially when it may not be possible to reproduce in a debugger due to timing, or hard to catch when the corruption "only" mangles the data instead of crashing the program.
This is an aesthetics argument more than anything else, but I don't think the type theory around threads and memory safety in Rust is as "cooked" as single-thread borrow checking. The type assertions necessary around threads just get verbose and weird. I expect with more time (and maybe a new paradigm after we've all had more time to use Rust) this is a solvable problem, but I personally shy away from Rust for multi-threaded applications because I don't want to please the type-checker.
Just wondering.
Realistically that would be quite rare since it is obvious that this is unprotected shared mutable access. But interesting that such a conversion without unsafe may happen. If it segfaults all the time though then we still have memory safety I guess.
The article is interesting but I wish it would try to provide ideas for solutions then.
But I don't agree with:
> I will argue that this distinction isn’t all that useful, and that the actual property we want our programs to have is absence of Undefined Behavior.
There is plenty of undefined behavior that can't lead to violating memory safety. For example, in many languages, argument evaluation order is undefined. If you have some code like:
foo(print(1), print(2));
In some languages, it's undefined as to whether "1" is printed before "2" or vice versa. But there's no way to violate memory safety with this.
I think the only term the author needs here is "memory safety", and they correctly observe that if the language has threading, then you need a memory model that ensures that threads can't break your memory safety.
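As a data point on the argument-order example: the Go spec says function calls among the operands of an expression are evaluated in lexical left-to-right order, so the same shape of code is deterministic there. A small sketch, with `record`, `sum` and `evalOrder` as made-up names:

```go
package main

import "fmt"

// order records the sequence in which record was called.
var order []int

func record(n int) int {
	order = append(order, n)
	return n
}

func sum(a, b int) int { return a + b }

// evalOrder calls sum(record(1), record(2)) and reports which argument
// expression ran first.
func evalOrder() []int {
	order = nil
	_ = sum(record(1), record(2))
	return order
}

func main() {
	fmt.Println(evalOrder()) // prints [1 2]
}
```

Either order would be equally memory safe; what a language leaves open here is which of two well-defined results you get, not whether the program stays within its memory.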
Go lacks that. It seems to be a rare problem in practice, but if you want guarantees, Go doesn't give you them. In return, I guess it gives you slightly faster execution speed for writes that it allows to potentially be torn.
It was changed as part of the C++11 memory model and now, as you said, there is a sequenced-before order, it is just unspecified which one it is.
I don't know much about C, but I believe it was similarly changed in C11.
You are mixing up non-determinism and UB. Sadly that's a common misunderstanding.
See https://www.ralfj.de/blog/2021/11/18/ub-good-idea.html for an explanation of what UB is, though I don't go into the distinction to non-determinism there.
Memory safety is a much bigger problem.
That's too low a bar to clear to call it safe.
Java got this right. Fil-C gets it right, too. So, there is memory safety without thread safety. And it’s really not that hard.
Memory safety is a separate property unless your language chooses to gate it on thread safety. Go (and some other languages) have such a gate. Not all memory safe languages have such a gate.
> At this point you might be wondering, isn’t this a problem in many languages? Doesn’t Java also allow data races? And yes, Java does allow data races, but the Java developers spent a lot of effort to ensure that even programs with data races remain entirely well-defined. They even developed the first industrially deployed concurrency memory model for this purpose, many years before the C++11 memory model. The result of all of this work is that in a concurrent Java program, you might see unexpected outdated values for certain variables, such as a null pointer where you expected the reference to be properly initialized, but you will never be able to actually break the language and dereference an invalid dangling pointer and segfault at address 0x2a. In that sense, all Java programs are thread-safe.
And:
> Java programmers will sometimes use the terms “thread safe” and “memory safe” differently than C++ or Rust programmers would. From a Rust perspective, Java programs are memory- and thread-safe by construction. Java programmers take that so much for granted that they use the same term to refer to stronger properties, such as not having “unintended” data races or not having null pointer exceptions. However, such bugs cannot cause segfaults from invalid pointer uses, so these kinds of issues are qualitatively very different from the memory safety violation in my Go example. For the purpose of this blog post, I am using the low-level Rust and C++ meaning of these terms.
Java is in fact thread-safe in the sense of the term used in the article, unlike Go, so it is not a counterexample to the article's point at all.
The title is wrong. That's important.
> Java is in fact thread-safe in the sense of the term used in the article
The article's notion of thread safety is wrong. Java is not thread safe by construction, but it is memory safe.
Basically, functional languages make it easier to write code that is safe. But they aren't necessarily the fastest or the easiest to deal with. Erlang and related languages are a good example. And they are popular for good reasons.
Java got quite a few things right but it took a while for it to mature. Modern day Java is quite a different beast than the first versions of Java. The Thread class, API, and the language have quite a few things in there that aren't necessarily that great of an idea. E.g. the synchronized keyword might bite you if you are trying to use the new green threads implementation (you'll get some nice deadlocks if you block the one thread you have that does everything). The modern java.util.concurrent package is implemented mostly without it.
Of course people that know their history might remember that green threads are actually not that new. Java did not actually support real threads until v1.1. Version 1.0 only had green threads. Those went out of fashion for about two decades and then came back with recent versions. And now it does both. Which is dangerous if you are a bit fuzzy on the difference. It's like putting spoilers on your Fiesta. Using green threads because they are "faster" is a good sign that you might need to educate yourself and shut up.
On the JVM, if you want to do concurrent and parallel stuff, Scala and Kotlin might be better options. All the right primitives are there in the JVM of course. And Java definitely gives you access to all of it. But it also has three decades of API cruft and a conservative attitude about staying backwards compatible with all of that. And not all of it was necessarily all that great. I'm a big fan of Kotlin's coroutine support, which is rooted in a lot of experience with that. But that's subjective of course. And Scala-ists will probably insist that Scala has even better things. And that's before we bring up things like Clojure.
Go provides a good balance between ease of use / simplicity and safety. But it has quite a few well documented blind spots as well. I'm not that big of a fan but I appreciate it for what it is. It's actually a nice choice for people that aren't well versed in this topic and it naturally nudges people in a direction where things probably will be fine. Rust is a lot less forgiving and using it will make you a great engineer because your code won't even compile until you properly get it and do it right. But it won't necessarily be easy (humbled by experience here).
With languages the popular "if you have a hammer everything looks like a nail" thing is very real. And stepping out of your comfort zone and realizing that other tools are available and might be better suited to what you are trying to do is a good skill to have.
IMHO Python is actually undervalued. It was kind of shit at all of this for a long time. But they are making a lot of progress modernizing the language and platform and are addressing its traditional weaknesses. Better interpreter and JIT performance, removing the GIL, async support that isn't half bad, etc. A few years down the line, we might wake up one day and find it doing a lot of stuff that we'd traditionally use JVM/Go/Rust for. Acknowledging weaknesses and addressing those is what I'm calling out here as a very positive thing. Oddly, I think there are a lot of Python people that are a bit conflicted about progress like this. I see the same with a lot of old school Java people. You get that with any language that survives that long.
Note how I did not mention C/C++ here so far. There's a lot of it out there. But if you care about safety, you should probably not go near it. I don't care how disciplined you are. Your C/C++ code has bugs. Any insistence that it doesn't just means you haven't found them yet. Possibly because you are being sloppy looking for them. Does it even have tests? There are whole classes of bugs that we can prevent with modern languages and practices. It's kind of negligent and irresponsible not to. There are attempts to make C++ better of course.
The issue with Python isn't just the GIL and lack of support for concurrency. It uses dynamic types (i.e. variant types) for everything. That's way too slow: it means every single variable access must go through a dispatch step. About the only thing Python has going for it is the easy FFI with C-like languages.
Safe Rust doesn't seem that limited to me.
I don't think any of the C# work I do wouldn't be possible in Rust, if we disregard the fact that the rest of the team don't know Rust.
Most of the programs you eliminate when you have these "onerous" requirements like memory safety are nonsense, they either sometimes didn't work or had weird bugs that would be difficult to understand and fix - sometimes they also had scary security implications like remote code execution. We're better off without them IMNSHO.
Go (and previously Swift) fails at this. There, data races can result in UB and thus break memory safety.
I worry about the Win95-era "Microsoft Pragmatism" at work and a concrete example which comes to mind is nullability. In the nice modern software I often work on I can say some function takes a string and in that program C# will tell me that's not allowed to be null, it has to be an actual string - a significant engineering benefit. But, the CLR does not enforce such rules, so that function may still receive a null instead e.g. if called by some ten year old VB.NET code which has no idea about "nullability" and so just fills out a null for that parameter anyway.
Of course the CLR memory model might really be set in stone and 100% proof against such problems, but I haven't seen anything to reassure me as I did for Java and I fear that if it were convenient for Windows to not quite do that work they would say eh, good enough.
It's saying the opposite – that if you want memory safety, thread safety is a requirement – and Java and C# refute it.
A memory safe, managed language doesn't become unsafe just because you have a race condition in a program.
Like, say, reading and writing several related shared variables without a mutex.
Say that the language ensures that the reads and writes themselves of these word-sized variables are safe without any lock, and that memory operations and reclamation of memory are thread safe: there are no low-level pointers (or else only as an escape hatch that the program isn't using).
The rest is your bug; the variable values coming out of sync with each other, not maintaining the invariant among their values.
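To make that concrete, here is a minimal Go sketch of two related variables whose invariant (their sum stays constant) is kept by a mutex. The `accounts` type, its fields, and `runTransfers` are hypothetical names made up for illustration; nothing here comes from the article. It only shows the "invariant among their values" kind of bug and its usual fix, not anything about memory safety.

```go
package main

import (
	"fmt"
	"sync"
)

// accounts is a hypothetical type. Invariant: checking+savings == 1000.
type accounts struct {
	mu       sync.Mutex
	checking int
	savings  int
}

// transfer updates both fields under the lock, so no reader that also
// takes the lock can observe the intermediate state.
func (a *accounts) transfer(amount int) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.checking -= amount
	a.savings += amount
}

// total reads both fields under the same lock.
func (a *accounts) total() int {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.checking + a.savings
}

// runTransfers runs concurrent transfers while repeatedly checking the
// invariant, and reports whether it held on every check.
func runTransfers(workers, rounds int) bool {
	a := &accounts{checking: 1000, savings: 0}
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < rounds; i++ {
				a.transfer(1)
			}
		}()
	}
	ok := true
	for i := 0; i < rounds; i++ {
		if a.total() != 1000 {
			ok = false
		}
	}
	wg.Wait()
	return ok && a.total() == 1000
}

func main() {
	fmt.Println("invariant held:", runTransfers(4, 1000))
}
```

Drop the lock and the two fields can be seen out of sync with each other, which is the application-level bug described above. In Go specifically that would also be a data race under the language's memory model, so the sketch keeps the lock.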
It could be the case that a thread-unsafe program breaks a managed run-time, but not an unvarnished truth.
A managed run-time could be built on the assumption that the program will not create two or more threads that invoke concurrent operations on the same objects. E.g. a managed run-time that needs a global interpreter lock but doesn't have one.
The author's point is that Go is not a memory safe language according to that distinction.
There are values that are a single "atomic" write in the language semantics (interface references, slices) that are implemented with multiple non-atomic writes in the compiler/runtime. The result is that you can observe a torn write and break the language's semantics.
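A deterministic sketch of what a torn write means here, for readers who want to see it without racing goroutines. It is a model, not the Go runtime: `fatValue`, `tornStore` and `describe` are hypothetical names, and the "type word" is just a string. The only thing assumed about real Go is that an interface value occupies more than one machine word, one of which says how to interpret the other.

```go
package main

import "fmt"

// fatValue is a hypothetical two-word model of a multi-word value:
// kind says how data must be interpreted.
type fatValue struct {
	kind string // "int" or "ptr" (stands in for the type word)
	data int    // an integer, or an index into cells when kind == "ptr"
}

// tornStore simulates a non-atomic store: it writes the kind word, lets an
// observer look at the half-updated value, then writes the data word.
func tornStore(dst *fatValue, src fatValue, observe func(fatValue)) {
	dst.kind = src.kind
	observe(*dst) // a concurrent reader could run exactly here
	dst.data = src.data
}

// describe interprets v the way a reader that trusts the kind word would.
func describe(v fatValue, cells []int) string {
	if v.kind == "ptr" {
		if v.data < 0 || v.data >= len(cells) {
			return fmt.Sprintf("ptr to invalid cell %d", v.data)
		}
		return fmt.Sprintf("ptr to cell %d", v.data)
	}
	return fmt.Sprintf("int %d", v.data)
}

// simulate overwrites an "int 42" with a "ptr to cell 0" and returns what a
// reader sees in the middle of the store.
func simulate() string {
	cells := []int{7}
	shared := fatValue{kind: "int", data: 42}
	var seen string
	tornStore(&shared, fatValue{kind: "ptr", data: 0}, func(v fatValue) {
		seen = describe(v, cells)
	})
	return seen
}

func main() {
	fmt.Println(simulate()) // prints "ptr to invalid cell 42"
}
```

The model has a bounds check, so the mixed-up value is merely reported. The claim under discussion is that in real Go the equivalent mix-up ends in a dereference of an integer as a pointer.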
If the language and its runtime let me break their invariant, then that's their bug, not mine. This is the fundamental promise of type-safe languages: you can't accidentally break the language abstraction.
> It could be the case that a thread-unsafe program breaks a managed run-time, but not an unvarnished truth.
I demonstrated that the Go runtime is such a case, and I think that should be considered a memory safety violation. Not sure which part of that you disagree with...
No it isn't, because the torn write cannot have arbitrary effects that potentially break the program. It only becomes such if you rely on such a variable to establish an invariant about memory that's broken if a torn write occurs (such as by encoding a ptr+len in it), which is just silly. Don't do that!
The bad news ought to be obvious, this "goal" is not achievable, it's a fantasy that somehow we should be able to see the future, divine that some value stored won't be needed in the future and thus we don't need to store it. Goals like "We shouldn't store things we can't even refer to" are already solved in languages used today, so a goal to "not have memory leaks" refers only to that unachievable fantasy.
There is no pedestrian safety without mandatory helmet laws.
There is no car safety without driving a tank.
The Wikipedia definition of memory safety is not the Go definition of memory safety, and in Go programs it is the Go definition of memory safety that matters.
The program in the article is obviously racy according to the Go language spec and memory model. So this is all very much tilting at windmills.
(But also, it'd be kind of silly for every language to make up their own definition of memory safety. Then even C is memory safe, they just have to define it the right way. ;)
Relevant bit for the OP is probably:
> A data race is defined as a write to a memory location happening concurrently with another read or write to that same location, unless all the accesses involved are atomic data accesses as provided by the sync/atomic package.
Which describes exactly what is happening in the OP's program:
    func repeat_get() {
        for {
            x := globalVar // <-- unsynchronized read of globalVar
            x.get()        // <-- unsynchronized call to Thing.get()
        }
    }
By itself this isn't a problem: these are just reads, and you don't need synchronization for concurrent reads by themselves. The problem is introduced here:
    func repeat_swap() {
        var myval = 0
        for {
            globalVar = &Ptr { val: &myval } // <-- unsynchronized write to globalVar
            globalVar = &Int { val: 42 }     // <-- unsynchronized write to globalVar
        }
    }
    func main() {
        go repeat_get() // <-- one goroutine is doing unsynchronized reads
        repeat_swap()   // <-- another goroutine is doing unsynchronized writes
    }
Just a (chef's kiss) textbook example of a data race, and a clearly unsound Go program. I don't know how or why the OP believes "this program ... [is] according to Wikipedia memory-safe" -- it very clearly is not.
But, you know, I think everyone here is basically talking past each other.
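For comparison, a race-free variant of the program quoted above might look roughly like this. It is a sketch under assumptions: the excerpt does not show the definitions of `Thing`, `Int` and `Ptr`, so the shapes below are guesses. The loops are bounded so the program terminates, and `store`, `repeatGet` and `repeatSwap` are made-up names. It uses `atomic.Pointer` from `sync/atomic`, which needs Go 1.19 or later; a `sync.RWMutex` around a plain variable would work as well.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Thing, Int and Ptr are assumed shapes; the quoted excerpt omits them.
type Thing interface{ get() int }

type Int struct{ val int }

func (i *Int) get() int { return i.val }

type Ptr struct{ val *int }

func (p *Ptr) get() int { return *p.val }

// globalVar holds a pointer to an interface value. Only the single-word
// pointer is swapped, and it is swapped atomically, so a reader never
// observes half of an interface value.
var globalVar atomic.Pointer[Thing]

// store publishes a fresh copy of t; each call allocates its own Thing.
func store(t Thing) { globalVar.Store(&t) }

// repeatGet does n synchronized reads and sums the results.
func repeatGet(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		x := *globalVar.Load() // atomic load of the pointer
		sum += x.get()
	}
	return sum
}

// repeatSwap does n rounds of synchronized writes, then signals completion.
func repeatSwap(n int, done chan<- struct{}) {
	myval := 0
	for i := 0; i < n; i++ {
		store(&Ptr{val: &myval})
		store(&Int{val: 42})
	}
	close(done)
}

func main() {
	store(&Int{val: 42}) // initialize before any reader runs
	done := make(chan struct{})
	go repeatSwap(100000, done)
	sum := repeatGet(100000)
	<-done
	fmt.Println("finished without a fault, sum =", sum)
}
```

Every access to the shared location now goes through `sync/atomic`, so by the definition quoted above there is no data race, and each `get()` returns either 0 or 42.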