let role = Role {
    name: "basic",
    flag: 1,
    disabled: false,
};
The language tries to prevent you from interacting with a `Role` object that's not fully initialized. `mem::zeroed()` could work, but then you'll have to turn the `&'static str` into an `Option<&'static str>` or a raw pointer, to indicate that it might be null. You could also add `#[derive(Default)]` to the struct to automatically get a `Role::default()` function, create a `Role` with it, and then modify the fields afterwards, if you want to set the fields in separate statements for some reason:
let mut role = Role::default();
role.name = "basic";
role.flag = 1;
role.disabled = false;
And even with `MaybeUninit` you can initialize the whole struct (without `unsafe`!) with `MaybeUninit::write`. It's just that partially initializing something is hard to get right, which is the point of the article I guess. But I wonder how commonly you would really want that, as it easily leads to mistakes.
Yes, working with uninitialized memory is tedious. But that isn't something you ever have to do. If you're translating some C to Rust, write it using Rust idioms, instead of trying to preserve every call to malloc/free and every access to uninitialized memory.
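For what it's worth, the whole-struct route via `MaybeUninit::write` mentioned above could look roughly like this. It's only a sketch: the `Role` field types are guessed from the snippets in this thread, and `make_role` is a made-up name. The write itself is safe; only the final `assume_init` still needs `unsafe`.

```rust
use std::mem::MaybeUninit;

// Field types are an assumption based on the snippets in this thread.
#[derive(Debug, PartialEq)]
struct Role {
    name: &'static str,
    disabled: bool,
    flag: u32,
}

// Hypothetical helper: initialize the whole struct in one go.
fn make_role() -> Role {
    let mut slot = MaybeUninit::<Role>::uninit();

    // Safe: `write` stores a complete, valid `Role` into the slot.
    slot.write(Role {
        name: "basic",
        flag: 1,
        disabled: false,
    });

    // SAFETY: the slot was fully initialized by `write` just above.
    unsafe { slot.assume_init() }
}
```

Of course, in this form you could just return the struct literal directly, which is sort of the point being made above.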
Anything small enough to clearly make points about unsafe Rust is almost certainly small enough to be done in safe Rust, defeating the purpose.
const struct role r = {
    .name = "basic",
    .flag = 1,
    .disabled = false,
};
(of course this doesn't give you uninitialized memory in case new items are added to the struct, but why would one ever want that?)
I remember that after reading the Rust book, one of the first things I tried to do was to load a struct from a file. Like pseudocode:
struct MY_STRUCT my_struct;
read(file, &my_struct, sizeof(my_struct));
2 lines of code.... It should be simple, right? RiGhT?! Well, the first stackoverflow answer involved unsafe and a bunch of other stuff I didn't understand. And also I thought that as a beginner I shouldn't start fiddling with unsafe right away (otherwise, what's the point? I'm trying to move away from C). Then I learned that structs are not laid out as declared (OMG!), etc.
So, thinking this went beyond my skills I left it aside and tried to make a nice console logging library for my projects. It should be simple! I tried to create a variadic function and you can guess how it went.
I am out of luck with Rust.
Well, no. If you want to have a memory-safe subset, you absolutely cannot initialize structs with a random bag of bytes in the general case. C lets you cut corners here, but in Rust you need to implement (de)serializing logic (no need for unsafe).
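To make the "(de)serializing logic" concrete, here is a minimal sketch with no `unsafe`. Everything specific in it is an assumption for illustration: the `Header` struct, its 7-byte little-endian on-disk format, and the `read_from` name are made up. The idea is just to read raw bytes into a buffer and build each field explicitly, rejecting bytes that aren't valid for the target type.

```rust
use std::io::{self, Read};

// Hypothetical record: 4-byte id, 2-byte flags, 1-byte bool, little-endian.
#[derive(Debug, PartialEq)]
struct Header {
    id: u32,
    flags: u16,
    enabled: bool,
}

impl Header {
    // Size of the assumed on-disk format, not of the Rust struct.
    const SIZE: usize = 7;

    fn read_from<R: Read>(mut reader: R) -> io::Result<Header> {
        let mut buf = [0u8; Self::SIZE];
        reader.read_exact(&mut buf)?;

        let id = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
        let flags = u16::from_le_bytes([buf[4], buf[5]]);

        // Only 0 and 1 are valid `bool` values, so check instead of trusting the file.
        let enabled = match buf[6] {
            0 => false,
            1 => true,
            other => {
                return Err(io::Error::new(
                    io::ErrorKind::InvalidData,
                    format!("invalid bool byte {other}"),
                ))
            }
        };

        Ok(Header { id, flags, enabled })
    }
}
```

A `std::fs::File` or a byte slice can be passed as the reader, since both implement `Read`. It is more than 2 lines, but the byte order and the handling of garbage input are now explicit.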
>Then I learned that structs are not laid out as declared (OMG!), etc. >It should be simple! I tried to create a variadic function and you can guess how it went.
This is only surprising if you have this weird assumption that things should work like they do in C + some extra.
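For the logging case, one way to get something variadic-looking is a slice of trait objects, optionally wrapped in a macro. This is just a sketch; `join_args`, `log_line!` and `demo_log` are made-up names, not anything from a real logging crate.

```rust
use std::fmt::Display;

// Accepts any number of displayable values as a slice of trait objects.
fn join_args(args: &[&dyn Display]) -> String {
    args.iter()
        .map(|arg| arg.to_string())
        .collect::<Vec<_>>()
        .join(" ")
}

// Hypothetical macro that gives call sites a variadic feel.
macro_rules! log_line {
    ($($arg:expr),* $(,)?) => {
        join_args(&[$(&$arg as &dyn Display),*])
    };
}

// Example call mixing a string, an integer and a bool.
fn demo_log() -> String {
    log_line!("user", 42, true)
}
```

This is roughly why `println!` and friends are macros rather than functions.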
One of the central philosophies of Rust is that it should not be possible to execute undefined behavior using only safe code. Rust's underlying core semantics end up being very similar to C's semantics, at least in terms of where undefined behavior can arise, and we can imagine Rust's references as being wrappers around the underlying pointer type that have extra requirements to ensure that they can be safely dereferenced in safe code without ever causing UB.
So consider a simple pointer dereference in C (*p)... how could that cause UB? Well, the obvious ones are that the pointer could be out-of-bounds or pointing to an expired memory location. So references (& and &mut) must point to a live memory location, even in unsafe code. Also pretty obviously, the pointer would be UB were it unaligned, so a Rust reference must be properly aligned.
Another one that should be familiar from the C context is that the memory location must be initialized. So the & reference in Rust means that the memory location must also be initialized... and since &mut implies &, so must &mut. This part is probably genuinely surprising, since it's a rule that doesn't apply to C.
The most surprising rule that applies here as well is that the memory location cannot be a trap representation (to use C's terminology). Yes--C has the same requirement here, but most people probably don't come across a platform that has trap representations in C. The reason why std::mem::uninitialized was deprecated in favor of MaybeUninit was that Rust has a type all of whose representations are trap representations (that's the ! type).
In short, the author is discovering two related issues here. First, the design of Rust is to push all of the burden of undefined behavior into unsafe code blocks, and the downside of that is that most programmers probably aren't sufficiently cognizant of UB rules to do that well. Rust also pushes the UB of pointers to reference construction, whereas C makes most of its UB happen only on pointer dereference (constructing unaligned pointers being the exception).
The second issue is that Rust's syntax is geared to making safe Rust ergonomic, not unsafe Rust. This means that using the "usual" syntax rules in unsafe Rust blocks is more often than not UB, even when you're trying to avoid the inherent UB construction patterns. Struct projection (given a pointer/reference to a struct, get a pointer/reference to a field) is especially implicated here.
These combine when you deal with uninitialized memory references. This is a reasonably common pattern, but designing an always-safe abstraction for uninitialized memory is challenging. And Rust did screw this up, and the stability guidelines mean the bad implementations are baked in for good (see, e.g., std::io::Read).
> [..]
> So why does this type not support zero initialization? What do we have to change? Can zeroed not be used at all? Some of you might think that the answer is #[repr(C)] on the struct to force a C layout but that won't solve the problem.
The type of the first field was switched to a type (&str) that specifically promises it is never null. If the original type (a pointer) was kept, or an Option<&str> was used, mem::zeroed would've worked fine.
If you want safe access to that pointer then wrap it in a struct with an accessor method
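A sketch of the `Option` plus accessor idea, without any `unsafe` at all. The field types, the `"<unnamed>"` fallback and the `option_ref_is_free` helper are assumptions for illustration. `#[derive(Default)]` gives the "zero-like" starting state (`None`, `false`, `0`), and the accessor hides the `Option` from callers.

```rust
use std::mem::size_of;

#[derive(Debug, Default)]
struct Role {
    // `None` plays the part of the null pointer in the C version.
    name: Option<&'static str>,
    disabled: bool,
    flag: u32,
}

impl Role {
    // Accessor so callers never have to deal with the `Option` themselves.
    // The fallback string is an arbitrary choice for this sketch.
    fn name(&self) -> &'static str {
        self.name.unwrap_or("<unnamed>")
    }
}

// `Option<&T>` uses the null niche, so it is no bigger than the bare reference.
fn option_ref_is_free() -> bool {
    size_of::<Option<&'static str>>() == size_of::<&'static str>()
}
```

So the `Option` costs nothing in size, and you get a checked "not set yet" state instead of a null that can be dereferenced by accident.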
This isn't a very good motivating example but I suppose it does the job of showing the various hoops one has to jump through when using unsafe.
I think right now the approach is to make unsafe "safe" (ie std::mem::uninitialized -> MaybeUninit) at the cost of complexity, and eventually to build out improved helpers and abstractions. Obviously this is still ongoing.
But also, just don't write unsafe? It's very easy to avoid.
Yeah, there's a weird subset of developers who insist on mixing unsafe and safe code even when they're presented equally performant, safe alternatives. One such example was the Actix framework, where the lead dev refused to merge any fixes for his unsafe code. Eventually, so many merge requests showed up to fix his broken code that he just gave up the project altogether and let the community take over.
If you want to write unsafe code, I think that's perfectly fine, but Rust is not going to cater to your desires. C and C++ will give you the tools you need with the conveniences you want.
This is a bit of an odd suggestion to me, honestly, you're basically saying Rust is not intended to be a C/C++ replacement. There is definitely a reality that not everything can be written in completely safe Rust, and a lot of the places that Rust could be the most beneficial (Ex. Linux Kernel) are going to require using it.
Handling uninitialized memory is hard in C++ (and C), too.
You just don't notice and accidentally do it slightly wrong (mainly in C++, in C it's harder to mess up).
struct MyStruct foo = {};
This has the effect of initializing all members to zero (or, more precisely, the value which is the same as for objects that have static storage duration [0]).
[0] https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html
See also Stop Memsetting Structures: https://news.ycombinator.com/item?id=19766930
I'm not aware of as many uses of field by field initialization of a struct but there is an example similar to this blog in the docs[1] (without the alignment considerations.)
That said, my read has been that the complexity is accidental, a result of language decisions to improve safe Rust. MaybeUninit was only defined 3 years ago when it was discovered that mem::uninitialized/zeroed resulted in undefined behavior when used with bools. [2][3]
[1] https://doc.rust-lang.org/std/mem/union.MaybeUninit.html#ini...
You still sometimes need it like:
- when highly optimizing some algorithms
- doing FFI
So places you find it include some async runtimes, some algorithm libraries, the standard library.
Still often times you initialize it by fully writing it, not by writing fields.
Anyway rules are simple:
1. use `ptr::write` (or `ptr.write(value)`) instead of `*ptr = value`
2. use `addr_of_mut!((*ptr).x)` instead of `&mut (*ptr).x` to get field pointers
3. uhm, `packed` structs are a mess, if you have some you need to take a lot of additional care, this is not limited to rust but also true for C/C++
Also you do not need `#[repr(C)]`, while the rust-specification is pending and as such `repr(Rust)` is pretty much undefined you still can expect fields to be aligned (as else you would have unaligned-`&` which is quite a problem and would likely cause a bunch of breakage through the eco-system).
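Putting rules 1 and 2 together, a field-by-field initialization could be sketched like this. It follows the same shape as the example in the `MaybeUninit` docs; the `Role` field types and the `init_role_fieldwise` name are assumptions.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

// Field types are an assumption based on the snippets in this thread.
#[derive(Debug, PartialEq)]
struct Role {
    name: &'static str,
    disabled: bool,
    flag: u32,
}

fn init_role_fieldwise() -> Role {
    let mut uninit = MaybeUninit::<Role>::uninit();
    let role = uninit.as_mut_ptr();

    // SAFETY: `role` points into `uninit`, `addr_of_mut!` never creates a
    // reference to uninitialized data, and `write` does not read or drop the
    // old (uninitialized) contents. Every field is written before `assume_init`.
    unsafe {
        addr_of_mut!((*role).name).write("basic");
        addr_of_mut!((*role).flag).write(1);
        addr_of_mut!((*role).disabled).write(false);
        uninit.assume_init()
    }
}
```

Forgetting one of the three writes would still compile, which is the part that makes this pattern easy to get wrong.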
I'd actually like to qualify that a bit. Doing unsafe is not especially hard, just use pointers everywhere instead of references. That's exactly like C (which doesn't even have references), but the syntax is clunkier, you have to use function calls instead of concise operators like * and ->. What is hard is finely interleaving safe and unsafe Rust, as the author is trying to do. That's difficult because the unsafe code has to uphold all the safety invariants of safe Rust, and those are indeed complicated.
You are being nice, but even if there is a documentation error, we can prove that if safe rust isn't completely broken, and `& x.field` is allowed in safe Rust, then fields must be aligned. It is just preposterous Rust would be more broken than C in this regard.
> but the syntax is clunkier, you have to use function calls instead of concise operators like * and ->.
Yes I agree, the syntax does suck. I see the macros use an unstable &raw, that would be more concise.
What I think would be really good is if x->y in Rust matched &x->y in C. That is nicely orthogonal to dereferencing, and always safe.
> There are no guarantees of data layout made by this representation.
So that being the case, there's not really anything stopping them from introducing a situation where an unaligned field in a struct is created in the future. Of course I can't imagine why they would do that, but then maybe my imagination just isn't good enough. I think the author's point here (which is a good one in my opinion) is that when writing `unsafe` you're not supposed to rely on stuff that seems like it should be true, you're supposed to rely on stuff that's guaranteed to always be true, which with Rust isn't all that clearly defined.
[0]: https://doc.rust-lang.org/reference/type-layout.html#the-def...
This isn't just a matter of things "seeming". It is quite literally an implication of Safe Rust works => field offsets must be aligned. There is no other way for safe Rust to be safe, other than alignment not mattering at all because all accesses are careful to pessimistically not rely on it.
I am sorry, but this hypothesis is just completely outside the Overton Window.
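If anyone wants to see the alignment claim on their own machine, a small check could look like the sketch below. The `Role` field types and the `field_layout` helper are assumptions for illustration, and this only observes what one compiler does with a default-layout struct; the actual argument is the soundness one made above.

```rust
use std::mem::{align_of, MaybeUninit};
use std::ptr::addr_of;

// Default (`repr(Rust)`) layout on purpose; field types are an assumption.
#[allow(dead_code)]
struct Role {
    name: &'static str,
    disabled: bool,
    flag: u32,
}

// Returns (offset within the struct, required alignment) for each field.
fn field_layout() -> [(usize, usize); 3] {
    let slot = MaybeUninit::<Role>::uninit();
    let base = slot.as_ptr();
    let start = base as usize;

    // SAFETY: `addr_of!` only computes addresses inside `slot`; nothing is
    // read and no reference to uninitialized memory is created.
    unsafe {
        [
            (addr_of!((*base).name) as usize - start, align_of::<&'static str>()),
            (addr_of!((*base).disabled) as usize - start, align_of::<bool>()),
            (addr_of!((*base).flag) as usize - start, align_of::<u32>()),
        ]
    }
}
```

Every offset comes out as a multiple of the field's alignment, which is what has to hold for `&role.flag` to be allowed in safe code.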
It does however make things a bit clunky since the unsafe bits need to ensure a “safe” state throughout the whole block and not only by the end of it.
Why is this problematic? (I presume there is a fatal flaw as it seems too obvious of a solution..)
auto name = reinterpret_cast<std::string *>(malloc(sizeof(std::string))); memset(name, 0, sizeof(std::string)); *name = "basic";
But on the stack.
Where is this coming from? It's literally not true. The MIR for this has:
((*_3).0: &str) = const "basic";
((*_3).2: u32) = const 1_u32;
((*_3).1: bool) = const false;
So it's only going to do a raw offset and then assign to it, which is identical to `*ptr::addr_of_mut!((*role).field) = value`.
Sadly there's no way to tell miri to consider `&mut T` valid only if `T` is valid (that choice is not settled yet, AFAIK, at the language design level), in order to demonstrate the difference (https://github.com/rust-lang/miri/issues/1638).
The other claim, "dereferencing is illegal", is more likely, but contrary to popular misconception, "dereference" is a syntactic concept, that turns a (pointer/reference) "value" into a "place".
There's no "operation" of "dereference" to attach dynamic semantics to. After all, `ptr::addr_of_mut!(*p).write(x)` has to remain as valid as `p.write(x)`, and it does literally contain a "dereference" operation (and so do your field projections).
So it's still inaccurate. I believe what you want is to say that in `place = value` the destination `place` has to hold a valid value, as if we were doing `mem::replace(&mut place, value)`. This is indeed true for types that have destructors in them, since those would need to run (which in itself is why `write` on pointers exists - it long existed before any of the newer ideas about "indirect validity" in recent years).
However, you have `Copy` types there, and those are definitely not different from `<*mut T>::write` to assign to, today. I don't see us having to change that, but I'm also not seeing any references to where these ideas are coming from.
> I'm pretty sure we can depend on things being aligned
What do you mean "pretty sure"? Of course you can, otherwise it would be UB to allow safe references to those fields! Anything else would be unsound. In fact, this goes hand in hand with the main significant omission of this post: this is not how you're supposed to use `MaybeUninit`.
All of this raw pointer stuff is a distraction from the fact that what you want is `&mut MaybeUninit<FieldType>`. Then all of the things about reference validity are necessarily true, and you can safely initialize the value. The only `unsafe` operation in this entire blog post, that isn't unnecessarily added in, is `assume_init`.
What the author doesn't mention is that Rust fails to let you convert between `&mut MaybeUninit<Struct>` and some hypothetical `&mut StructBut<replace Field with MaybeUninit<Field>>` because the language isn't powerful enough to do it automatically. This was one of the saddest things about `MaybeUninit` (and we tried to rectify it for at least arrays).
This is where I was going to link to a custom derive that someone has written to generate that kind of transform manually (with the necessary check for safe field access wrt alignment). To my shock, I can't find one. Did I see one and did it have a funny name? (the one thing I did find was a macro crate but unlike a derive those have a harder time checking everything so I had to report https://github.com/youngspe/project-uninit/issues/1)
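Done by hand, the transform described above could be sketched like this. `RoleSlots`, its methods and `build_role` are hypothetical names, and the `Role` field types are assumed from the thread. Each field is its own `MaybeUninit`, so the writes are safe and the only `unsafe` left is the final `assume_init`.

```rust
use std::mem::MaybeUninit;

// Field types are an assumption based on the snippets in this thread.
#[derive(Debug, PartialEq)]
struct Role {
    name: &'static str,
    disabled: bool,
    flag: u32,
}

// Hypothetical hand-written mirror of `Role` with every field wrapped.
struct RoleSlots {
    name: MaybeUninit<&'static str>,
    disabled: MaybeUninit<bool>,
    flag: MaybeUninit<u32>,
}

impl RoleSlots {
    fn new() -> Self {
        RoleSlots {
            name: MaybeUninit::uninit(),
            disabled: MaybeUninit::uninit(),
            flag: MaybeUninit::uninit(),
        }
    }

    /// SAFETY: the caller must have written every field first.
    unsafe fn assume_init(self) -> Role {
        unsafe {
            Role {
                name: self.name.assume_init(),
                disabled: self.disabled.assume_init(),
                flag: self.flag.assume_init(),
            }
        }
    }
}

fn build_role() -> Role {
    let mut slots = RoleSlots::new();

    // All three writes are safe: each one goes through `&mut MaybeUninit<Field>`.
    slots.name.write("basic");
    slots.flag.write(1);
    slots.disabled.write(false);

    // SAFETY: every field was written above.
    unsafe { slots.assume_init() }
}
```

The boilerplate of keeping `RoleSlots` in sync with `Role` is exactly what a derive would be generating.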
This in Rust:
let mut role: Role = mem::zeroed();
Is not the same as this in C:
struct role r;
C does not zero initialize.
That said, I am curious why different people have these different feelings. One aspect is likely rooted in the fact all of our brains are different. But I also wonder if first impressions play a big role here. A good example of a cryptic rust error is the `expected type Foo, but found type Foo` error message which is very inscrutable, especially to new users. There are also some lifetime errors that can be hard to understand.
I wonder if someone encounters these types of messages very early on in their learning experiences, the unpleasantness of having to decipher them colors the rest of their learning experiences.
Path dependency has a big impact on what seems natural, intuitive, etc. Part of that is what you've done before, and part is first impressions, and part of it is your approach to learning (or the approach taken to teaching you) the subject at hand.
I’ve approached Rust via different books and tutorials before and come up with the “it’s awesome, but too hard” feeling and set it aside.
Recently I've been trying Hands-on Rust [0] and going off to the side from it and Rust is clicking pretty well. Not sure if the book is a better fit for me, if the past false starts have prepared the ground, or what specifically changed.
On the other hand, if you're fine with a garbage collector, which most people are most of the time, then Go is going to feel more natural. For some people, Go is more comparable to Python than Rust, because of this one big difference.
Does Rust actually give an error like `expected type Foo, but found type Foo`, as in both types are the same in the error? I don't think I've seen that before, but I don't write much Rust.
If both types are the same, what does the error mean?
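As far as I know, that message shows up when two different types happen to share a name, most commonly when two versions of the same crate end up in the dependency tree. A small sketch of the situation, with two made-up modules `v1` and `v2` standing in for the two crate versions:

```rust
// Two unrelated types that happen to share the name `Foo`.
mod v1 {
    #[derive(Debug, PartialEq)]
    pub struct Foo(pub u32);
}

mod v2 {
    #[derive(Debug, PartialEq)]
    pub struct Foo(pub u32);
}

fn takes_v1(foo: v1::Foo) -> u32 {
    foo.0
}

// An explicit conversion is one way to bridge the two types.
impl From<v2::Foo> for v1::Foo {
    fn from(foo: v2::Foo) -> Self {
        v1::Foo(foo.0)
    }
}

fn convert_and_call() -> u32 {
    let foo = v2::Foo(7);
    // `takes_v1(foo)` would be rejected here: the function wants `v1::Foo`
    // but got `v2::Foo`. Printed with short names, that reads as
    // "expected Foo, found Foo".
    takes_v1(foo.into())
}
```

So the types are not actually the same, they just print the same when the path is left out.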
In fairness, I have found the compiler errors to be extremely helpful. They often tell me exactly what to fix. But honestly, they shouldn't have to do that. The syntax should have been obvious from the beginning, as it is in most programming languages.
That hasn't been the case for years. The 2018 edition officially stabilized Non-Lexical Lifetimes, allowing tons of valid programs to work. There have been a lot of other improvements since then to address papercuts.
At this point Rust is a pretty easy language to learn imo.
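A tiny example of the kind of program NLL made legal (the function name is made up). With the old lexical rules the shared borrow `first` lasted until the end of the scope, so the `push` was rejected; with NLL the borrow ends at its last use.

```rust
// Assumes `v` is non-empty; indexing panics otherwise.
fn push_copy_of_first(mut v: Vec<i32>) -> Vec<i32> {
    let first = &v[0]; // shared borrow of `v` starts here
    let copy = *first; // last use of `first`, so the borrow ends here under NLL
    v.push(copy);      // mutable borrow is accepted
    v
}
```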
Over the years the following C++ aspects have and continue to drive me nuts. On a scale of (Sarcasm coming next) 10:
* (7/10) build time, and dependency management with the 81 million tools, formats, approaches to deal with this.
* (1/10) 21 page error sets resulting from single word typos in templated code: even Egyptologists ask how do we put up with that? At least we have nice rocks to look at. Yah, they have rocks. How come we don't have rocks?
* (1/10) Code decl duplication between .h/.cpp
* (1/10) Long, pointless C++ errors. if you have an orthodox background the guilt trip C++ lays on for mismatched function calls is legion; you really feel it. It tells you such-n-such function could not be matched ... but look at the 42 million function calls you could have made ... should have made ... and it's killing me you didn't make; you seeing the effort I'm putting into this? It's killing me ... I just need the cop to say it straight: you screwed up. Ticket. See in court. Have a nice day. I prefer one liners.
I'm in no rush to waste time on Rust. I'd rather stick to GO where I can, Zig when I can. At the office we're 65% C++, 25% C. Stack overflows, memory corruptions do occur periodically; I've sorted several of those out. But the new code is C++ which makes heavy use of STL. Much better. There is growing adoption in GO and Rust, but the vast amount of C/C++ code means the apple will not fall too far from the tree.
So, it's quite possible for various people to feel more comfortable using Golang than Rust, and the opposite can be true as well, where using Rust is preferred over Golang. At the end of the day, it will usually come down to individual or corporate priorities.
Another of the newer languages that fall into a similar usage space would be Vlang (https://github.com/vlang/v). It being debatably easier to learn and use, in the context of languages that more easily interact with C.
[...]
> The syntax seems overcomplicated, the compiler errors are cryptic, the IDE is not helpful.
Yes, Rust is hard to learn. Rust does _seem_ over-complicated.
However I like to compare Rust to exercise. You need to do a bit of it before you start reaping the benefits of it.
If you suspend your judgement for a bit and try to write some Rust, starting from the very beginning you will find that:
- Rust is actually a small language at its core, unlike the monstrosity that is C++ . You don't really need Advanced Rust to be productive. Use Advanced Rust only when you're... advanced
- Rust actually is very consistent
- The Rust compiler is actually very helpful. It's the least cryptic compiler I've met. But it's OK if you don't feel that way yet, as you're just beginning your journey with Rust
Avoid the temptation to "read" Rust from a book. Try to _do_ Rust. Otherwise it might overwhelm you. Simply keep adding Rust techniques to your arsenal as you mature in your usage of Rust.
Learning Rust changed the way I look at programming. Rust is a beautiful language. As a random example, just look at the Firecracker VMM written in Rust -- https://github.com/firecracker-microvm/firecracker . It would have been very difficult for me to understand the codebase if it were written in C/C++!
Rust is one of those rare languages I've encountered that if the code compiles, there is a high probability it will work. The type system is that good!
TL;DR Persist and you will reap the rewards with Rust.
I've been watching the videos by Andreas Kling on SerenityOS and the code is so clean that at first I wondered what programming language I was even looking at. I see the SerenityOS codebase as proof that if C++ programmers wanted to write modern, elegant, readable code, they definitely could.
In practice, though, most C++ programs are full of legacy code or are written by people who don't necessarily know about or agree with modern ways to program C++. It's easy to write beautiful code if you also wrote the memory manager and standard library in modern C++, but most people don't have that luxury.
By being created with a more modern standard library, Rust has an advantage over C++. There is no legacy code to remain compatible with and there is no real way to write "old-fashioned" Rust because the project hasn't existed for long enough. I've seen plenty of terrible, ugly Rust, most of it in my own personal projects. The strictness of the language and standard toolset helps, but it's far from a guarantee that enterprise Rust will be readable and clear.
I find rust and modern C++ codebases equally hard to understand. How can you be so sure that the reason you find it easier to understand is not simply because you know rust better than C or C++?
I used to be the opposite in ancient times. I would read a book and then start programming. But then the books were relatively tiny. Now most languages have matured to the point of having an insane amount of features. And the result is definitely what you say. Just read some brief overview of the basic language constructs and then proceed by learning on an as-needed basis as you progress.
Among the others I program in C++ for example but I would shoot myself if asked to read something resembling its complete description. Sorry I have a life to live. And I am using only subset of C++ that solves my particular needs. If I feel my code does not look nice when doing some particular stuff then the time comes to do some more reading.
Perhaps the problem was that I was overconfident and immediately started with pretty advanced Rust - writing a web assembly that performs big data analysis using the "polars" library.
Absolutely not, you can still use a mutable reference:
let role = &mut *uninit.as_mut_ptr();
role.name = "basic";
> A mutable reference must also never point to an invalid object, so doing let role = &mut *uninit.as_mut_ptr() if that object is not fully initialized is also wrong.
I'm curious who's right here, because I've seen your pattern in code recently!
> Incorrect usage of this method:
let mut x = MaybeUninit::<Vec<u32>>::uninit();
let x_vec = unsafe { &mut *x.as_mut_ptr() };
// We have created a reference to an uninitialized vector! This is undefined behavior.
Also, above, it explicitly describes the intended API for partially initializing a struct: https://doc.rust-lang.org/stable/std/mem/union.MaybeUninit.h...
(*role).name = "basic";
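For contrast with the incorrect usage quoted above, the accepted ordering is to initialize first and only then create the reference. A sketch along the lines of the "correct usage" shown in the same docs (the function name is made up):

```rust
use std::mem::MaybeUninit;

fn build_vec() -> Vec<u32> {
    let mut x = MaybeUninit::<Vec<u32>>::uninit();

    // Initialize first...
    x.write(vec![0, 1, 2]);

    // ...and only then create a reference.
    // SAFETY: `x` holds a fully initialized `Vec<u32>` at this point.
    let x_vec: &mut Vec<u32> = unsafe { &mut *x.as_mut_ptr() };
    x_vec.push(3);

    // SAFETY: still initialized; taking the value out also makes sure the
    // `Vec` gets dropped normally by the caller.
    unsafe { x.assume_init() }
}
```

The `&mut *uninit.as_mut_ptr()` pattern is only a problem when it runs before the value is fully written.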