... how so ? they were just CoW which is actually I think the better choice most of the time... now there are copies all over the place
So I've got the string "IR Baboon big star of cartoon" and I take references into it, which are cheap. Then you use your C++ 98 copy constructor to get another string, which of course also says "IR Baboon big star of cartoon" at the moment you took it. Then I scrawl "I AM Weasel" on top of my string using my reference, and now your string has changed too, because it was COW.
If you liked COW for this purpose, Rust has std::borrow::Cow, which is a smart pointer with a similar flavour. Cow<T> is a sum type that's either a thing you own (and thus could modify) or a reference, &T (and thus you can't modify it), but which promises you could get an owned thing (e.g. for strings by deep-copying the string) if you need one. Methods that would be OK to call on the immutable reference (e.g. asking how many times an ASCII digit appears in the string) work on Cow<T>, and if you find you need to mutate it (maybe in a rare case) you can ask the Cow for the mutable version with to_mut(): if it already had the owned version you get that, if not it will make one for you.
Rust's traits kick in here. Cow<T> requires T: ToOwned, which is a trait saying "given an immutable reference to T, I can make a new thing you own", and that owned thing need not be a T itself. Types you shouldn't do that to simply do not implement ToOwned, and so you can't make a Cow of those types. The standard library provides in particular an implementation of ToOwned for str, which makes Strings from it.
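For anyone who thinks in C++ rather than Rust, here's a rough sketch of the same borrowed-or-owned idea. To be clear, this is not Rust's API: CowString, view(), is_owned(), to_mut() and count_digits() are names I made up for illustration, it assumes C++17, and unlike Rust nothing here checks that the borrowed text outlives the wrapper.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <variant>

// Hypothetical C++17 analogue of a borrowed-or-owned string.
// Illustration only; the caller must keep borrowed text alive.
class CowString {
public:
    explicit CowString(std::string_view borrowed) : data_(borrowed) {}
    explicit CowString(std::string owned) : data_(std::move(owned)) {}

    // Read-only access works in either state and never copies.
    std::string_view view() const {
        if (const auto* owned = std::get_if<std::string>(&data_)) {
            return *owned;
        }
        return std::get<std::string_view>(data_);
    }

    bool is_owned() const {
        return std::holds_alternative<std::string>(data_);
    }

    // Mutable access: deep-copies the borrowed text on first use only.
    std::string& to_mut() {
        if (const auto* borrowed = std::get_if<std::string_view>(&data_)) {
            data_ = std::string(*borrowed);
        }
        return std::get<std::string>(data_);
    }

private:
    std::variant<std::string_view, std::string> data_;
};

// A read-only operation: counts ASCII digits without needing ownership.
std::size_t count_digits(std::string_view text) {
    std::size_t n = 0;
    for (char c : text) {
        if (c >= '0' && c <= '9') ++n;
    }
    return n;
}
```

Reading through view() never allocates; only the first to_mut() call on borrowed text pays for the deep copy, which is the "maybe in a rare case" path.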
> and then I scrawl "I AM Weasel" on top of my string using my reference and now your string was changed because it was COW.
I mean, that's the point of references... no ? If I wanted a different object I'd make a copy.
Like, even with just one string, without any CoW, your post makes it sound like you'd be surprised that if you had:
void set_some_config(const char*);
char* get_some_config();
std::string s = "foo";
set_some_config(&s[1]);
s = "bar";
get_some_config();
you'd get "ar" in get_some_config().
In the explanation I posted, you do make a copy to get a different object: "you use your C++ 98 copy constructor to get another string".
The problem happens because both strings share the same bytes to represent the text "IR Baboon big star of cartoon" as part of the COW optimisation. But my reference can scribble on this shared text.
I don't see how your get_some_config is similar at all. Notice that with C++ 11 strings, the copy constructor gives you a deep copy of that "IR Baboon" text and so my references can't smash your string.
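Here's a minimal sketch of the C++ 11 side of that, assuming a conforming C++ 11 (or later) standard library. The function name copy_then_scribble is made up for the example. It takes a reference first, then copies, then writes through the reference.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Returns {mine, yours} after writing through a reference into `mine`
// that was taken before `yours` was copied from it.
std::pair<std::string, std::string> copy_then_scribble() {
    std::string mine = "IR Baboon big star of cartoon";
    char* ref = &mine[0];       // cheap reference into mine's bytes
    std::string yours = mine;   // C++11 copy constructor: separate buffer

    // Scrawl over the start of `mine` through the old reference.
    const std::string weasel = "I AM Weasel";
    for (std::size_t i = 0; i < weasel.size(); ++i) {
        ref[i] = weasel[i];
    }
    return {mine, yours};
}
```

With a deep-copying std::string, the second element still reads "IR Baboon big star of cartoon" and only the first one has been scribbled on.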
Is a single atomic increment really that expensive? I mean we are not even talking about a full memory barrier here, just the atomic increment's implied acquire and release on the single variable. Other operations not dependent on a subsequent read could still be re-ordered in both directions.
And also keep in mind that the alternative was copying the whole string instead, which means both a heap memory allocation (which is often pretty expensive, even with per-core heaps) and the actual copying. Unless a platform has a terrible implementation of atomic increment, or you have a std::string that is frequently getting copied on multiple cores (so as to have meaningful contention), I would have expected the actual copying implementation to be slower. But I'm not super familiar with the timings of these things, so I certainly could be mistaken.
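To make the two options concrete, here's a sketch that uses std::shared_ptr as a stand-in for a reference-counted string. That's my assumption for illustration, not how any particular std::string was implemented, and SharedText, make_shared_text, share and deep_copy are made-up names. It shows what each kind of copy does, not how long it takes.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical handle type: all copies share one immutable buffer.
using SharedText = std::shared_ptr<const std::string>;

SharedText make_shared_text(std::string text) {
    return std::make_shared<const std::string>(std::move(text));
}

// Copying the handle only bumps the reference count; the text itself
// is neither allocated again nor copied.
SharedText share(const SharedText& text) {
    return text;
}

// Deep copy: a new std::string with its own copy of the characters.
std::string deep_copy(const SharedText& text) {
    return *text;
}
```

After share() both handles point at the same std::string object, while deep_copy() returns a string whose character storage is separate from the original.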
My understanding was that the change was more about being able to set proper bounds on some operations: ensuring .c_str() is always O(1) rather than sometimes O(n), and similarly for string writes, etc.