1) There isn't transparent integration of IO into the runtime as in Go or Haskell. Rust probably won't ever do this, because although such a model scales well in general, it creates overhead and requires a runtime.
2) OS threads are difficult to work with compared to a nice M:N threading abstraction (which, again, is the default in Go or Haskell). OS threads lead to lowest-common-denominator APIs (there is no way to kill a thread in Rust) and some difficulty in reasoning about performance implications. I am attempting to solve this aspect by using the mioco library, although due to point #1 IO is going to be a little awkward.
By "transparent integration with the runtime" you mean M:N threading. M:N threading is just delegating work to userspace that the kernel is already doing. There can be valid reasons for doing it, but not offering M:N threading isn't a case of us skipping work we could have done. In fact, Rust had M:N threading for a long time, and we went to great pains to remove it.
In addition to the downsides you mentioned, M:N threading interacts poorly with C libraries, and stack allocation becomes a major problem without a precise GC that can relocate stacks.
M:N will never be as fast as an optimized async/await implementation can be, anyway. There is no way to reach nginx levels of performance with stackful coroutines.
> OS threads leads to lowest common denominator APIs (there is no way to kill a thread in Rust)
This has nothing to do with the reason why you can't kill threads in Rust. We could expose pthread_kill()/pthread_cancel() on Unix and TerminateThread() on Windows if we wanted to. The reason why you can't terminate threads that way is that there's no good reason to: if you have any locks anywhere then it's an unsafe operation.
> some difficulty in reasoning about performance implications.
I would actually expect the opposite to be true: 1:1 is easier to reason about in performance, because there are fewer magic runtime features like moving or segmented stacks involved. Could you elaborate?
In Erlang:

    exit(kill).

or

    exit(Pid, kill).

will kill a process. It has an isolated heap, so it won't affect other (possibly hundreds of thousands of) running processes. That memory will be garbage collected, safely and efficiently.
This will also work in Elixir, LFE and other languages running on the BEAM VM platform.
EDIT: user masklinn below correctly pointed out that the example should be exit/2, that is exit(Pid, kill). In fact it is just exit(Pid, Reason), where Reason can be any other exit reason, say my_socket_failed. However, in that case the process could catch the signal and handle it instead of being unconditionally killed.
I think Erlang might be okay with it, because "this thread can fail at any time" is a core value of Erlang. But it's an exception.
This functionality is critical to being able to timeout a thread.
http://hackage.haskell.org/package/base-4.6.0.1/docs/Control...
This is used to provide the killThread function:
http://hackage.haskell.org/package/base-4.6.0.1/docs/Control...
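In Rust the conventional substitute (a std-only sketch, with names of my own choosing) is to time out the *wait* rather than kill the thread: the worker is left to finish on its own and its result is simply discarded:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `work` on another thread; returns None if the result doesn't
/// arrive before the deadline. The worker itself is never killed.
fn with_timeout<T, F>(work: F, timeout: Duration) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // If the receiver already gave up, send just fails; ignore that.
        let _ = tx.send(work());
    });
    rx.recv_timeout(timeout).ok()
}

fn main() {
    let fast = with_timeout(|| 42, Duration::from_secs(1));
    let slow = with_timeout(
        || {
            thread::sleep(Duration::from_millis(500));
            0
        },
        Duration::from_millis(50),
    );
    println!("fast = {:?}, slow = {:?}", fast, slow);
}
```

The caveat compared to killThread is visible in the code: a timed-out worker keeps running in the background until it finishes on its own.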
This is more of an implementation decision each language makes on a case-by-case basis rather than a built-in you get by default, as in Go.
Think about the interaction with (non-memory) resource ownership. This is just horrible, and I wouldn't even want it in a higher-level language. If you want to carefully notify threads that they must terminate, set up a channel, or write to a shared variable, but please do not just forcibly terminate threads.
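A std-only sketch of the shared-variable approach suggested above (the names are mine): the worker polls a stop flag at points where its resources are in a consistent state, then exits its loop normally so destructors run:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Spawns a worker, asks it to stop after ~20ms, and returns how many
/// iterations it completed before shutting down cleanly.
fn run_until_stopped() -> u64 {
    let stop = Arc::new(AtomicBool::new(false));
    let stop2 = Arc::clone(&stop);

    let worker = thread::spawn(move || {
        let mut iterations = 0u64;
        loop {
            iterations += 1;
            // The flag is only checked here, between units of work,
            // so the worker is never interrupted mid-operation.
            if stop2.load(Ordering::Relaxed) {
                break;
            }
            thread::sleep(Duration::from_millis(1));
        }
        iterations // locals are dropped normally on the way out
    });

    thread::sleep(Duration::from_millis(20));
    stop.store(true, Ordering::Relaxed); // politely ask it to stop
    worker.join().unwrap()
}

fn main() {
    println!("worker stopped cleanly after {} iterations", run_until_stopped());
}
```

A channel works the same way; the key point is that termination is a request the worker honours at a safe point, not something inflicted on it.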
I can't support your argument, because I'm not capable. If I were, I probably wouldn't have been asking in the first place.
The bit about Mutex also has a "just type this to fix the problem with no explanation of how or why it works" flavour (although I guess if you already grok mutexes then its use here might be obvious to you).
Not a criticism of the tutorial, but it does something common in Rust tutorials which is really a problem with the language at this point: Rust tutorials always spend a lot of time interpreting Rust's notoriously poor error messages (e.g. "what this message that doesn't mention Sync is trying to tell you is that you need Sync on this type"). That's great when you're doing the tutorial, but as soon as you're on your own man are those errors frustrating.
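For reference, the shape those `Send`/`Sync` errors are usually pushing you toward is the standard `Arc<Mutex<...>>` pattern (a generic sketch, not code from the tutorial): `Arc` provides thread-safe shared ownership, and `Mutex` provides synchronized access:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Has `n` threads each increment a shared counter once, and returns
/// the final count. With Rc<RefCell<i32>> instead, this would not
/// compile: Rc is not Send, RefCell is not Sync.
fn shared_count(n: usize) -> i32 {
    let counter = Arc::new(Mutex::new(0));

    let handles: Vec<_> = (0..n)
        .map(|_| {
            let counter = Arc::clone(&counter); // each thread owns a handle
            thread::spawn(move || {
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("count = {}", shared_count(4));
}
```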
This makes it very easy to write, for instance, a concurrent in-place quicksort (this example uses the scoped-pool crate, which provides a thread pool supporting scoped threads):
    extern crate scoped_pool; // scoped threads
    extern crate itertools;   // generic in-place partition
    extern crate rand;        // for choosing a random pivot

    use rand::Rng;
    use scoped_pool::{Pool, Scope};

    pub fn quicksort<T: Send + Sync + Ord>(pool: &Pool, data: &mut [T]) {
        pool.scoped(move |scoped| do_quicksort(scoped, data))
    }

    fn do_quicksort<'a, T: Send + Sync + Ord>(scope: &Scope<'a>, data: &'a mut [T]) {
        scope.recurse(move |scope| {
            if data.len() > 1 {
                // Choose a random pivot and swap it to the end.
                let mut rng = rand::thread_rng();
                let len = data.len();
                let pivot_index = rng.gen_range(0, len);
                data.swap(pivot_index, len - 1);

                let split = {
                    // Retrieve the pivot.
                    let mut iter = data.into_iter();
                    let pivot = iter.next_back().unwrap();
                    // In-place partition the rest of the array.
                    itertools::partition(iter, |val| &*val <= &*pivot)
                };

                // Swap the pivot back in at the split point by swapping
                // it with the element currently there.
                data.swap(split, len - 1);

                // Sort both halves (in-place!).
                let (left, right) = data.split_at_mut(split);
                do_quicksort(scope, left);
                do_quicksort(scope, &mut right[1..]);
            }
        })
    }
In this example, quicksort will block until the array is fully sorted, then return.

    for i in 0..3 {
        thread::spawn(move || {
            data[i] += 1;
        });
    }
What is the 'move' thing here before the ||?

    var closures = [];
    for (var i = 0; i < 5; i++) {
        closures.push(function() {
            console.log(i);
        });
    }
The code above will produce incorrect results: 5, 5, 5, 5, 5, because you're capturing `i` as a reference. To avoid this, JS devs typically do this:
    closures.push((function(i) {
        return function() {
            console.log(i);
        };
    })(i));
Or, if you can afford the ES6 support:

    for (let i = 0; i < 5; i++) {
        /* `let`-scoped `i`s are created individually at each
           iteration, so it is safe to capture them by reference. */
    }
Rust supports this pattern with built-in syntax. That's what `move` means: if you prepend the `move` keyword to a closure, the variables are captured by value, not by reference. The effect is like making a per-iteration copy in JS:

    for (var i = 0; i < 5; i++) {
        let i_ = i;
        /* use i_ in the closure */
    }
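A minimal std-only Rust sketch of the same loop (the names are mine): `move` gives each closure its own copy of `i`, so every thread sees the value from its own iteration:

```rust
use std::thread;

/// Spawns five threads, each capturing its own copy of `i` by value,
/// and collects their results.
fn spawn_captured() -> Vec<i32> {
    let handles: Vec<_> = (0..5)
        .map(|i| {
            // `move` copies `i` into the closure; without it, the
            // closure would try to borrow `i` from the loop, which
            // the compiler rejects for a spawned thread.
            thread::spawn(move || i * 10)
        })
        .collect();

    let mut results: Vec<i32> = handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .collect();
    results.sort(); // threads finish in arbitrary order
    results
}

fn main() {
    println!("{:?}", spawn_captured());
}
```

Unlike JS, the by-reference version isn't a silent footgun here: the compiler refuses to let a spawned thread borrow a loop variable at all.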
In your ES6 solution, if you e.g. increment `i` inside the loop after the closure, the closure will see the mutation! The real cause of confusion is mutation combined with JavaScript's scoping rules.
Wouldn't it have been a better approach to add some sort of demarcation, such as i* or i^ (or whatever), to indicate this?
Just curious.
So, "SIMD concurrency" is not incorrect (although SIMD parallelism is more correct :)
    // C#
    int sum = 0;
    Parallel.ForEach(myCollection, item => sum += item);
    // (note: the unsynchronized `sum += item` is a data race;
    // real code would use a lock or Interlocked.Add)
Which operates using a reasonably sized thread pool rather than a thread per item. Something similar in Rust (excuse my rusty Rust) would look like:

    let some_list = ...;
    parallel::for_each(some_list.iter(), |item| {
        // Do thing with each item
    });
I think there is ongoing work on this topic, but it will likely only be library-level and not language-level (just like LINQ is a language-level feature in C# but PLINQ is a library). There are some third-party crates that do this, like simple_parallel. Edit: low-level SIMD exists as intrinsics and of course through LLVM vectorization.
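Even without a library, the shape of such a parallel fold can be sketched with the standard library alone (this uses `std::thread::scope`, which was stabilized much later than this discussion): split the data into one chunk per worker and combine the partial results:

```rust
use std::thread;

/// Sums a slice in parallel by splitting it into one chunk per worker
/// and summing each chunk on its own scoped thread.
fn parallel_sum(data: &[i64], workers: usize) -> i64 {
    let chunk_size = (data.len() + workers.max(1) - 1) / workers.max(1);
    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk_size.max(1))
            .map(|chunk| s.spawn(move || chunk.iter().sum::<i64>()))
            .collect();
        // Scoped threads may borrow `data`, so no Arc is needed.
        handles.into_iter().map(|h| h.join().unwrap()).sum()
    })
}

fn main() {
    let data: Vec<i64> = (1..=1000).collect();
    println!("total = {}", parallel_sum(&data, 4));
}
```

A real data-parallelism crate would add work stealing and a reusable pool; this only shows why the feature can live entirely in a library.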
Also note how much boilerplate one has to write, and how the code snippets bypass error handling ("do it differently in real code", but they don't show us how). Bleh.
This prevents boilerplate issues, and allows the compiler to help you discover threading issues at compile time rather than at runtime.
It's easy enough to just mark all your structs Send+Sync and still shoot your foot off, just like in any language. The point is, you need to be explicit that you're trying to shoot your foot off, as opposed to other languages, which basically pull the trigger for you.
You don't have to worry about most of this. Doing concurrent things in Rust is pretty clean. Designing new concurrent abstractions from scratch is where you need to worry about Send and Sync and be careful. And it's totally worth it, entire classes of concurrency errors just go away.
The error handling can get verbose, though with the new `?` operator and `catch` syntax it's much cleaner now.
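A small std-only sketch of what `?` buys (`catch` was still unstable syntax, so this sticks to `?`): each fallible step early-returns its error instead of needing an explicit `match`:

```rust
use std::num::ParseIntError;

/// Parses and sums a comma-separated list of integers, propagating
/// the first parse failure to the caller via `?`.
fn sum_csv(input: &str) -> Result<i64, ParseIntError> {
    let mut total = 0;
    for field in input.split(',') {
        // `?` unwraps the Ok value or early-returns the Err.
        total += field.trim().parse::<i64>()?;
    }
    Ok(total)
}

fn main() {
    println!("{:?}", sum_csv("1, 2, 3"));
    println!("{:?}", sum_csv("1, two, 3"));
}
```

Before `?`, each `parse` call would have been a three-line `match` or a chain of `try!` macros; the control flow is identical.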
    let mut data = vec![1, 2, 3];
    for i in 0..3 {
        thread::spawn(move || {
            data[i] += 1;
        });
    }

(You can also plug in a custom allocator which behaves differently.)
Had Rust opted for exceptions, it'd be a much better, and actually usable, language. Rust's terrible error-handling strategy is the chief reason not to use it.
malloc can fail, even on default Linux (overcommit enabled), if you go above the process's vmem limit, for instance (because the process is 32-bit or rlimit-constrained). And of course not all OSes overcommit; Windows famously does not.
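Later Rust made this failure mode observable from safe code via `Vec::try_reserve` (stabilized well after this discussion), which reports allocation failure as a value instead of aborting; a minimal sketch:

```rust
/// Demonstrates that an absurd allocation request fails cleanly with
/// an error value rather than aborting the process.
fn huge_alloc_fails() -> bool {
    let mut v: Vec<u8> = Vec::new();
    // usize::MAX bytes can never be allocated; try_reserve returns
    // Err(TryReserveError) instead of calling the abort handler.
    v.try_reserve(usize::MAX).is_err()
}

fn main() {
    println!("huge allocation failed cleanly: {}", huge_alloc_fails());

    // Reasonable requests still succeed as usual.
    let mut v: Vec<u8> = Vec::new();
    println!("small allocation ok: {}", v.try_reserve(1024).is_ok());
}
```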