They propose just using an async iterator of UInt8Array. I almost like this idea, but it's not quite all the way there.
They propose this:
    type Stream<T> = {
      next(): Promise<{ done, value: Uint8Array }>
    }

I propose this, which I call a stream iterator:

    type Stream<T> = {
      next(): { done, value: T } | Promise<{ done, value: T }>
    }
Obviously I'm gonna be biased, but I'm pretty sure my version is also objectively superior:

- I can easily make mine from theirs
- In theirs the conceptual "stream" is defined by an iterator of iterators, meaning you need a for loop of for loops to step through it. In mine it's just one iterator and it can be consumed with one for loop.
- I'm not limited to having only streams of integers; they are
- My way, if I define a sync transform over a sync input, the whole iteration can be sync making it possible to get and use the result in sync functions. This is huge as otherwise you have to write all the code twice: once with sync iterator and for loops and once with async iterators and for await loops.
- The problem with thrashing Promises when splitting input up into words goes away. With async iterators, creating two words means creating two promises. With stream iterators if you have the data available there's no need for promises at all, you just yield it.
- Stream iterators can help you manage concurrency, which is a huge thing that async iterators cannot do. Async iterators can't do this because if they see a promise they will always wait for it. That's the same as saying "if there is any concurrency, it will always be eliminated."
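For concreteness, here is a hedged sketch (my names, not part of either proposal) of a consumer for such a stream iterator: it stays on the sync path and only pays for an await when the producer actually hands back a Promise-valued step.

```typescript
type Step<T> = { done: boolean; value?: T };
type StreamIterator<T> = { next(): Step<T> | Promise<Step<T>> };

// Drain a stream iterator into a sink, awaiting only when the
// producer returns a Promise-valued step.
async function drain<T>(
  iter: StreamIterator<T>,
  sink: (value: T) => void,
): Promise<void> {
  while (true) {
    let step = iter.next();
    if (step instanceof Promise) step = await step; // pay for async only when forced to
    if (step.done) return;
    sink(step.value as T);
  }
}
```

A fully sync consumer could use the same shape but throw (or defer) on Promise-valued steps instead of awaiting them.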
> - I can easily make mine from theirs
That... doesn't make it superior? On the contrary, theirs can't be easily made out of yours, except by either returning trivial 1-byte chunks, or by arbitrary buffering. So their proposal is a superior primitive.
On the whole, I/O-oriented iterators probably should return chunks of T, otherwise you get buffer bloat for free. The readv/writev syscalls were introduced for a reason, you know.
This lines up with my thinking. The proposal should give us a building block in the form of the primitive. I would expect the grandparent comment’s API to be provided in a library built on top of a language level primitive.
Plus theirs involves the very concrete definition of an array, which might have 100 prototype methods in JS, each part of their API surface. I have one function in my API surface.
From what it looks like, they want their streams to be compatible with AsyncIterator so it'd fit into existing ecosystem of iterators.
And I believe the Uint8Array is there to match OS streams, as they tend to move batches of bytes without any knowledge of the data inside. It's probably not intended as an entirely new concept of a stream, but something that C/C++, or any other language that provides functionality for JS, can do underneath.
For example, my personal pet project of a graph database written in C has observers/observables that are similar to the AsyncIterator streams (except one observable can be listened to by more than one observer), moving about batches of Uint8Array (or rather uint8_t* buffers with capacity/count), because it's one of the fastest and easiest things to do in C.
It'd be a lot more work to use anything other than uint8_t* batches for streaming data. What I mean by that is that any other protocol that is aware of the type information would be built on top of the streams, rather than being part of the stream protocol itself, for this reason.
And yes, because it's a new abstraction, the compat story is interesting. We can easily wrap any source, so we'll have loads of working sources. The fight will be getting official data sinks that support a new kind of stream.
While I understand the logic, that's a terrible idea.
* The overhead is massive. Now every 1KiB turns into 1024 objects. And terrible locality.
* Raw byte APIs...network, fs, etc fundamentally operate on byte arrays anyway.
In the most respectful way possible...this idea would only be appealing to someone who's not used to optimizing systems for efficiency.
Small, short-lived objects with known key ordering (monomorphism) are not a major cost in JS because the GC design is generational. The smallest, youngest generation of objects can be quickly collected with an incremental GC because the perf assumption is that most of the items in the youngest generation will be garbage. This allows collection to be optimized by first finding the live objects in the gen0 pool, copying them out, then throwing away the old gen0 pool memory and replacing it with a new chunk.
My reference point is from a noob experience with Golang - where I was losing a bunch of efficiency to channel overhead from sending millions of small items. Sending batches of ~1000 instead cut that down to a negligible amount. It is a little less ergonomic to work with (adding a nesting level to your loop).
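The same batching fix carries over directly to JS iterators; a minimal sketch:

```typescript
// Group an iterable into arrays of up to `size` items, so per-item
// overhead (promises, channel sends, stack switches) is paid once per
// batch instead of once per item.
function* batched<T>(items: Iterable<T>, size: number): Generator<T[]> {
  let batch: T[] = [];
  for (const item of items) {
    batch.push(item);
    if (batch.length === size) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) yield batch; // flush the final partial batch
}
```

The cost is exactly the ergonomic hit described above: consumers need one extra level of loop nesting.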
If you go back a few versions, that number goes up to around 105x. I don’t recall now if I tested back to 14. There was an optimization to async handling in 16 that I recall breaking a few tests that depended on nextTick() behavior that stopped happening, such that the setup and execution steps started firing in the wrong order, due to a mock returning a number instead of a Promise.
I wonder if I still have that code somewhere…
You can run this to see the overhead for Node.js, Bun, and Deno: https://gist.github.com/billywhizz/e8275a3a90504b0549de3c075...
I dabble in JS and… what?! Any idea why?
Adding types on top of that isn't a protocol concern but an application-level one.
[0] https://github.com/microsoft/TypeScript/blob/924810c077dd410...
I agree with this.
I have had to handle raw byte streams at lower levels for a lot of use-cases (usually optimization, or when developing libs for special purposes).
It is quite helpful to have the choice of how I handle the raw chunks of data that get queued up and out of the network layer to my application.
Maybe this is because I do everything from C++ to Javascript, but I feel like the abstractions of cleanly getting a stream of byte arrays is already so many steps away from actual network packet retrieval, serializing, and parsing that I am a bit baffled folks want to abstract this concern away even more than we already do.
I get it, we all have our focuses (and they're ever growing in Software these days), but maybe it's okay to still see some of the bits and bytes in our systems?
    type Stream<T> = {
      next(): { done, value: T } | Promise<{ done, value: T }>
    }
The above can effectively be discussed as a combination of the following:

    type Stream<T> = {
      next(): { done, value: T }
    }

    type Stream<T> = {
      next(): Promise<{ done, value: T }>
    }
You've covered the justifications for the 2nd signature, but it's a messy API. Specifically:

> My way, if I define a sync transform over a sync input, the whole iteration can be sync making it possible to get and use the result in sync functions. This is huge as otherwise you have to write all the code twice: once with sync iterator and for loops and once with async iterators and for await loops.
Writing all the code twice is cleaner in every implementation scenario I can envisage. It's very rare I want generalised flexibility on an API call - that leads to a lot of confusion & ambiguity when reading/reviewing code, & also when adding to/editing code. Any repetitiveness in handling both use-cases (separately) can easily be handled with well thought-out composition.
But the bigger problem here is that sync and async aren't enough. You almost need to write everything three times: sync, async, and async-batched. And that async-batched code is gonna be gnarly and different from the other two copies and writing it in the first place and keeping it in sync is gonna give you headaches.
To see how it played out for me take a look at the difference between:
https://github.com/iter-tools/regex/blob/a35a0259bf288ccece2... https://github.com/iter-tools/regex/blob/a35a0259bf288ccece2... https://github.com/iter-tools/regex/blob/a35a0259bf288ccece2...
>> fn double(iter: $iterator<i32>) {
return *for x in iter { $yield( x * 2 )}
}
>> fn add_ten(iter: $iterator<i32>) {
return *for x in iter { $yield( x + 10 )}
}
>> fn print_all(iter: $iterator<i32>) {
for x in iter { $print( x )}
}
>> const source = *for x in [1, 2, 3] { $yield( x )}
>> source |> double |> add_ten |> print_all
12
14
16
You get backpressure for free, and the compiler can make intelligent decisions, such as automatic inlining, unrolling, kernel fusing, etc. depending on the type of iterators you're working with.

They are those languages' versions of goroutines, and JavaScript doesn't have one. Generators sort of, but people don't use them much, and they don't compose them with each other.
So if we are going to fix Streams, an implementation that is tuned only for IO-bound workflows at the expense of transform workflows would be a lost opportunity.
To see the problem let's create a stream with feedback. Lets say we have an assembly line that produces muffins from ingredients, and the recipe says that every third muffin we produce must be mushed up and used as an ingredient for further muffins. This works OK until someone adds a final stage to the assembly line, which puts muffins in boxes of 12. Now the line gets completely stuck! It can't get a muffin to use on the start of the line because it hasn't made a full box of muffins yet, and it can't make a full box of muffins because it's starved for ingredients after 3.
If we're mandated to clump the items together we're implicitly assuming that there's no feedback, yet there's also no reason that feedback shouldn't be a first-class ability of streams.
If you're dealing with small objects at the production side, like individual tag names, attributes, bindings, etc. during SSR, the natural thing to do is to just write() each string. But then you see that performance is terrible compared to sync iterables, and you face a choice:
1. Buffer to produce larger chunks and less stack switching (the exact same thing you need to do with Streams), or
2. Use sync iterables and forgo being able to support async components.
The article proposes sync streams to get around this some, but the problem is that in any traversal of data where some of the data might trigger an async operation, you don't necessarily know ahead of time if you need a sync or async stream or not. It's when you hit an async component that you need it. What you really want is a way for only the data that needs it to be async.

We faced this problem in Lit-SSR and our solution was to move to sync iterables that can contain thunks. If the producer needs to do something async it sends a thunk, and if the consumer receives a thunk it must call and await the thunk before getting the next value. If the consumer doesn't even support async values (like in a sync renderToString() context) then it can throw if it receives one.
This produced a 12-18x speedup in SSR benchmarks over components extracted from a real-world website.
I don't think a Streams API could adopt such a fragile contract (i.e., call next() too soon and it will break), but having some kind of way where a consumer can pull as many values as possible in one microtask and then await only if an async value is encountered would be really valuable, IMO. Something like `write()` and `writeAsync()`.
The sad thing here is that generators are really the right shape for a lot of these streaming APIs that work over tree-like data, but generators are far too slow.
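The thunk convention described above might be sketched like this (illustrative names, not Lit-SSR's actual API): the producer yields either plain values or zero-arg functions returning Promises, and the consumer decides what to do with the latter.

```typescript
type Thunk<T> = () => Promise<T>;
type MaybeThunk<T> = T | Thunk<T>;

// Async consumer: awaits thunks when it encounters them.
async function renderAsync(parts: Iterable<MaybeThunk<string>>): Promise<string> {
  let out = '';
  for (const part of parts) {
    out += typeof part === 'function' ? await part() : part;
  }
  return out;
}

// Sync consumer: no async support, so a thunk is an error.
function renderSync(parts: Iterable<MaybeThunk<string>>): string {
  let out = '';
  for (const part of parts) {
    if (typeof part === 'function') throw new Error('async value in sync render');
    out += part;
  }
  return out;
}
```

The point is that the iterable itself stays sync, so purely sync data never pays the async tax; only the values that need async force an await.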
Also I'm curious why you say that generators are far too slow. Were you using async generators perhaps? Here's what I cooked up using sync generators: https://github.com/bablr-lang/stream-iterator/blob/trunk/lib...
This is the magic bit:
    return step.value.then((value) => {
      return this.next(value);
    });

    type Stream<T> = {
      next(): { done, value: T } | Promise<{ done, value: T }>
    }

Where T=Uint8Array. Sync where possible, async where not.

Engineers had a collective freak-out back in 2013 over "Do not unleash Zalgo", a worry about using callbacks with different activation patterns. There's wisdom there, for callbacks especially; it's confusing if sometimes the callback fires right away and sometimes it is in fact async. https://blog.izs.me/2013/08/designing-apis-for-asynchrony/
And this sort of narrow specific control has been with us since. It's generally not cool to use MaybeAsync<T> = T | Promise<T>, for similar "it's better to be uniform" reasons. We've been so afraid of Zalgo for so long now.
That fear just seems so overblown and it feels like it hurts us so much that we can't do nice fast things. And go async when we need to.
Regarding the pulling multiple, it really depends doesn't it? It wouldn't be hard to make a utility function that lets you pull as many as you want queueing deferrables, allowing one at a time to flow. But I suspect at least some stream sources would be just fine yielding multiple results without waiting. They can internally wait for the previous promise, use that as a cursor.
I wasn't aware that generators were far too slow. It feels like we are using the main bit of the generator interface here, which is good enough.
I was so sick of being slapped around by LJHarb who claimed to me again and again that TC39 was honoring the Zalgo post (by slapping synthetic deferrals on everything) that I actually got Isaacs to join the forum and set him straight: https://es.discourse.group/t/for-await-of/2452/5
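The "pull as many as you want" utility mentioned above could be sketched like this (hypothetical names): sync steps are collected eagerly in one microtask, and the first Promise-valued step is handed back for the caller to await.

```typescript
type Step<T> = { done: boolean; value?: T };
type StreamIterator<T> = { next(): Step<T> | Promise<Step<T>> };

// Eagerly pull up to `n` steps. Sync steps are collected immediately;
// hitting a Promise-valued step ends the eager phase and returns the
// pending promise so the caller can await it (or race it with others).
function pullEager<T>(
  iter: StreamIterator<T>,
  n: number,
): { ready: T[]; pending?: Promise<Step<T>> } {
  const ready: T[] = [];
  for (let i = 0; i < n; i++) {
    const step = iter.next();
    if (step instanceof Promise) return { ready, pending: step };
    if (step.done) break;
    ready.push(step.value as T);
  }
  return { ready };
}
```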
The async iterable approach makes so much more sense because it composes naturally with for-await-of and plays well with the rest of the async/await ecosystem. The current Web Streams API has this weird impedance mismatch where you end up wrapping everything in transform streams just to apply a simple operation.
Node's original stream implementation had problems too, but at least `.pipe()` was intuitive. You could chain operations and reason about backpressure without reading a spec. The Web Streams spec feels like it was written by the kind of person who thinks the solution to a complex problem is always more abstraction.
    import { Repeater } from "@repeaterjs/repeater";

    const keys = new Repeater(async (push, stop) => {
      const listener = (ev) => {
        if (ev.key === "Escape") {
          stop();
        } else {
          push(ev.key);
        }
      };
      window.addEventListener("keyup", listener);
      await stop;
      window.removeEventListener("keyup", listener);
    });

    const konami = ["ArrowUp", "ArrowUp", "ArrowDown", "ArrowDown", "ArrowLeft", "ArrowRight", "ArrowLeft", "ArrowRight", "b", "a"];

    (async function() {
      let i = 0;
      for await (const key of keys) {
        if (key === konami[i]) {
          i++;
        } else {
          i = 0;
        }
        if (i >= konami.length) {
          console.log("KONAMI!!!");
          break; // removes the keyup listener
        }
      }
    })();
https://github.com/repeaterjs/repeater

It's one of those abstractions that's feature complete and stable, and looking at NPM it's apparently getting 6.5mil+ downloads a week for some reason.
Lately I’ve just taken the opposite view of the author, which is that we should just use streams, especially with how embedded they are in the `fetch` proposals and whatever. But the tee critique is devastating, so maybe the author is right. It’s exciting to see people are still thinking about this. I do think async iterables as the default abstraction is the way to go.
edit: I found where stop is created[1]. I can't say I've seen this pattern before, and the traditionalist in me wants to dislike the API for contradicting conventions, but I'm wondering if this was designed carefully for ergonomic benefits that outweigh the cost of violating conventions. Or if this was just toy code to try out new patterns, which is totally legit also
[1]: https://github.com/repeaterjs/repeater/blob/638a53f2729f5197...
    let resolveRef;
    const promise = new Promise((res) => { resolveRef = res; });
    const callback = (data) => {
      // Do work...
      resolveRef(data); // This "triggers" the await
    };
    // Note: Object.assign(callback, promise) won't make callback awaitable,
    // since `then` lives on Promise.prototype and isn't enumerable.
    // Delegate explicitly instead:
    callback.then = promise.then.bind(promise);

There's a real performance cost to awaiting a fake Promise though; `await regularPromise` bypasses the actual thenable machinery.

    await document.fonts.ready

    device.lost.then(() => {
      console.log('WebGPU device lost :(')
    })
I feel like this isn't confusing if you know how promises work, but maybe it can be confusing for someone coming from Python/Rust, where async functions don't evaluate until their futures are awaited.

- Java went through it with java.util.stream (pull-based, lazy) vs Reactive Streams/Project Reactor (push-based, backpressure-aware). The result was two completely separate APIs that don't compose well.
- .NET actually handled this better with IAsyncEnumerable<T> in C# 8 — a single abstraction that's pull-based but async-aware. It composes naturally with LINQ and doesn't require a separate reactive library for most use cases.
- Go side-stepped the problem entirely with goroutines and channels, making the whole streams abstraction unnecessary for most cases.
What I find interesting about this proposal is it's trying to learn from that prior art. The biggest mistake Java made was bolting async streams on top of a synchronous abstraction and then needing a completely separate spec (Reactive Streams) for the async case. If JavaScript can get a single unified abstraction that handles both sync iteration and async backpressure, that would be a genuine improvement over what exists in most other runtimes.
That's an inherent flaw of garbage-collected languages. Requiring an explicit close on a resource feels like writing C, but otherwise you have a memory leak or resource exhaustion, because the garbage collector may or may not free the resource. Even C++ is better at this, because resources are released deterministically (via RAII and reference counting) instead.
I'm working on a db driver that uses it by convention as part of connection/pool usage cleanup.
In an ideal world you could just ask the host to stream 100MB of stuff into a byte array or slice of the wasm heap. Alas.
    for await (const chunk of stream) {
      // process the chunk
      stream.returnChunk(chunk);
    }

This would be entirely optional. If you don't return the chunk and instead let GC free it, you get the normal behavior. If you do return it, then the stream is permitted to return it again later.

(Lately I've been thinking that a really nice stream or receive API would return an object with a linear type so that you must consume it and possibly even return it. This would make it impossible to write code where task cancellation causes you to lose received data. Sadly, mainstream languages can't do this directly.)
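A hedged sketch of what such opt-in recycling could look like, with a simple free list behind the hypothetical returnChunk():

```typescript
// Illustrative sketch (not a real API): consumers that hand buffers back
// let the producer reuse them; consumers that don't just let GC reclaim
// them, which is the normal behavior.
class PooledByteStream {
  private pool: Uint8Array[] = [];

  // Reuse a returned buffer when one is big enough, else allocate fresh.
  acquire(size: number): Uint8Array {
    const recycled = this.pool.pop();
    return recycled && recycled.length >= size ? recycled : new Uint8Array(size);
  }

  // Opt-in: the caller promises not to touch `chunk` after returning it.
  returnChunk(chunk: Uint8Array): void {
    this.pool.push(chunk);
  }
}
```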
https://github.com/ralusek/streamie
allows you to do things like
    infiniteRecords
      .map(item => doSomeAsyncThing(item), { concurrency: 5 });

And then because I found that I often want to switch between batching items vs dealing with single items:

    infiniteRecords
      .map(item => doSomeAsyncSingularThing(item), { concurrency: 5 })
      .map(groupOf10 => doSomeBatchThing(groupsOf10), { batchSize: 10 })
      // Can flatten back to single items
      .map(item => backToSingleItem(item), { flatten: true });

I tried several implementations, tweaked settings, but ultimately couldn't get around it. In some cases I had bizarre drops in activity when the consumer was below capacity.
It could have been related to the other issue they mention, which is the cost of using promises. My streams were initiating HEAPS of promises. The cost is immense when you're operating on a ton of data.
Eventually I had to implement some complex logic to accomplish batching to reduce the number of promises, then figure out some clever concurrency strategies to manage backpressure more manually. It worked well.
Once I was happy with what I had, I ported it from Deno to Go and the result was so stunningly different. The performance improvement was several orders of magnitude.
I also built my custom/native solution using the Effect library, and although some people claim it's inefficient and slow, it out-performed mine by something like 15% off the shelf, with no fine-tuning or clever ideas. I wished I'd used it from the start.
The difference is likely in that it uses a fiber-based model rather than promises at the execution layer, but I'm not sure.
I like the idea of the more ergonomic, faster api in new-stream with no buffering except at Stream.push(). NodeJS and web streams put infinitely expandable queues at every ReadableStream and WritableStream so that you can synchronously res.write(chunk) as much as you want with abandon. This API basically forces you to use generators that yield instead of synchronously writing chunks.
I suspect the benchmarks, if not most of this project, suffer from poor quality control on vibecoded implementations.
Fwiw the original Streams API could have been simpler even without async iterators.
    interface Stream<T> {
      // Return false from the callback to stop early.
      // Result is whether the stream ran to completion.
      forEach(callback: (chunk: T) => Promise<boolean | undefined>): Promise<boolean>
    }

Similarly, adding a recycleBuffer(chunk) method would have gone a long way towards BYOB without all the ceremony.

If we're optimizing allocations we can also avoid all the {done,value} records and return a sentinel value for the end in the proposed API.
(Web) API design is really difficult and without a voice in the room pushing really hard on ergonomics and simplicity it's easy to solve all the use cases but end up with lots of awkward corners and costs later.
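A minimal sketch of how that forEach contract could be implemented over an in-memory source (fromArray is an illustrative name, not a proposed API):

```typescript
interface Stream<T> {
  // Return false from the callback to stop early.
  // Result reports whether the stream ran to completion.
  forEach(callback: (chunk: T) => Promise<boolean | undefined>): Promise<boolean>;
}

// Build a Stream over an array of chunks.
function fromArray<T>(chunks: T[]): Stream<T> {
  return {
    async forEach(callback) {
      for (const chunk of chunks) {
        if ((await callback(chunk)) === false) return false; // consumer stopped early
      }
      return true; // ran to completion
    },
  };
}
```

Note that the consumer never constructs `{done, value}` records; the only per-chunk allocation is the callback's promise.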
Coalgebras might seem too academic but so were monads at some point and now they are everywhere.
Here's the WritableConsumableStream module:
https://github.com/SocketCluster/writable-consumable-stream
SocketCluster solves the problem of maintaining message order with async processing.
This feature is even more useful now with LLMs as you can process data live, transform streams with AI with no risk of mangling the message order.
I may have been the first person to use a for-await-of loop in this way with backpressure. At least on an open source project.
the idea is basically just use functions. no classes and very little statefulness
https://github.com/juliantcook/fluent-async-iterator
I had hoped we would have a better API by now.
It was also very useful for CLI tools utilising unix pipes.
The objection is
> The Web streams spec requires promise creation at numerous points — often in hot paths and often invisible to users. Each read() call doesn't just return a promise; internally, the implementation creates additional promises for queue management, pull() coordination, and backpressure signaling.
But that's 95% manageable by altering buffer sizes.
And as for that last 5%....what are you doing with JS to begin with?
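For illustration, here's roughly what that buffer-size tuning looks like with the existing Web streams API (the 64 KiB figure is arbitrary):

```typescript
// The queuing strategy's highWaterMark sets how many bytes the stream
// buffers before pull() stops being called, so a bigger buffer
// amortizes per-read promise overhead across fewer, larger reads.
const stream = new ReadableStream(
  {
    pull(controller) {
      // Stand-in producer: a fresh 1 KiB chunk per pull.
      controller.enqueue(new Uint8Array(1024));
    },
  },
  new ByteLengthQueuingStrategy({ highWaterMark: 64 * 1024 }), // buffer up to 64 KiB
);
```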
When ReadableStream behaves the same in the browser, Workers, and other runtimes, stream-based code becomes portable and predictable. That reduces subtle backpressure bugs and eliminates “works here but not there” edge cases.
Standardization at the streams layer is a big deal for building reliable streaming systems across environments.
We have run into many problems with web streams over the years and solving them has always proven to be hairy, including the unbounded memory growth from response.clone().
The Deno team implemented a stream API inspired by Go, which I was happy with, until they ultimately acquiesced to web streams.
This proposal shares some of those principles as well.
Ideally splitting out the use cases would allow both implementations to be simpler, but that ship has probably sailed.
At a native level (C++/Rust), a Promise is just a closure added to a list of callbacks for the event loop. Yes, if you did one per streamed byte it would be huge, but if you're doing one promise per megabyte (1,000 per gig), it really shouldn't add up to even 1% of perf.
So if you're going to flatten everything into one stream then you can't have a for loop implementation that defensively awaits on every step, or else it'll be slooooooooow. That's my proposal for the change to the language is a syntax like
    for await? (value of stream) {
    }

which would only do the expensive high-level await when the underlying protocol forced it to by returning a promise-valued step.

    const buffer = new Uint8Array(256)
    const bytesRead = await reader.read(buffer)
    if (bytesRead === 0) {
      // Done
      return
    }

> I'm not here to disparage the work that came before — I'm here to start a conversation about what can potentially come next.
Terrible LLM-slop style. Is Mr Snell letting an LLM write the article for him or has he just appropriated the style?
Just ctrl-f'ing through previous public posts, I think there were a total of 7 used across about that many posts. This one for example had 57. I'm not good enough in proper English to know what the normal number is supposed to be, just pointing that out.
It doesn't really matter what tools are used if the result is good
You might say well, it's on the Cloudflare blog so it must have some merit, but after the Matrix incident...
I’ve read my fair share of LLM slop. This doesn’t qualify.
Sadly it will never happen. WebAssembly failed to keep some of its promises here.
classic case of not using an await before your promise
But Observables really do not solve the problems being talked about in this post.
[1] https://github.com/WICG/observable [2] https://github.com/WICG/observable/issues/216
This is what UDP is for. Everything actually has to be async all the way down and since it’s not, we’ll just completely reimplement the OS and network on top of itself and hey maybe when we’re done with that we can do it a third time to have the cloud of clouds.
The entire stack we’re using right down to the hardware is not fit for purpose and we’re burning our talent and money building these ever more brittle towering abstractions.
A stream API can layer over UDP as well (reading in order of arrival with packet-level framing), but such a stream would be a bit weird and incompatible with many stream consumers (e.g. [de]compression). A UDP API is simpler and more naturally event (packet) oriented. The concepts don't mix well.
Still, it would be nice if the browser supported a UDP API instead of the weird and heavy DTLS and QUIC imitations.