Another thing one might run across is multipart/x-mixed-replace. I wrote a crate for that. [1] I didn't see a spec for it, but someone since pointed out to me that it's probably identical to multipart/mixed, and now seeing an example in the multer README it clicks that I should have looked at RFC 2046 section 5.1.1, [2] which says this:
> This section defines a common syntax for subtypes of "multipart". All subtypes of "multipart" must use this syntax.
...and written a crate general enough for all of them. Maybe I'll update my crate for that sometime. My crate currently assumes there's a Content-Length: for each part, which isn't specified there but makes sense in the context I use it. It wouldn't be hard to also support just the boundary delimiters. And then maybe add a form-data parser on top of that.
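To make the difference concrete, here is a minimal sketch in Python (not the crate's actual API; the name `split_multipart` is made up) that handles both cases over an in-memory body. It trusts a part's Content-Length: when present and otherwise scans for the next boundary delimiter. It assumes CRLF line endings and a well-formed body, and ignores streaming entirely.

```python
def split_multipart(body: bytes, boundary: bytes) -> list[tuple[dict[str, str], bytes]]:
    """Split a complete multipart body into (headers, payload) pairs."""
    delimiter = b"--" + boundary
    parts = []
    pos = body.index(delimiter) + len(delimiter)  # skip any preamble
    # "--" directly after a delimiter marks the closing delimiter.
    while not body.startswith(b"--", pos):
        # Skip optional padding and the CRLF that ends the delimiter line.
        pos = body.index(b"\r\n", pos) + 2
        headers = {}
        while True:
            line_end = body.index(b"\r\n", pos)
            line, pos = body[pos:line_end], line_end + 2
            if not line:  # a blank line ends the part's headers
                break
            name, _, value = line.partition(b":")
            headers[name.decode("ascii").strip().lower()] = value.decode("latin-1").strip()
        if "content-length" in headers:
            # Length known up front: no need to search the payload.
            end = pos + int(headers["content-length"])
            if not body.startswith(b"\r\n" + delimiter, end):
                raise ValueError("Content-Length does not line up with the next delimiter")
        else:
            # No length: the payload runs until CRLF + delimiter.
            end = body.index(b"\r\n" + delimiter, pos)
        parts.append((headers, body[pos:end]))
        pos = end + 2 + len(delimiter)
    return parts
```

With a Content-Length: the payload may even contain the delimiter bytes; without one, the first occurrence of the delimiter ends the part.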
btw, the article also talks specifically about proxying the body. I don't get why they're parsing the multipart data at all. I presume they have a reason, but I don't see it explained. I'd expect that a body is a body is a body. You can stream it along, and perhaps also buffer it in case you want to support retrying the backhaul request, probably stopping the buffering at some byte limit at which you give up on the possibility of retries, because keeping arbitrarily large bodies around (in RAM or even spilling to SSD/disk) doesn't sound fun.
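Roughly what I have in mind, as a Python sketch rather than anything from the article: pass chunks through untouched, keep a copy for replay, and drop the copy once it passes a byte limit. The class name `ReplayBuffer` and the `limit` parameter are made up for illustration.

```python
from typing import Iterable, Iterator


class ReplayBuffer:
    """Forward a body chunk by chunk, keeping a copy only up to `limit` bytes."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.chunks: list[bytes] = []
        self.size = 0
        self.replayable = True  # False once the body has outgrown the limit

    def tee(self, source: Iterable[bytes]) -> Iterator[bytes]:
        for chunk in source:
            if self.replayable:
                if self.size + len(chunk) <= self.limit:
                    self.chunks.append(chunk)
                    self.size += len(chunk)
                else:
                    # Too big to keep around: give up on retries, free the copy.
                    self.replayable = False
                    self.chunks.clear()
                    self.size = 0
            yield chunk  # the body itself is always streamed along unchanged
```

If the backhaul request fails while `replayable` is still true, the retry can resend `chunks` and then continue with the rest of the source; otherwise the failure just propagates to the client.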
[1] https://crates.io/crates/multipart-stream
[2] https://datatracker.ietf.org/doc/html/rfc2046#section-5.1.1
However, Chrome killed support for it except for images: https://bugs.chromium.org/p/chromium/issues/detail?id=249132
That's not true. You can stream JSON, too. You just have to do something fancier than JSON.stringify().
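One reading of "fancier", sketched in Python (where `json.dumps` plays the role of JSON.stringify()): serialize one element at a time and emit the array punctuation yourself, so the whole document never has to sit in memory. The function name `stream_json_array` is made up for illustration.

```python
import json
from typing import Any, Iterable, Iterator


def stream_json_array(items: Iterable[Any]) -> Iterator[str]:
    """Yield a JSON array piece by piece instead of serializing it in one call."""
    yield "["
    first = True
    for item in items:
        if not first:
            yield ","
        # Only a single element is ever serialized at once.
        yield json.dumps(item)
        first = False
    yield "]"
```

Each yielded string can be written to the socket as soon as it is produced; concatenated, the pieces form one valid JSON document.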
At least with multipart/form-data we get to avoid transfer encodings, which are also quite annoying to handle (especially as they can be nested, which is probably the worst aspect of RFC2046).
> [Not having Content-Length:] makes writing a streaming MIME parser much harder.
I don't think that's a big problem for Rust application servers where there are nice crates for efficient text searching you can plug in. Maybe more so for folks doing low-dependency and/or embedded stuff, especially in C.
But it's just dumb IMHO when you want to send arbitrary data to have to come up with a random boundary that you hope isn't in the data you're sending. With a strong random number generator you can do this to (un)reasonable statistical confidence, but that shouldn't be necessary at all.
Multipart predates HTTP/1.0 and was written for email. It wasn't unheard of in the early days to enter SMTP commands directly. It would also be more readable on clients that didn't support MIME.
The problem they have specifically would be that in a single request (form post for example) those uploads will be linear.
The solution really boils down to parallelizing the upload, using protocols/standards like https://tus.io/ or S3-compatible APIs to push the data up, then synchronizing with a record/document on the server.
Boundaries are a lot like UUIDs, and rely on the same logic. When generating random data, once you have enough bits, the odds are against that sequence of bits ever having been generated before in the universe.
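As a back-of-the-envelope sketch in Python: draw the boundary from a cryptographic RNG and bound the chance that it shows up in the payload. The function names are made up, and the bound assumes the payload was produced independently of the boundary.

```python
import secrets


def make_boundary(nbytes: int = 16) -> str:
    """Return a random boundary: 2 * nbytes hex characters (128 bits by default)."""
    return secrets.token_hex(nbytes)


def collision_upper_bound(data_len: int, nbytes: int = 16) -> float:
    """Union bound on the chance that the boundary appears somewhere in the data.

    There are at most data_len start offsets, and each one matches a uniformly
    random boundary with probability 2 ** -(8 * nbytes).
    """
    return data_len / 2 ** (8 * nbytes)
```

For a 1 TiB payload and a 128-bit boundary, `collision_upper_bound(2**40)` is 2^-88, which is on the order of 1e-27.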
Sure you can. I designed a system where the client uploads a multipart-form that's three parts, the first part is JSON ("meta") and the next two parts are gzipped blobs ("raw.gz" and "log.gz"). The server reads the first part which is metadata that tells it how to handle the next two parts.
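For illustration, the client side of such an upload could be assembled like this with only the Python standard library. This is a sketch, not the actual client: the function name `build_upload` is made up, and using "meta", "raw.gz" and "log.gz" as the form field names is an assumption.

```python
import gzip
import json
import secrets


def build_upload(meta: dict, raw: bytes, log: bytes) -> tuple[str, bytes]:
    """Return (Content-Type header value, body) for a three-part form upload."""
    boundary = secrets.token_hex(16)
    # Order matters: the server reads the metadata before the blobs.
    fields = [
        ("meta", "application/json", json.dumps(meta).encode("utf-8")),
        ("raw.gz", "application/gzip", gzip.compress(raw)),
        ("log.gz", "application/gzip", gzip.compress(log)),
    ]
    body = b""
    for name, content_type, payload in fields:
        body += (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        ).encode("ascii") + payload + b"\r\n"
    body += f"--{boundary}--\r\n".encode("ascii")
    return f"multipart/form-data; boundary={boundary}", body
```

Because "meta" always comes first, a streaming server can decide how to handle the two blobs before any of their bytes arrive.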
I happen to be using Falcon and streaming-form-data on the server side.
heh I'm always annoyed by this unfortunate name clash: