Also, this feature is pretty cool!
I wish I'd included more images and diagrams in the post, but I'm generally terrible at coming up with those.
I also wish we'd been able to enable it in production for everyone immediately... but with a change this big we need to be cautious.
JavaScript is spelled incorrectly all over the place.
Awesome feature and post!
(I habitually type it "Javascript" but it is indeed supposed to be "JavaScript".)
(You may remember me as the tech lead of Sandstorm.io and Cap'n Proto.)
Happy to answer questions!
Would doing something like this, obviously replacing example.com with their own domain.com, replace the offending header?
addEventListener('fetch', event => {
  event.respondWith(handle(event.request));
});

async function handle(request) {
  let response = await fetch(request);
  // X-Frame-Options arrives on the *response* from the origin, so check there.
  if (response.headers.has('X-Frame-Options')) {
    let newHeaders = new Headers(response.headers);
    newHeaders.set('X-Frame-Options', 'ALLOW-FROM https://example.com/');
    response = new Response(response.body, {
      status: response.status,
      statusText: response.statusText,
      headers: newHeaders
    });
  }
  return response;
}

Of course, you could only apply it to your own server.
Also, you would want to think carefully about clickjacking attacks (where someone puts your site in an invisible iframe and tricks people into clicking on it). The X-Frame-Options header was probably added to prevent clickjacking.
Will Cloudflare be curating a list of useful worker scripts? I imagine there will be certain use cases that get a lot of attention (e.g. ESI).
Do requests made from the API go through Cloudflare's usual pipeline, or do they go straight to the backend? In short, will we need to manage the cache ourselves?
And finally, does this change Cloudflare's role as a "dumb" intermediary?
Better than that, we plan to unify this with Cloudflare Apps, so people can publish a reusable script as an app, and other people can then "install" that app onto their site with a click.
> Do requests made from the API go through Cloudflare's usual pipeline, or do they go straight to the backend? In short, will we need to manage the cache ourselves?
The worker sits in front of the cache, so subrequests made by calling fetch() will go through the usual caching logic.
Eventually we also plan to expose the "Cache" API from standard Service Workers to allow you to manipulate the cache directly for advanced use cases, but this shouldn't be necessary for most people.
> And finally, does this change Cloudflare's role as a "dumb" intermediary?
That sounds like a policy question, which isn't my department. ;)
I'd be curious to learn more about the implementation (did you lift the existing SW implementation from blink somehow, or reimplement it)?
I looked a bit at the code in Chrome but determined that it was too attached to Chrome infrastructure that didn't make sense in our use case. Also, there are a lot of parts of Service Workers that don't make any sense outside the browser, so it looked like it would be pretty hairy trying to pull those apart. So, we're building our own implementation (a lot of which is still in-progress, of course).
I actually built a little V8 API binding shim using template and macro metaprogramming that I really like, which allows us to write classes in natural C++ and trivially export them to JavaScript. We've been filling in the various non-V8-builtin APIs using this.
We're using libkj's event loop, HTTP library, and other utility code. (KJ is the C++ framework library that is in the process of spinning out of Cap'n Proto. Yeah, I may be suffering NIH, but I think it's worked well.)
https://chromium.googlesource.com/chromium/src.git/+/lkgr/gi...
A bindings layer for v8 that was specifically intended to make implementing web-style APIs outside of Blink easier. At the time at least, refactoring things like SW out of Blink was ~impossible.
I'd love to hear more about your evaluation of Lua. LuaJIT is so blazingly fast (and small!) that I'm sure it'd be some pretty significant compute savings.
What sandbox solutions did you look into? Separate lua states, just overriding ENV/setfenv() or something completely different?
But for running third-party code, we need to do everything in our power to reduce the risk of a compromise.
Every sandbox (including V8) has bugs, and security is about risk management. With scrutiny, the low-hanging fruit is found and the risk of further bugs steadily decreases. At the end of the day, no Lua sandboxing mechanism has had anywhere near the scrutiny of V8. It's a totally unfair chicken-and-egg problem: to get scrutiny you need usage, but to get usage you need scrutiny. But, it is what it is. :/
edit. In another reply you said that pricing hasn't been finalised which is understandable. We've got a few use cases which CF Workers would be ideal for but we'd be looking at 10-15k requests per minute which could get expensive if pricing is per request.
What we'll do at some point is allow you to provide a WASM blob along-side your script, which will be loaded separately and then exposed to your script probably as a global variable.
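To make the idea concrete, here's a rough sketch of what "a WASM blob exposed as a global" could look like from the script's side. The global name WASM_MODULE is made up for illustration (the loading step is what Cloudflare would do for you); the bytes below are a hand-encoded module exporting a single add function.

```javascript
// A hand-encoded WebAssembly module exporting add(a, b) -> a + b.
// In the hypothetical setup described above, the platform would load the
// blob and expose its exports as a global (here called WASM_MODULE).
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,              // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,        // type: (i32,i32)->i32
  0x03, 0x02, 0x01, 0x00,                                      // function section
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,        // export "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b // code body
]);

const module = new WebAssembly.Module(wasmBytes);
const WASM_MODULE = new WebAssembly.Instance(module).exports;

console.log(WASM_MODULE.add(2, 3)); // 5
```

The worker script itself would then just call WASM_MODULE.add(...) like any other function, keeping the JavaScript side as glue code.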
In the blog post you talk about the trade-offs of using JavaScript vs other options like containers. I thought you might be interested in this comparison of JavaScript against other sandboxing options.
https://blog.acolyer.org/2017/08/22/javascript-for-extending...
> Hosts of V8 can multiplex applications by switching between contexts, just as conventional protection is implemented in the OS as process context switches… A V8 context switch is just 8.7% of the cost of a conventional process context switch… V8 will allow more tenants, and it will allow more of them to be active at a time at a lower cost.
In practice, though, a typical script probably uses much less than 1ms of CPU time per request and probably needs only a couple megs of RAM. Because we're applying limits on the level of a V8 Isolate, not on an OS process, the limits go a lot further.
Keep in mind that Cloudflare Workers are not intended to be a place where you do "bulk compute", but rather where you run little bits of code that benefit from being close to users.
Of course, that's not the answer to everything. We don't plan to offer other storage in v1 but we are definitely thinking about it for the future.
https://cloudflareworkers.com/#9bdc354e936c05a4a1d7df7eb0d7f...
Any time you commit to someone else's API -- whether it's an actual industry standard, or simply some de facto widely used paradigm -- you incur risks; conversely, now that you're a vested participant, consider being involved in the future of the spec so it can evolve where it needs to meeting emerging needs around its new uses.
That said, I am amazed by how well the spec fits as-is. I don't usually like other people's API designs but in this case I think they did a really good job, and I've been pleased not to have to think about API design myself.
johansch: All I want is my code running on your nodes all around the world with an end-to-end ping that is less than 10 ms to the average client
dsl: Akamai Edge Compute is what they are asking for
Is the lack of maturity also the reason for not choosing something like vm2 for Node.js? https://github.com/patriksimek/vm2
Looking briefly, it looks like it's based on creating separate contexts, but not separate isolates. Contexts within the same isolate can be reasonably secure (it's how Chrome sandboxes iframes from their parent frames, after all), but they still share a single heap and must run on the same thread. Isolates can run on separate threads. We prefer to give each script its own isolate, so that one script using a lot of CPU does not block other scripts. We also want to be able to kill a script that does, say, "while(true) {}".
So yeah, it looks like a neat library but it probably wouldn't suit our needs.
My experience with Service Worker APIs hasn't been very positive, although I don't have any suggestions for ways it could be improved, so I apologize for the non-constructive feedback. Maybe after using it more I'll change my mind. I recognize that everyone involved is likely working hard to provide an API that's capable of handling a wide range of problems, many of which I likely haven't even considered.
Here's a more actionable complaint: fetch doesn't support timeouts or cancellation. I have a hard time understanding how this isn't a problem for more people. Say what you will about XMLHttpRequest, at least it supports these basic features. As an end-user, I always find it absolutely infuriating when things hang forever because a developer forgot to handle failure cases.
I'd love it if you published a locally runnable version. Aside from making it easier to configure and experiment, it would give me peace of mind to know that I could continue to use the same configuration if Cloudflare decided to terminate my service.
https://developers.google.com/web/updates/2017/09/abortable-...
We'll be implementing it soon.
I haven't looked into how specifically to expose HTTP Push in the API but that certainly seems like something we should support.
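In the meantime, a timeout can be bolted on by racing the fetch against a timer. A minimal sketch (note this only stops waiting for the response; actually cancelling the in-flight request needs the AbortController integration linked above):

```javascript
// Race a promise against a timer; reject with "timed out" if the timer wins.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('timed out')), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Usage would look like: withTimeout(fetch(url), 5000)
// Demo with a promise that never settles:
withTimeout(new Promise(() => {}), 50)
  .catch(err => console.log(err.message)); // "timed out"
```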
This is very, very awesome work. My team has been working on enabling the PRPL pattern[1] and differential serving on a few platforms, and edge caching has been a problem. We've tried to use Vary: User-Agent, but that leads to low cache hits. This API would let us do much smarter UA sniffing at the edge.
From there we just need to parse some responses like JS and HTML, to be able to push their sub-resources, for an instant speedup and great caching for fine-grained deployments.
[1]: https://developers.google.com/web/fundamentals/performance/p...
Is there any indication of the price level? And what about runtime duration?
Cloudflare Workers will run in all of Cloudflare's 117 (and growing) locations. The idea is that you'd put code in a worker if you need the code to run close to the user. You might want that to improve page load speed, or to reduce bandwidth costs (don't have to pay for the long haul), or because you want to augment Cloudflare's feature set, or a number of other reasons. But, generally, you would not host your whole service on this. (Well, you could, but it's not the intent.)
We haven't nailed down pricing yet, but we've worked hard to create the most efficient possible design so that we can make pricing attractive.
To give regular end-users the rights to e.g. 'record one video of up to X seconds and Y mbps under ID "some-uuid"'
I've been waiting for a proper Content Ingestion Network for ages by now... if the CloudFlare video team ever wants to talk to someone who hand-rolled their own single-node version of this I'd be more than willing to share my experiences.
We haven't nailed down pricing yet
One of the things I like about AWS is the price doesn't jump from $0/year directly to $240/year the way Cloudflare does.

Seems like whenever there's co-execution (VMs, JavaScript, etc.) there are side-channel leakages.
There is a theoretical solution that we might be able to explore at some point: If compute is deterministic -- that is, always guaranteed to produce the same result given the same input -- then it can't possibly pick up side channels. It's possible to imagine a JavaScript engine that is deterministic. The fact that JavaScript is single-threaded helps here. In concrete terms, this would mean hooking Date.now() so that it stays constant during continuous execution, only progressing between events.
That said, this is just a theory and there would be lots of details to work out.
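The Date.now() hooking described above could be sketched like this. All names here are made up for illustration; advanceClock stands in for whatever the host would call between event deliveries, never during one:

```javascript
// Sketch of a "deterministic clock": frozen while JavaScript runs,
// advanced only between events, so running code can't observe timing.
let frozenNow = 0;

const realNow = Date.now;
Date.now = () => frozenNow;

// The host would call this between events, never during execution:
function advanceClock() {
  frozenNow = realNow();
}

advanceClock();
const a = Date.now();
for (let i = 0; i < 1e6; i++) {} // burn some real CPU time
const b = Date.now();
console.log(a === b); // true -- no timing signal leaks during execution
```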
Doesn't this require the timing and interleaving with other processes also be deterministic? ...which seems hard to guarantee with modern CPUs, async IO, and shared execution.
I hope it's not priced too harshly. Hopefully an added monthly flat rate rather than per-request pricing?
- Is it "Cloudflare Workers" or "Cloudflare Service Workers"?
A "Cloudflare Worker" is JavaScript you write that runs on Cloudflare's edge.
A "Cloudflare Service Worker" is specifically a worker which handles HTTP traffic and is written against the Service Worker API.
Confusing naming convention. Now you have to say 'worker worker' or 'non-service worker' so nobody has to wonder if you meant 'service worker' when you only said 'worker'.

Also note we didn't invent these terms. "Workers" and "Service Workers" are W3C standards.
if (request.method == "POST") { ... }

If there is - this means that there (eventually) will be a way to have logs from the edge servers. I'm just thinking about a worker that would collect the request and response data in some circular buffer, and try to push it to the origin server. Eventually, the data will get through, so no CDN-returned 52x ("web server is not responding" etc) errors would go unnoticed.
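The circular-buffer idea could be sketched as a small ring of log records that the worker drains whenever a push to the origin succeeds. LogRing and the record shape are made-up placeholders, not anything Cloudflare provides:

```javascript
// A fixed-size ring buffer for request/response records. The oldest entry
// is evicted when the buffer is full; flush() drains the buffer so the
// worker can batch-push records to the origin (and retry later if it fails).
class LogRing {
  constructor(capacity) {
    this.capacity = capacity;
    this.entries = [];
  }
  push(entry) {
    if (this.entries.length >= this.capacity) this.entries.shift();
    this.entries.push(entry);
  }
  flush() {
    const batch = this.entries;
    this.entries = [];
    return batch;
  }
}

const log = new LogRing(3);
['a', 'b', 'c', 'd'].forEach(status => log.push({ status }));
console.log(log.flush().map(e => e.status)); // ['b', 'c', 'd'] -- 'a' was evicted
```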
We're thinking about how to do writable storage, but it's tricky, since one thing we don't want is for developers to have to think about the fact that their code is running in hundreds of locations.
Render your SPA (different index.html) when a login cookie is set and otherwise render your landing page (yet another index.html)? - Such that my http://example.com can always be cached (unless it needs to hit the server where the same logic is implemented).
And in general, how do you manage your landing page vs. your SPA?
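The cookie-based split could be sketched as a small pure function driving the worker's subrequest. The cookie name "session" and the two paths are made-up placeholders:

```javascript
// Pick which index.html to serve based on whether a login cookie is present.
function pickIndex(cookieHeader) {
  const cookies = (cookieHeader || '').split(';').map(c => c.trim());
  const loggedIn = cookies.some(c => c.startsWith('session='));
  return loggedIn ? '/app/index.html' : '/landing/index.html';
}

// In a worker this would drive the subrequest, roughly:
//   event.respondWith(fetch(pickIndex(request.headers.get('Cookie')), ...))
console.log(pickIndex('theme=dark; session=abc123')); // '/app/index.html'
console.log(pickIndex('theme=dark'));                 // '/landing/index.html'
```

Because the decision happens at the edge, both index.html variants can stay fully cached while the visible URL never changes.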
Also, agree with other commenters here; nicely-written blog post!
PS. There's an investment fund targeting Cloudflare App developers: https://www.cloudflare.com/fund/
Also a little concerned about writing anything substantial without an onprem version!
FWIW I've seen several different folks saying they're working on implementing a Service Workers API shim on top of Node, which would be a great way to get an on-prem version, and should not be very hard to do.
>SECTION 10: LIMITATION ON NON-HTML CACHING
>You acknowledge that Cloudflare’s Service is offered as a platform to cache and serve web pages and websites and is not offered for other purposes, such as remote storage. Accordingly, you understand and agree to use the Service solely for the purpose of hosting and serving web pages as viewed through a web browser or other application and the Hypertext Markup Language (HTML) protocol or other equivalent technology. Cloudflare’s Service is also a shared web caching service, which means a number of customers’ websites are cached from the same server. To ensure that Cloudflare’s Service is reliable and available for the greatest number of users, a customer’s usage cannot adversely affect the performance of other customers’ sites. Additionally, the purpose of Cloudflare’s Service is to proxy web content, not store data. Using an account primarily as an online storage space, including the storage or caching of a disproportionate percentage of pictures, movies, audio files, or other non-HTML content, is prohibited. You further agree that if, at Cloudflare’s sole discretion, you are deemed to have violated this section, or if Cloudflare, in its sole discretion, deems it necessary due to excessive burden or potential adverse impact on Cloudflare’s systems, potential adverse impact on other users, server processing power, server memory, abuse controls, or other reasons, Cloudflare may suspend or terminate your account without notice to or liability to you.
I think in practice, unless you have a very popular website or are abusing it in some way, they probably wouldn't care or even notice. But you'd still be taking a gamble.
(I don't work at Cloudflare though so take whatever I say with a grain of salt.)
I guess the answer for the OP then is, get an enterprise account and you can do whatever you want.
They do have rules on large files though and prefer you use it for typical web content like text and images.
A more granular way to block requests than what cloudflare provides (IP, BGP AS, country)
Tokenizing credit cards at the edge if you have a payment provider that supports that, and are using CF's PCI compliant environment.
Injecting a unique ID into requests for log correlation. You get collisions if you do this via browser JavaScript for various reasons. I'm assuming v8 has a better Math.random(), or that you would at least be able to find a workaround since it's in one stack.
I'm sure there's more.
So yeah, if you care about milliseconds :)