Let me bring you back to 2017: https://blog.cloudflare.com/introducing-cloudflar...

styren · on Sept 27, 2022

What exactly are the obstacles preventing multiple tenants from running Rust/Java/C++ on the edge?

0x457 · on Sept 27, 2022

Many reasons.

First, all 3 allow far too much: uncontrolled file system access, uncontrolled network access.

Second, customers' code cannot be trusted. Running unknown code on your server leaves you with a huge attack surface.

Third, the isolation needed to deal with the first two is expensive, both in resources and in cold-start time.

Third+, the point of running code on the edge is for it to be quick. No one will use it if your cold-start time is higher than a network round-trip, and you're not going to get paid much if you have to keep customer code running even when it's not in use.

Fourth, running WASM allows a unified runtime without any dependencies:

- Java: which JVM do you target? Which language version do you target? How often do you upgrade? Do you run one JVM per tenant? How do you deal with cold-start times?

- Rust/C++: which architecture do your servers use? What if you want to switch to ARM? What if it's mixed? Which version of ARM is it? Can I use AVX-512?

- WASM: WASM is WASM. You don't care whether the customer compiled Rust, C++, or Go to get that WASM.
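
To illustrate the "WASM is WASM" point, here is a minimal TypeScript sketch, assuming a host that exposes the standard WebAssembly JavaScript API (Node.js or a browser). The bytes are a tiny hand-assembled module exporting an `add` function; `wasmBytes`, `tenantModule`, and `instantiateForTenant` are hypothetical names made up for illustration, and this is not a description of how any particular edge provider loads customer code. The host cannot tell which source language produced the bytes, and the module can only reach what the host passes in the import object.

```typescript
// Hand-assembled WASM binary: exports add(a: i32, b: i32) -> i32.
// It could equally have come out of a Rust, C++, or Go toolchain.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic "\0asm" + version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,       // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                                     // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,       // export section: "add" -> function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // code: local.get 0, local.get 1, i32.add
]);

// Compile once; the same compiled module works on any CPU architecture the host runs on.
const tenantModule = new WebAssembly.Module(wasmBytes);

// The module declares which host functions it needs. This one needs none.
const declaredImports = WebAssembly.Module.imports(tenantModule);

// Hypothetical helper: instantiate with an empty import object, so the guest
// gets no file system, network, or clock access unless the host adds it here.
function instantiateForTenant(mod: WebAssembly.Module): (x: number, y: number) => number {
  const instance = new WebAssembly.Instance(mod, {});
  return instance.exports.add as (x: number, y: number) => number;
}

const add = instantiateForTenant(tenantModule);
console.log(add(2, 3), declaredImports.length);
```

The design point is that access is opt-in: anything the guest can do beyond pure computation has to be handed to it explicitly through the second argument of `WebAssembly.Instance`.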

sandGorgon · on Sept 28, 2022

On the Java stuff: why not something like GraalVM or https://github.com/landlord/landlord ? It does seem a lot of investment has happened in Java around this very problem. JVM bytecode has been relatively stable over decades.

Just wondering if it makes sense to make that investment. You will encounter some of these problems when WASM gains a GC anyway. This is one of the big reasons why Graal doesn't target a WASM backend. So one runtime/GC versus multiple is a problem you'll have to solve anyway.

0x457 · on Sept 28, 2022

Well, landlord is unknown to me, but it has all caps "EXPERIMENTAL" and had[1] 3 people working on it, so I wouldn't use it in production.

GraalVM is interesting: it has isolates and, from memory, plenty of sandboxing features such as limiting network and file system access. However, it has some big downsides: it's Oracle, and people still associate it with the regular JVM and all its downsides. Yes, it can match V8 in performance if you let it warm up (which takes longer than it does for V8), and it's even further from what CF does in terms of compatibility with the existing ecosystem.

Cloudflare Workers have single-digit-millisecond cold-start times[2]. GraalVM is... 200ms, and that is considered blazing fast by JVM standards. It takes longer to start, is slower at runtime, and probably consumes more memory.

Remember, we're not talking about long-running services, we're talking about short-lived request handlers in JavaScript that run on the edge. Pricing for Workers is per request. It's in CF's interest to serve customers' requests as fast as possible. Also remember that customers want to write JavaScript.

Why is Oracle an issue? GraalVM is under GPLv2, which does not cover patents. Some performance and security features are locked in the Enterprise version. In addition, the first production-ready version came out in 2019, and CF had already launched Workers by then.

[1]: Last commit 2018. I doubt this is feature-complete software...

[2]: https://blog.cloudflare.com/eliminating-cold-starts-with-clo...

kentonv · on Sept 28, 2022

There's a bit of a catch-22 problem with these other language runtimes: because they haven't been battle-tested in security-hostile environments, it's harder to choose them for security-critical use cases. But that in turn means they'll never be battle-tested.

The original JVM was intended to provide secure sandboxing for "applets" but that use case failed in the market, and instead JVM became focused on use cases where isolation isn't important. Years later as security research became stronger, all sorts of holes were found in the JVM's sandbox. Presumably those holes were fixed, but if we take JVM and use it in a security-critical environment again, probably a bunch of new holes get found pretty quick?

(I don't know anything about Graal specifically, I guess it's an alternative JVM? But how much security research has it had?)

Whereas V8 has had constant security research and real attacks over the course of about 15 years. It's certainly not bullet-proof either but we have a pretty good idea of just how much of a risk it is.

Again, catch-22: there's seemingly no path forward for these VMs to prove themselves. I don't know what to do about that, but that's where we find ourselves.

kentonv · on Sept 27, 2022

You have to use heavier isolation mechanisms with those languages -- processes, containers, virtual machines. These are too heavy to support thousands of tenants per machine. Edge locations are typically smaller clusters, and you typically want all your customers to be present in every point of presence, so you need to pack them tighter. V8 isolates are light enough for this to work and be cost-effective.
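
The packing argument is simple arithmetic, sketched below in TypeScript. Everything here is an assumption for illustration: `tenantsPerMachine` is a hypothetical helper, and the memory figures are placeholders rather than measurements of any real container, VM, or V8 isolate. Only the shape of the calculation matters: per-tenant overhead divides directly into how many tenants fit on one machine.

```typescript
// Hypothetical helper: how many tenants fit in memory, given a fixed
// per-tenant overhead. Ignores CPU, disk, and sharing between tenants.
function tenantsPerMachine(
  machineMemoryMiB: number,
  reservedForHostMiB: number,
  perTenantOverheadMiB: number,
): number {
  if (perTenantOverheadMiB <= 0) {
    throw new Error("per-tenant overhead must be positive");
  }
  const usable = Math.max(0, machineMemoryMiB - reservedForHostMiB);
  return Math.floor(usable / perTenantOverheadMiB);
}

// Placeholder inputs, NOT measurements: a 64 GiB machine with 4 GiB reserved.
const machineMiB = 64 * 1024;
const reservedMiB = 4 * 1024;

// Assumed overheads chosen only to show the order-of-magnitude effect.
const heavySandboxMiB = 256; // stand-in for a container or VM per tenant
const lightSandboxMiB = 4;   // stand-in for a lightweight in-process sandbox

const heavyCount = tenantsPerMachine(machineMiB, reservedMiB, heavySandboxMiB);
const lightCount = tenantsPerMachine(machineMiB, reservedMiB, lightSandboxMiB);
console.log({ heavyCount, lightCount });
```

With these made-up inputs, the lighter sandbox fits 64 times as many tenants, which is exactly the ratio of the two assumed overheads.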