Comments (4)
I appreciate you taking the time to give feedback. 4ms should be fine, and yes, it is untrusted code. I'll go the isolate-per-tenant route. It was a bit of a blunder jumping the gun and going for what I thought was the more optimised way of doing it before benchmarking, rookie mistake!
Appreciate your time and guidance on the matter!
from rusty_v8.
- Am I just using the wrong APIs?
It's the correct API.
- Can you create and re-use a snapshot?
Yes, in Deno itself we use the same snapshot for the main worker and multiple workers - you can see it defined here:
https://github.com/denoland/deno/blob/76df7d7c9bb7b6b552fd33efbedb28e21969d46c/cli/worker.rs#L609
I suggest looking at https://github.com/denoland/deno_core for more details on how we use snapshots.
- Maybe a snapshot isn't the right thing to be using. The goal is to compile the scripts once per tenant/service_id and later execute the compiled scripts, with each execution getting its own context... how do I achieve that with the rusty_v8 API?
Most likely it's not the right approach. If the scripts are small, you are going to be paying a huge overhead for snapshots - snapshots effectively include the whole JS environment with all of the JS builtins, so for each snapshot you would be storing this duplicated data. Additionally, snapshots are only valid for a particular version of V8 and a particular architecture; once you bump the rusty_v8 dependency you will have to regenerate all the snapshots. You might want to use v8::CachedData instead.
Ahhh...hmm. I may have gone off course after reading the V8 docs.
In https://v8.dev/docs/embed#contexts it says
In terms of CPU time and memory, it might seem an expensive operation to create a new execution context given the number of built-in objects that must be built. However, V8’s extensive caching ensures that, while the first context you create is somewhat expensive, subsequent contexts are much cheaper. This is because the first context needs to create the built-in objects and parse the built-in JavaScript code while subsequent contexts only have to create the built-in objects for their context. With the V8 snapshot feature (activated with build option snapshot=yes, which is the default) the time spent creating the first context will be highly optimized as a snapshot includes a serialized heap which contains already compiled code for the built-in JavaScript code. Along with garbage collection, V8’s extensive caching is also key to V8’s performance.
So I assumed the "first context will be highly optimized as a snapshot" was the same snapshot I saw in the rusty_v8 API.
My thinking was that taking the snapshot then running the same script repeatedly with it as a base would achieve the optimisation the V8 docs mention.
Another issue/point is that I'm not storing the snapshot long term. The plan was to keep it in-memory, we'd have at most a few hundred snapshots on one server at a time and they would be evicted every few hours, maybe half a day. It's okay for an occasional request to be slower whilst all the scripts are recompiled but subsequent invocations until the next cache eviction would be faster/optimised.
Maybe a bit of guidance if possible considering we're in a scenario where we have a set of scripts, A and B.
- A is from one tenant and B is from another.
- We can't have them changing JS globals and affecting each other
- We want to track resource usage for A and B
- Once A and B are deployed, they won't change for some time so we can cache them
- Once deployed, A and B will be executed thousands of times per second
- A and B can vary quite wildly in size; as we're embedding V8 for our customers to use, it's anyone's guess what the min, avg, and max will be.
From what you've said, I'm thinking a flow that follows these steps:
let source = Source::new(script, Some(&origin));
// Later, when executing the script again, we'd use this with the CachedData:
// let source = Source::new_with_cached_data();
let compiled = v8::script_compiler::compile(
    &mut scope,
    source,
    v8::script_compiler::CompileOptions::ConsumeCodeCache,
    NoCacheReason::BecauseV8Extension,
)
.unwrap();
let unbound_script = compiled.get_unbound_script(&mut scope);
let code_cache = unbound_script.create_code_cache();
I could only find this regarding ConsumeCodeCache, so I'm presuming that's right. It seems to conflict with NoCacheReason though...
I'm seeing deno_core does something similar.
I appreciate I've dropped a lot of info - just trying to provide everything you may need!
So I assumed the "first context will be highly optimized as a snapshot" was the same snapshot I saw in the rusty_v8 API.
My thinking was that taking the snapshot then running the same script repeatedly with it as a base would achieve the optimisation the V8 docs mention.
Not really, from what I can tell you want to create a snapshot of the "base environment" - essentially built-in JS APIs plus any APIs you might want to provide to all tenants. Then you run from that snapshot and execute user code.
Another issue/point is that I'm not storing the snapshot long term. The plan was to keep it in-memory, we'd have at most a few hundred snapshots on one server at a time and they would be evicted every few hours, maybe half a day. It's okay for an occasional request to be slower whilst all the scripts are recompiled but subsequent invocations until the next cache eviction would be faster/optimised.
Yeah, it feels to me that snapshots would be overkill for that, and they're not very user-friendly to use. I would suggest the approach with a base snapshot instead.
Maybe a bit of guidance if possible considering we're in a scenario where we have a set of scripts, A and B.
A is from one tenant and B is from another.
We can't have them changing JS globals and affecting each other
We want to track resource usage for A and B
Once A and B are deployed, they won't change for some time so we can cache them
Once deployed, A and B will be executed thousands of times per second
A and B can vary quite wildly in size; as we're embedding V8 for our customers to use, it's anyone's guess what the min, avg, and max will be.
If you are running untrusted code from multiple tenants, the very bare minimum you should do is create a separate Isolate instance for each tenant. Better yet, run them in separate processes. While V8 is a good sandbox, it's not bulletproof, and multiple levels of security should be applied. You will also find that it's way easier to think in terms of isolates than trying to juggle multiple snapshots and contexts.
If you are fine with the first request being a bit slower, then creating a new isolate and executing the tenant's code will be enough for you - the baseline overhead that we measured in deno_core for an empty Isolate is about 4ms (i.e. creating a process, creating an isolate from a snapshot, running an empty file, doing cleanup, and exiting the process). In Deno itself it's about 15ms. If you are fine with such latency then I strongly recommend this approach.
You can then add the CodeCache into the mix and cache its output to make it cheaper to create a new isolate with the same tenant code.
I could only find this regarding ConsumeCodeCache, so I'm presuming that's right. It seems to conflict with NoCacheReason though...
I'm seeing deno_core does something similar
Yes, you can copy the deno_core approach for easier integration.