mithril-security / blindai
Confidential AI deployment with secure enclaves :lock:
Home Page: https://www.mithrilsecurity.io/
License: Apache License 2.0
The client.run_model example passes the model id and the input tensor batch as positional arguments, but this is incorrect: the first two positional parameters of that function are the model id and the model hash. The input tensor object therefore needs to be passed explicitly to the input_tensors argument.
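For example, the corrected call would look something like the sketch below (client, model_id, and run_inputs are placeholder names, not taken from the example itself):
# Passing the tensors positionally would bind them to the model-hash parameter,
# so they have to go through the input_tensors keyword explicitly.
response = client.run_model(model_id, input_tensors=run_inputs)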
I get this error when loading the pretrained whisper model you provide in the Colab example notebook:
import onnx
onnx_model = onnx.load("./whisper_tiny_en_20_tokens.onnx")
Stacktrace:
---------------------------------------------------------------------------
DecodeError Traceback (most recent call last)
<ipython-input-8-d62064a4cf0f> in <module>
1 import onnx
2
----> 3 onnx_model = onnx.load("./whisper_tiny_en_20_tokens.onnx")
/usr/local/lib/python3.8/dist-packages/onnx/__init__.py in _deserialize(s, proto)
106 )
107
--> 108 decoded = cast(Optional[int], proto.ParseFromString(s))
109 if decoded is not None and decoded != len(s):
110 raise google.protobuf.message.DecodeError(
DecodeError: Error parsing message with type 'onnx.ModelProto'
I am going through the whisper BlindAI example in examples/whisper/BlindAI_Whisper.ipynb, and one of the classes used, NNDecodingTask, is not defined in the notebook. Can the definition of this class be added so that it will be possible to export different Whisper models (small, large, multi-language, etc.)?
Great tutorial that can't be followed 100%...
The client API reference should be automatically generated from the docstrings in the package files.
It's not practical to rewrite it every time we make changes, and it can also look better this way.
The existing tools may generate the documentation as HTML pages, i.e. a website. Should we restrict ourselves to the ones that produce markdown files instead, so that they can always be included directly in the GitBook, or can we have a website that is linked from there?
Here is the advice we currently give (translated from French):
Our solution requires Intel SGX technology. It is available on most current Intel processors
(https://ark.intel.com/content/www/us/en/ark/search/featurefilter.html?productType=873&2_SoftwareGuardExtensions=Yes%20with%20Intel%C2%AE%20ME), as well as on the latest-generation ones
(https://ark.intel.com/content/www/us/en/ark/search/featurefilter.html?productType=873&2_SoftwareGuardExtensions=Yes%20with%20Intel%C2%AE%20SPS).
For our proof of concept, any processor supporting the SGX feature can be used to deploy our solution.
Nevertheless, we suggest using processors that support the DCAP attestation system, which is better suited to the needs of the video analysis scenario, and is also the predominant mode on the latest generation of Ice Lake processors, which seem to be the preferred range for a deployment in practice.
Processors that only support the EPID attestation system can also be used with our solution, but this would be less representative of the workflow used in production.
For a quick start to run the first tests, an Intel NUC can be relevant given its low cost and ease of use. The Intel® NUC Kit NUC7PJYH ($199) with the Intel® Pentium® Silver J5005 Processor can be a good starting point.
For a more representative test with the latest Ice Lake processors, we recommend the Intel® Xeon® Gold 5318S ($1667) with 512 GB of memory available for enclaves.
We need to make a proper English documentation page for it.
Merge hardware and software in notebook examples
Something like
client = BlindAiClient()
# Comment out this line for hardware mode
client.connect_server(addr="localhost", simulation=True)
# Comment out this line for simulation mode
client.connect_server(
    addr="localhost",
    policy="policy.toml",
    certificate="host_server.pem"
)
Make the notebooks clearer, and less redundant
The command docker run --network host mithrilsecuritysas/blindai-client-demo does not work on WSL (tested on matthias' computer).
It runs fine, but the notebook cannot be accessed from the browser.
This is probably due to --network host sharing the network with WSL rather than with Windows as a whole, but I'm not sure; I don't know how WSL networking works.
I don't know how to fix it, and I don't have a Windows machine on hand.
Also, we were talking with Daniel about packaging the server in the blindai-client-demo docker image directly. This could be awesome for getting started with the project, and that would allow us to sidestep this issue entirely.
Move telemetry to client-side, so that it is more useful.
Do not forget to update the documentation, the readme, and everywhere else it is mentioned.
BlindAI will provide managed AI APIs.
For transparency, it would be good to expose in the client Python SDK information about each model we use behind the scenes, for instance a link to the build process that was used to serve a specific model.
For instance, we could have something like:
import blindai
card = blindai.api.get_model_card("whisper", tee="sgx")
card.model_hash
>> "77af778b51abd4a3c51c5ddd97204a9c3ae614ebccb75a606c3b6865aed6744e"
card.build_process_link
>> "github.com/..."
Not top prio but could be cool for transparency.
Current usage may leak an open socket if the user forgets to call close_connection:
client = BlindAiClient()
client.connect_server(addr="localhost", simulation=True)
# do something with client...
client.close_connection()
This is not an issue right now since our users are mostly testing the app, making jupyter notebooks and not actual production usage. As the project matures, this may become an issue.
There is a way in Python to make APIs that work like this:
with BlindAiClient.connect_server(addr="localhost", simulation=True) as client:
    # do something with client...
    client.run_model("aaa")
# the connection is implicitly closed when exiting the scope
This uses the special __enter__ and __exit__ functions, iirc.
What do you think? Is this a better API surface?
This should be backward compatible with the current API.
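For illustration, here is a minimal sketch of what the context-manager variant could look like; the class and method names mirror the current client, but the bodies are hypothetical stand-ins, not the real implementation:
class BlindAiClient:
    def __init__(self):
        self._connected = False

    @classmethod
    def connect_server(cls, addr, simulation=False):
        client = cls()
        client._connected = True  # stand-in for opening the real connection
        return client

    def run_model(self, *args, **kwargs):
        pass  # stand-in for the real inference call

    def close_connection(self):
        self._connected = False  # stand-in for tearing down the real connection

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # runs whenever the with-block exits, even on exceptions
        self.close_connection()
        return False  # do not swallow exceptions

# The connection is closed automatically when the block exits:
with BlindAiClient.connect_server(addr="localhost", simulation=True) as client:
    client.run_model("aaa")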
unit tests
The upload_model request will return a UUID for the model.
This UUID will be used by run_model and will be added to the signed response proofs.
We may want to be backward compatible with the way our docs and blogposts are written, so that if the model name is not provided, you just use the last one that was uploaded.
Right now, once you upload a model, you won't be able to upload another one without discarding the first.
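A hypothetical usage sketch of the proposed flow (the return value of upload_model, model_id, and run_inputs are placeholders, not the final API):
# upload_model would return the UUID assigned to the model
uploaded = client.upload_model("./distilbert.onnx")
model_id = uploaded.model_id

# run_model would target a specific model by its UUID
response = client.run_model(model_id, input_tensors=run_inputs)

# backward-compatible form: with no UUID given, run the last uploaded model
response = client.run_model(input_tensors=run_inputs)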
State the different existing features that will be broken/affected by this new feature.
Server and client tests + unittests
Python client should accept numpy / torch tensors directly.
This would be a much better API.
add unit tests
We should do it in a way that does not require users to install torch or numpy if they are not using this feature.
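One possible approach, sketched below, is to duck-type the conversion so that neither torch nor numpy is imported by the client itself (flatten_tensor is a hypothetical helper, not an existing function):
def flatten_tensor(tensor):
    # torch tensors expose detach() and numpy(); we can call them without importing torch
    if hasattr(tensor, "detach") and hasattr(tensor, "numpy"):
        tensor = tensor.detach().numpy()
    # numpy arrays expose flatten() and tolist(); again, no import is needed
    if hasattr(tensor, "flatten") and hasattr(tensor, "tolist"):
        return tensor.flatten().tolist()
    # plain Python sequences pass through unchanged
    return list(tensor)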
When building in release mode for hardware mode, we should generate a policy.toml file that does not allow SGX debug mode.
This probably requires changes to the rust code in order to launch the enclave in non-debug mode.
The hardware docker image we publish on dockerhub has no reason to have SGX debug mode on.
We should add a build option / environment variable to generate allow-debug policy files, for dev purposes.
Either
This is a good opportunity to add the following tests:
allowDebug = false in policy.toml
In-depth info about execution plans
This is more of a meta-issue (/roadmap) gathering everything about execution times.
Plans I have in mind:
I am not sure whether all of this is overkill or not since we're just using tract and not really touching the perf sensitive parts. We'll see.
On multiple occasions, the parameter mentioned for the blindai.client.connect call is hazmat_http_on_untrusted_port, but the parameter should be hazmat_http_on_unattested_port, as per the docs at https://blindai.mithrilsecurity.io/en/latest/blindai/core.html
When first connecting to the server, the client should request the server's version and verify that it is supported by the client SDK.
If the version is not supported, the client should reject the connection and ask for an update.
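A minimal sketch of what the client-side check could look like (get_server_version and SUPPORTED_SERVER_VERSIONS are hypothetical names; the version RPC itself would have to be added to the API):
SUPPORTED_SERVER_VERSIONS = {"0.4", "0.5"}  # hypothetical compatibility list

def check_server_version(client):
    # hypothetical RPC that would return the server's version string
    version = client.get_server_version()
    if version not in SUPPORTED_SERVER_VERSIONS:
        raise ConnectionError(
            f"Server version {version} is not supported by this client SDK; "
            "please update the client."
        )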
There are some Cargo.lock files that still use SSH to pull in dependencies.
One example is server/inference-server/network/sgx/rpc/Cargo.lock.
We should change them to use HTTPS.
For the Signed Responses feature (#13) to be useful, we need a way to export and validate execution proofs.
Here is the proposed API:
response = client.run_model(run_inputs, sign=True)
response.save_to_file("./execution_proof.json")
from blindai.client import load_execution_file
response = load_execution_file("./execution_proof.json")
response.validate(policy_file="./policy.toml") # throws if invalid or execution is not signed
print("The proof is valid!")
We should also have these functions:
response.export_binary() :: bytes
from blindai.client import load_execution_binary
response = load_execution_binary(a :: bytes)
Should save_to_file work when sign=False?
Add tests and unit-tests.
Hi,
I encountered some problems while deploying your framework in hardware mode with the CovidNet example that you provide.
Can you think of anything I forgot?
I have an error that I cannot explain during the step of connecting the client to the server.
The issue is triggered by the following code: https://github.com/mithril-security/blindai-preview/blob/main/runner/remote_attestation_sgx/src/quote_verification_collateral.rs#L246
// Retrieving verification collateral using QPL
let mut p_quote_collateral: *mut sgx_ql_qve_collateral_t = ptr::null_mut();
let qv_ret = unsafe {
    sgx_ql_get_quote_verification_collateral(
        fmspc.as_ptr(),
        fmspc.len() as u16,
        ca_from_quote.as_ptr(),
        &mut p_quote_collateral as *mut *mut sgx_ql_qve_collateral_t,
    )
};
ensure!(
    qv_ret == Quote3Error::Success,
    "sgx_ql_get_quote_verification_collateral failed!"
);
This code usually appears to work correctly, but it is broken. We discovered the issue when trying to debug a failure from sgx_ql_get_quote_verification_collateral. While debugging, we added the following before the ensure! statement to print the error code from the QPL.
println!("sgx_ql_get_quote_verification_collateral returned {:?}", qv_ret);
Quite surprisingly, this printed sgx_ql_get_quote_verification_collateral returned Quote3Error::Success, despite the fact that qv_ret != Quote3Error::Success when the ensure! was executed... To compound the mystery, the issue disappeared when compiling in debug mode: the debug builds simply printed a status different from Quote3Error::Success (yet it was still the wrong status).
This kind of strange behavior is often the result of undefined behavior, and that is also the case here. The UB is actually due to how we declared the FFI interface with the QPL (a C library) in our Rust code:
extern "C" {
pub fn sgx_ql_get_quote_verification_collateral(
fmspc: *const u8,
fmspc_size: u16,
pck_ra: *const c_char,
pp_quote_collateral: *mut *mut sgx_ql_qve_collateral_t,
) -> Quote3Error;
pub fn sgx_ql_free_quote_verification_collateral(
p_quote_collateral: *const sgx_ql_qve_collateral_t,
) -> Quote3Error;
}
The return type of sgx_ql_get_quote_verification_collateral is declared to be Quote3Error, which is a Rust enum. But a Rust enum is assumed to only ever hold one of its declared values (it cannot hold an arbitrary integer the way a C enum often does). In our case, the UB happened when the QPL returned a value that could not be represented by the Rust enum.
For more information about this mismatch between Rust and C-like enums: https://mdaverde.com/posts/rust-bindgen-enum/
What should we do to fix it?
The best course of action would be to replace our custom FFI interface declaration with one generated by rust-bindgen. This would avoid this kind of mistake (and would also ensure that the function signatures match). We should also look into whether there is already a crate on crates.io that does this.
Security impact: No (outside of enclave).
Priority: Low (only impacts the error path).
Implement signed server responses.
The server responds to the client with a signed response, which the client could store to attest to someone else that the response was indeed emitted from the trusted enclave.
This may be made optional so that clients that do not use it don't have to pay for it.
Currently, the ModelDatumType is defined on both the server and client sides.
Adding it to the API (securedexchange.proto) should eliminate this duplication and make extending the supported data types easier.
Hi,
How can I upload a model with multiple inputs? The distilbert example does not use multiple inputs, but it's quite normal with pre-trained models. What should I pass to dtype and shape in this case?
Thanks.
We are currently using cbor for some of the serializing: transforming the flattened input tensors to a byte array.
This is probably overkill, and having a dependency on cbor is troublesome for porting the client library to other languages, like javascript, as on npmjs cbor packages are either old or nodejs-only.
I see two ways of doing this:
Dependency on cbor2 in the client and server side.
Let's not think about backward compat :)
Add a way to load policy.toml and certificate from bytes instead of a file
Quick and easy feature, should improve our API surface.
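For example, the connection call could accept already-loaded bytes in place of file paths (a hypothetical overload of the existing parameters, not the current signature):
with open("policy.toml", "rb") as f:
    policy_bytes = f.read()
with open("host_server.pem", "rb") as f:
    certificate_bytes = f.read()

# hypothetical: pass the raw bytes instead of the file paths
client.connect_server(
    addr="localhost",
    policy=policy_bytes,
    certificate=certificate_bytes,
)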
Add unit tests.
This is due to BlindAI not supporting integer tensor output yet.
This was reported on discord, more info & the model are available there.
Model runs
Fails with error Failed to load model, the model or the input format are perhaps invalid
The notebook: https://cdn.discordapp.com/attachments/965734276593242202/965734464690978866/Confidential_STT.ipynb
Input: https://cdn.discordapp.com/attachments/965734276593242202/965734464892313640/hello_world.wav
last docker version & probably on master too
This is a mini roadmap for the CI.
List of things we might want in the CI (medium term plan):
Focus is on end to end tests and building client/server packages for now.
Side goals:
Potential future work:
There used to be some kind of tutorial about how to use them, but not anymore, so it might be difficult to set up the proper environment.
Because they are awesome otherwise
This issue concerns the documentation.
This issue is a collection of questions. The goal is to fill an FAQ page in the docs.
Here are some questions I thought about when working on the readme in #36
Maybe some questions regarding the direction of the project until we have a concrete roadmap:
Please add questions in this issue using comments (:wave: @JoFrost you told me you had some)
I'll assign this to myself unless someone else wants to work on the FAQ page :)
We should do a more cautious evaluation of the quote in the case where we get a STATUS_TCB_SW_HARDENING_NEEDED.
This will require updating https://github.com/mithril-security/sgx-dcap-quote-verify-python