Comments (10)
from remote-apis.
Yes, agreed, we should be more explicit in the spec! One thing I wanted to avoid is to spec out the full system configuration in the ServerCapabilities -- which endpoints there are altogether, what services do they support, and with what properties. I think it's useful to distinguish between properties of a particular endpoint vs properties of the service as a whole, and I chose to have ServerCapabilities represent the service as a whole, because that is a lot easier to describe (common elements, inter-dependencies), and also more useful to clients. But I see how that can be misleading -- should all endpoints in the system return all the capabilities, then? My idea was this:
If an execution endpoint is present in the system, it should return the ServerCapabilities of the whole system. All other endpoints should return the same response, except for ExecutionCapabilities.
It is a bit of an ugly idea, but it is simple.
One alternative I see would be to formalize the EndpointServices idea and add something like:
message EndpointServices {
  bool action_cache = 1;
  bool cas = 2;
  bool execution = 3;
  // Be more specific about services? Add others, like Operations?
}
message Endpoint {
  string address = 1;
  EndpointServices services = 2;
}
message ServerCapabilities {
  ...
  repeated Endpoint service_endpoints = 6;
  repeated Endpoint current_endpoint = 7;
}
WDYT?
from remote-apis.
I'd suggest something similar to what BuildStream expects: each endpoint should implement the Capabilities service, and populate the fields relevant to the services available at that endpoint. Clients may choose to ignore irrelevant capabilities if the client does not plan to use a given service on the specified endpoint.
As an example, consider this setup:
- endpointA hosts Exec+AC+CAS. The exec service at this endpoint is capable of using either the local CAS, or an external CAS hosted at a different endpoint. Calls to GetCapabilities return the relevant capabilities for all three services, but there's no explicit annotation that an external CAS is permitted.
- endpointB hosts only CAS. Calls to GetCapabilities return only the capabilities relevant to the CAS.
A client specifying endpointA for Exec+AC+CAS would call GetCapabilities on endpointA, and use the returned values for all three services.
A client specifying endpointA for Exec+AC and endpointB for CAS would call GetCapabilities on endpointA and use the result for only Exec+AC, and would also call GetCapabilities on endpointB and use the result for CAS.
It's the responsibility of the user to understand what combinations are permitted by their services.
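As a rough sketch of the client-side flow described above: the snippet below uses a hypothetical `get_capabilities` callable as a stand-in for the real GetCapabilities RPC, and plain dictionaries as a stand-in for ServerCapabilities. The service names ("exec", "ac", "cas") and the fake server data are assumptions made for illustration only.

```python
from typing import Callable, Dict

# Which ServerCapabilities field is relevant to each service.
# AC and CAS both read cache_capabilities; Exec reads execution_capabilities.
RELEVANT_FIELD = {
    "exec": "execution_capabilities",
    "ac": "cache_capabilities",
    "cas": "cache_capabilities",
}


def resolve_capabilities(
    service_endpoints: Dict[str, str],
    get_capabilities: Callable[[str], dict],
) -> Dict[str, dict]:
    """Return the capabilities to use for each service.

    service_endpoints maps a service name to the endpoint the user
    configured for it. get_capabilities is a hypothetical stand-in for
    the GetCapabilities RPC and is called once per distinct endpoint.
    """
    responses: Dict[str, dict] = {}
    resolved: Dict[str, dict] = {}
    for service, endpoint in service_endpoints.items():
        if endpoint not in responses:
            responses[endpoint] = get_capabilities(endpoint)
        # Use only the field relevant to this service; ignore the rest.
        resolved[service] = responses[endpoint].get(RELEVANT_FIELD[service], {})
    return resolved


# Fake servers for illustration only (the values are made up).
FAKE_SERVERS = {
    "endpointA": {
        "cache_capabilities": {"host": "A"},
        "execution_capabilities": {"host": "A"},
    },
    "endpointB": {"cache_capabilities": {"host": "B"}},
}

# Exec+AC on endpointA, CAS on endpointB.
split = resolve_capabilities(
    {"exec": "endpointA", "ac": "endpointA", "cas": "endpointB"},
    FAKE_SERVERS.__getitem__,
)
```

Note that nothing here checks whether endpointA actually permits an external CAS; that remains the user's responsibility.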
from remote-apis.
A couple thoughts:
What is the purpose of Capabilities? I think it serves two distinct but related purposes, which we may want to try to separate, though it's not that easy in practice. First, it allows for specifying the capabilities of a "service". Can a CAS implementation support symlinks? May priorities be used on Execution? Etc. Second, it allows (implicitly, today) - for specifying how endpoints relate. For example, the same digest function needs to be used across the different services; for most implementations Execution may only be used with the corresponding CAS (they're not freely composable), etc.
These feel separable, but I'm not sure that's true - even the "independent" capabilities are often not. For example, whether or not the CAS supports symlinks and whether or not you can send an Execution request that includes symlinks are related.
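To make the coupling concrete, here is a minimal sketch of a cross-service consistency check. The two dataclasses are simplified, hypothetical stand-ins for CacheCapabilities and ExecutionCapabilities; in particular, the boolean symlink fields are an assumption of this sketch and not fields of the real messages.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CacheCaps:
    # Simplified, hypothetical stand-in for CacheCapabilities.
    digest_functions: List[str] = field(default_factory=list)
    supports_symlinks: bool = False


@dataclass
class ExecCaps:
    # Simplified, hypothetical stand-in for ExecutionCapabilities.
    digest_function: str = ""
    accepts_symlink_inputs: bool = False


def find_conflicts(cache: CacheCaps, execution: ExecCaps) -> List[str]:
    """List the ways the two capability sets contradict each other."""
    conflicts = []
    # Execution must use a digest function that the CAS also understands.
    if execution.digest_function not in cache.digest_functions:
        conflicts.append("digest function mismatch")
    # Symlinks in an Execution request are only useful if the CAS stores them.
    if execution.accepts_symlink_inputs and not cache.supports_symlinks:
        conflicts.append("execution accepts symlinks but CAS cannot store them")
    return conflicts
```

An empty result means the pair is consistent as far as these two checks go; it says nothing about other dependencies between the services.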
In RBE for a while we had planned to separate the CAS and Execution APIs onto separate endpoints. I still like it in concept, but we ultimately cancelled (or at least shelved indefinitely) that plan, in large part because of the complexity it introduced throughout the system - clients like bazel needed to gain complicated configuration knobs to allow specifying the multiple endpoints and their relationships; we lost an authoritative source for "capabilities" across the system; we had to answer questions about composability of constructs that aren't freely composable in practice, and it didn't buy us as much as we expected in terms of flexibility we'd actually capitalize on.
How can we make this usable in practice? That's really the ultimate goal here - separate endpoints and separate capabilities sound great, but it ultimately has to be borne out in configurations that are sane for clients and servers to implement and for users to configure, and we've struggled to achieve this in the past. One idea that was floated internally (I can't remember by who - @ulfjack or @ola-rozenfeld I think?) that I lean more and more towards is to have a single configuration API that returns capabilities as well as endpoints across all the different services. It would be responsible for returning internally consistent information - endpoints that work together, capabilities that aren't mutually contradictory, etc. But there are still open questions in my mind - what do we do for simple implementations that don't want to implement the capabilities API? If anyone actually wants to have composable endpoints, who hosts this API and how does it know what to return? Does this lead us to putting too much burden on clients to support a lot of configurability under the hood? Etc.
from remote-apis.
Relevant context: https://groups.google.com/d/topic/bazel-buildfarm/Jl8yDjjLNSs/discussion
from remote-apis.
For this iteration of the API, I strongly prefer NOT to switch to a "capabilities endpoint returns all the configuration information you need, including the relevant endpoints" model. That model may be preferable for a v3 (or later) version of the API, but it introduces significant complexity in the clients, for the sake of a benefit that's not yet really clear. I think we don't yet really know what configurations of endpoints are practically useful, and it would be better to determine that over time than designing a complicated system to manage arbitrary complexity that ends up not being necessary.
from remote-apis.
> Each endpoint should implement the Capabilities service, and populate the fields relevant to the services available at that endpoint. Clients may choose to ignore irrelevant capabilities if the client does not plan to use a given service on the specified endpoint.
I would prefer that option too; I think it's the most intuitive solution.
> It's the responsibility of the user to understand what combinations are permitted by their services.
Very much agreed, at least in the context of the v2 API:
> One idea that was floated internally (...) that I lean more and more towards is to have a single configuration API that returns capabilities as well as endpoints across all the different services. It would be responsible for returning internally consistent information - endpoints that work together, capabilities that aren't mutually contradictory, etc.
Inverting the current logic and having the service advertise configuration to its clients is indeed an interesting idea. But I quite agree with @bergsieker here: this would be a major change that we probably don't want to sneak into v2 of the API.
from remote-apis.
Yes, sorry, I didn't mean to imply that we should make such a significant change to the semantics of V2 (we couldn't, it's a breaking change). To that end, yeah, Steven's suggested approach for V2 SGTM as well. I'm interested to see how well "It's the responsibility of the user to understand what combinations are permitted by their services" will hold up in practice - if it does, great, no need for any major restructurings at all :).
from remote-apis.
> Am I understanding correctly that the difference between the suggestion and the status quo is that an execution-only endpoint would no longer return CacheCapabilities?
At least it would be acceptable for such an endpoint (Exec only) not to, and clients should ignore it if it were ever returned by such an endpoint anyway.
> Each endpoint should implement the Capabilities service, and populate the fields relevant to the services available at that endpoint.
My understanding then is:
- Exec+CAS+AC endpoints should return CacheCapabilities + ExecutionCapabilities
- Exec endpoints should return ExecutionCapabilities
- CAS+AC endpoints should return CacheCapabilities
- ...
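Expressed as code, that reading might look like the sketch below. The function name and the service labels ("exec", "cas", "ac") are made up for this example; only `cache_capabilities` and `execution_capabilities` are meant to correspond to fields of ServerCapabilities.

```python
from typing import Dict, Set


def capabilities_for(services: Set[str]) -> Dict[str, bool]:
    """Say which ServerCapabilities fields an endpoint should populate.

    services is the set of services hosted at the endpoint, drawn from
    {"exec", "cas", "ac"} (labels chosen for this sketch).
    """
    return {
        # CAS and AC are both described by CacheCapabilities.
        "cache_capabilities": bool(services & {"cas", "ac"}),
        # Only endpoints hosting Execution populate ExecutionCapabilities.
        "execution_capabilities": "exec" in services,
    }
```

Under this reading an Exec-only endpoint leaves `cache_capabilities` unset, which is the difference from the status quo asked about above.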