envoyproxy / go-control-plane
Go implementation of data-plane-api
License: Apache License 2.0
Hello,
I've been using go-control-plane for a little while now, and in that time there have been several breaking changes to the API. It's not a big deal, but it's unexpected work. My suggestion is to start tagging releases after a breaking change (v0.8, v0.9, v0.10, etc.) so it's easier to pin specific versions in glide.
OpenCensus is a library for Go that has extensible export backends:
https://github.com/census-instrumentation/opencensus-go
This should be a good choice to keep this library vendor-neutral.
The structure of the Filter message has changed in v1.9 so that config became config_type. Now I get this error in Envoy's logs when the log level is set to debug:

gRPC config for type.googleapis.com/envoy.api.v2.Listener rejected: Error adding/updating listener fff: Proto constraint validation failed (HttpConnectionManagerValidationError.StatPrefix: ["value length must be at least " '\x01' " bytes"]):

even though StatPrefix in HttpConnectionManager is set properly.
Here is my http_connection_manager:

hcm := envoyhcm.HttpConnectionManager{
    CodecType:  envoyhcm.AUTO,
    StatPrefix: "BLABLABLA",
    RouteSpecifier: &envoyhcm.HttpConnectionManager_RouteConfig{
        RouteConfig: <<<MyRouteConfig>>>,
    },
}
Here is my filterChains:

anyHCM, err := types.MarshalAny(&hcm) // MarshalAny takes a proto.Message, so pass a pointer
if err != nil {
    panic(err)
}
filterChains := []envoylistener.FilterChain{
    {
        Filters: []envoylistener.Filter{
            {
                Name: util.HTTPConnectionManager,
                ConfigType: &envoylistener.Filter_TypedConfig{
                    TypedConfig: anyHCM,
                },
            },
        },
    },
}
And here is my listener:

listener := api.Listener{
    Name:         "my_listener",
    Address:      *mkAddress("127.0.0.1", uint32(4589)),
    FilterChains: filterChains, // a slice, so no dereference
}
I am not sure whether this is a bug, but surely this only applies to v1.9+, so maybe http_connection_manager isn't properly serialized before sending the response.
The little test program in pkg/test/main is very useful for understanding the Envoy control plane; maybe we should put a link to it in this repo's README to give fresh users (like me) a good start.
Hi all
A panic occurred when my external authorization service marshals an OkResponse.
Below is the stack trace (out of order; the first line should be the deepest frame):
github.com/envoyproxy/go-control-plane/envoy/api/v2/core.(*HeaderValueOption).MarshalTo(0xc0004c6d50, 0xc000e462ae, 0x4, 0x4, 0x16, 0x6ab, 0x0)
created by google.golang.org/grpc.(*Server).serveStreams.func1
/build/vendor/google.golang.org/grpc/server.go:802 +0x86
/build/vendor/github.com/envoyproxy/go-control-plane/envoy/service/auth/v2alpha/external_auth.pb.go:592 +0x15d
github.com/envoyproxy/go-control-plane/envoy/service/auth/v2alpha.(*CheckResponse).MarshalTo(0xf2e460, 0xc000e45c00, 0x6b2, 0x6b2, 0x6b2, 0x6b2, 0xa3ca40)
github.com/envoyproxy/go-control-plane/envoy/service/auth/v2alpha.(*CheckResponse_OkResponse).MarshalTo(0xf2b4b8, 0xc000e45c00, 0x6b2, 0x6b2, 0xc000795950, 0x440057, 0x700)
google.golang.org/grpc/encoding/proto.codec.Marshal(0xa3ca40, 0xf2e460, 0x1, 0xc000795a68, 0xc000795b88, 0x92e3d2, 0xc000795bc0)
/build/vendor/github.com/envoyproxy/go-control-plane/envoy/service/auth/v2alpha/external_auth.pb.go:624 +0xdf
github.com/envoyproxy/go-control-plane/envoy/service/auth/v2alpha.(*OkHttpResponse).MarshalTo(0xf402e0, 0xc000e45c03, 0x6af, 0x6af, 0x6c1, 0x3, 0xc000152e00)
github.com/envoyproxy/go-control-plane/envoy/api/v2/core.(*HeaderValue).MarshalTo(0xc00049e900, 0xc000e462b0, 0x2, 0x2, 0x12, 0x2, 0x0)
/build/vendor/github.com/envoyproxy/go-control-plane/envoy/api/v2/core/base.pb.go:1878 +0x27d
/build/vendor/google.golang.org/grpc/server.go:1139 +0xd58
/build/vendor/google.golang.org/grpc/server.go:681 +0xa1
google.golang.org/grpc.(*Server).sendResponse(0xc000001680, 0xafabc0, 0xc000097380, 0xc0004b2400, 0xa3ca40, 0xf2e460, 0x0, 0x0, 0xc000898827, 0x0, ...)
goroutine 2175 [running]:
/build/vendor/github.com/envoyproxy/go-control-plane/envoy/service/auth/v2alpha/external_auth.pb.go:553 +0x16d
/build/vendor/github.com/envoyproxy/go-control-plane/envoy/api/v2/core/base.pb.go:1908 +0xe5
/build/vendor/google.golang.org/grpc/server.go:683 +0x9f
panic: runtime error: index out of range
google.golang.org/grpc.(*Server).handleStream(0xc000001680, 0xafabc0, 0xc000097380, 0xc0004b2400, 0x0)
google.golang.org/grpc.encode(0x7f87ea759030, 0xf5dd70, 0xa3ca40, 0xf2e460, 0xf5dd70, 0x9ac4a0, 0xaefd40, 0x0, 0x0)
github.com/envoyproxy/go-control-plane/envoy/service/auth/v2alpha.(*CheckResponse).Marshal(0xf2e460, 0xa3ca40, 0xf2e460, 0x7f87ea7590e8, 0xf2e460, 0x1)
google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc000898600, 0xc000001680, 0xafabc0, 0xc000097380, 0xc0004b2400)
/build/vendor/google.golang.org/grpc/encoding/proto/proto.go:70 +0x19c
/build/vendor/google.golang.org/grpc/server.go:957 +0x514
/build/vendor/google.golang.org/grpc/rpc_util.go:511 +0x5e
google.golang.org/grpc.(*Server).processUnaryRPC(0xc000001680, 0xafabc0, 0xc000097380, 0xc0004b2400, 0xc000150ae0, 0xf2ca90, 0x0, 0x0, 0x0)
/build/vendor/github.com/envoyproxy/go-control-plane/envoy/service/auth/v2alpha/external_auth.pb.go:569 +0x7f
I'm using go module for my dependency management.
require (
github.com/dgrijalva/jwt-go v3.2.0+incompatible
github.com/envoyproxy/go-control-plane v0.6.3
github.com/fsnotify/fsnotify v1.4.7
github.com/gogo/googleapis v1.1.0
github.com/gogo/protobuf v1.1.1
github.com/lyft/protoc-gen-validate v0.0.11 // indirect
golang.org/x/net v0.0.0-20181114220301-adae6a3d119a // indirect
google.golang.org/grpc v1.16.0
)
Go version: 1.11
Created an xDS gRPC v2 API for Envoy as suggested in go-control-plane/pkg/test/main/main.go. Callback, Management Server and Gateway are the same; AccessLogServer is removed. It works perfectly, and CDS, RDS and LDS are updated successfully.
The problem is with the envoyproxy (docker) service. At first, on a change of snapshot, the gRPC API sends a StreamResponse, then envoyproxy makes a StreamRequest and updates xDS. After ~15 mins, on a change of snapshot, the gRPC API still sends a StreamResponse but envoyproxy makes no StreamRequest, and hence no xDS is updated. After this, if I restart envoyproxy, StreamRequest is called and xDS is updated. The problem reappears ~15 mins after each envoyproxy restart.
Something I noticed:

func (cb *callbacks) OnStreamResponse(id int64, req *v2.DiscoveryRequest, res *v2.DiscoveryResponse) {}

id here changes only when I restart envoyproxy. For a session of envoyproxy it stays the same for any number of StreamResponses and StreamRequests. Is it meant to stay the same, or should it change for each request?

My envoy_config_yml:
admin:
  access_log_path: /dev/null
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901
dynamic_resources:
  ads_config:
    api_type: GRPC
    refresh_delay: 30s
    cluster_names:
      - xds_cluster
  cds_config:
    ads: {}
  lds_config:
    ads: {}
node:
  cluster: my-cluster
  id: mystack
static_resources:
  clusters:
    - connect_timeout: 1s
      hosts:
        - socket_address:
            address: envoy-discovery-service # docker service name
            port_value: 18000
      http2_protocol_options: {}
      name: xds_cluster
      type: logical_dns
      dns_lookup_family: V4_ONLY
      dns_refresh_rate: 10s
Note: I am using one docker stack with envoyproxy and envoy-discovery-service as 2 different services.
Due to temporary protocol issues, Envoy is pinned to the 3/7/19 build here: https://github.com/envoyproxy/go-control-plane/pull/158/files#diff-3254677a7917c6c01f55212f86c57fbfR11. Unpin it once resolved.
In ADS mode, it is preferable to respond to xDS queries in the following order, to match the topological ordering of cross-references between xDS resources: CDS, EDS, LDS, RDS.
We should support sequencing the responses in this order. This is mostly applicable to ADS mode, since in xDS mode, snapshots may be partial (e.g. only RDS).
Is there a way we can use a docker service name, e.g. webapp_dev_webapp.1.no9rqsfuz73k026uod3hsjk83, as core.SocketAddress.Address?
A docker service name for a static cluster address defined in the Envoy config YAML works well and resolves fine, since the cluster type is logical_dns. For dynamic Envoy config, the cluster type is EDS, and each cluster has its endpoints. If I provide a normal IP address and port for each endpoint, it works fine.
But each of my services is hosted in docker, and there is an issue using docker's internal IP address, which is in subnet format. My option is to use the docker services' host names; as Envoy itself is in the same docker network, resolving the hostname should not be a problem.
I initialized Endpoint as
endpoints := make([]cache.Resource, 1)
var eps []endpoint.LbEndpoint
eps = append(eps, endpoint.LbEndpoint{Endpoint: &endpoint.Endpoint{
    Address: &core.Address{
        Address: &core.Address_SocketAddress{
            SocketAddress: &core.SocketAddress{
                Protocol:     core.TCP,
                Address:      "webapp_dev_webapp.1.no9rqsfuz73k026uod3hsjk83",
                ResolverName: "LOGICAL_DNS",
                PortSpecifier: &core.SocketAddress_PortValue{
                    PortValue: uint32(8080),
                },
            },
        },
    },
}})
endpoints[0] = &v2.ClusterLoadAssignment{
    ClusterName: "my_cluster",
    Endpoints: []endpoint.LocalityLbEndpoints{{
        LbEndpoints: eps,
    }},
}
edsSource := &core.ConfigSource{
    ConfigSourceSpecifier: &core.ConfigSource_Ads{
        Ads: &core.AggregatedConfigSource{},
    },
}
cluster := &v2.Cluster{
    Name:           "my_cluster",
    ConnectTimeout: 5 * time.Second,
    Type:           v2.Cluster_EDS,
    EdsClusterConfig: &v2.Cluster_EdsClusterConfig{
        EdsConfig: edsSource,
    },
}
// some listener and routes too
After generating and setting a snapshot with the above settings, Envoy updates LDS, RDS and CDS, but throws an error in EDS:

[warning][config] bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_mux_subscription_lib/common/config/grpc_mux_subscription_impl.h:70] gRPC config for type.googleapis.com/envoy.api.v2.ClusterLoadAssignment rejected: Unknown address resolver: LOGICAL_DNS. Consider setting resolver_name or setting cluster type to 'STRICT_DNS' or 'LOGICAL_DNS'

Is a docker hostname via an EDS cluster not supported yet, or am I missing something?
Currently, we need two separate identities for proxies:
In the current implementation, they are treated as 1-1, which means there are as many snapshots as there are proxies. In some cases this is necessary, if all proxies are indeed different. But in many cases, proxies should receive the same configuration if grouped appropriately.
When trying to add simple rate limiting to an implementation of Callbacks, it occurred to me that OnStreamRequest() cannot close stream processing, as it does not return an error. It could be a common case to allow the server to close the stream on specific requests.
Envoy is temporarily pinned to 3/7/19.
https://github.com/envoyproxy/go-control-plane/pull/158/files#diff-3254677a7917c6c01f55212f86c57fbfR11
The management server has to maintain a cache of the desired configurations for all proxies in the mesh. Since there could be a lot more proxies than different configurations, it seems we want to be able to group proxy nodes into buckets.
Let's start with a generic bucketing function:
func nodeGroup(node *core.Node) string
For example, we can use the cluster field in the node to group API responses if the cluster corresponds to the service cluster. Alternatively, we can use the generic metadata field in the node message to assign configuration to proxies (e.g. in kubernetes, we may use pod labels to uniquely identify sidecars).
The cache is a map from the node group and response type to the desired response:
var cache map[string]map[ResponseType]proto.Message // node group -> response type -> response
Revision: 39c73ae5ba014b648d4164a24340a78b6b761978
I am building a docker image with an instance of the Aggregated Discovery Service that imports this library as a package. I am receiving a syntax error from cache.go:
vendor/github.com/envoyproxy/go-control-plane/pkg/cache/cache.go:25: syntax error: unexpected = in type declaration
make: *** [build] Error 2
I'm not entirely sure why this is happening because line 25 is valid syntax.
I am using dep for my dependency management.
[[projects]]
branch = "master"
name = "github.com/envoyproxy/go-control-plane"
packages = [
"envoy/api/v2",
"envoy/api/v2/auth",
"envoy/api/v2/cluster",
"envoy/api/v2/core",
"envoy/api/v2/endpoint",
"envoy/api/v2/listener",
"envoy/api/v2/ratelimit",
"envoy/api/v2/route",
"envoy/config/accesslog/v2",
"envoy/config/bootstrap/v2",
"envoy/config/filter/accesslog/v2",
"envoy/config/filter/fault/v2",
"envoy/config/filter/network/http_connection_manager/v2",
"envoy/config/filter/network/tcp_proxy/v2",
"envoy/config/metrics/v2",
"envoy/config/ratelimit/v2",
"envoy/config/trace/v2",
"envoy/service/accesslog/v2",
"envoy/service/discovery/v2",
"envoy/type",
"pkg/cache",
"pkg/log",
"pkg/server",
"pkg/test",
"pkg/test/resource",
"pkg/util"
]
revision = "39c73ae5ba014b648d4164a24340a78b6b761978"
Most of our disco/metrics/etc. infrastructure is built around Java, so I've been working on a Java port of go-control-plane. Would there be interest in creating a new repo under the envoyproxy org for it?
I've basically followed the same design as go-control-plane, with just some minor tweaks to make it more idiomatic for the Java world.
We need a linter check so that we do not accidentally generate non-deterministic marshalling code.
Would it make sense to add a comparison method to the simple cache, where you can pass an old snapshot and compare it to what you want the new snapshot to be, comparing everything except the version?
If using a timer-based updater (naive approach), this would allow us to avoid causing proxies to update unless changes occur. There is likely a better approach to this.
ADS mode requires holding the response until the entire set of snapshot names is requested (to simplify version tracking of partial name subsets). That means Snapshot must be self-consistent when submitted to the cache; it should have as many cluster load assignments as there are clusters. However, the client of the library may want to use one pool of cluster load assignments for all proxies and all snapshots. We should relax the consistency check to validate that the set of clusters is a subset of the cluster load assignments, not a precise match.
That should help with developer experience since 1.10 manages build caches much better.
The duration-style fields in core.HealthCheck are defined using the gogo/types Duration struct:

type HealthCheck struct {
    // The time to wait for a health check response. If the timeout is reached the
    // health check attempt will be considered a failure.
    Timeout *google_protobuf3.Duration `protobuf:"bytes,1,opt,name=timeout" json:"timeout,omitempty"`
    // The interval between health checks.
    Interval *google_protobuf3.Duration `protobuf:"bytes,2,opt,name=interval" json:"interval,omitempty"`
}

Elsewhere in the core package, the more convenient time.Duration type is used:

type ApiConfigSource struct {
    // ...
    RefreshDelay *time.Duration `protobuf:"bytes,3,opt,name=refresh_delay,json=refreshDelay,stdduration" json:"refresh_delay,omitempty"`
}

Why are there differences in the two definitions? Is it possible to change HealthCheck to use *time.Duration?
This is a feature request.
In short: how about adding a function to generate a snapshot from a YAML or JSON file?
Like this:

import "pkg/cache"

func NewSnapshotFromYAML(filePath string) cache.Snapshot {}

I think we need this abstracted API if we want to manage the resources that become the xDS responses as YAML.
For example, if we want to review the data-plane configuration with multiple people, we need to manage it with YAML or JSON files.
Hi @kyessenov,
we would like to use some of the more recent schema changes, but would prefer a stable tag. Do you have time to cut one in the next few days?
Thanks!
github.com\envoyproxy\go-control-plane\envoy\api\v2\cds.pb.go:2226:21: c.cc.NewStream undefined (type *grpc.ClientConn has no field or method NewStream)
It seems that we should optionally support an ordering operator on the versions, to make it easier to load-balance xDS requests between control-plane servers. If a server receives a request for a newer version, it should withhold its reply until it catches up. For example, let's say we go through versions v1, v2, v3, and the state is:
If envoy makes a request to server1, then server1 should withhold until it receives v3 rather than replying with v1.
So with Turbine Labs going away, there's not going to be anybody around to work on Rotor. Rotor is a control plane based on go-control-plane that adds EDS integration with common service discovery registries like Kubernetes, Consul, and AWS (EC2 tags and ECS).
If the Envoy project wants it, it's yours.
The community needs an out-of-the-box control plane
Rotor exists based on Turbine Labs' experience trying to get people up and running with Envoy at scale, both as a support vendor and a community member (mostly Slack). Most people figure out quickly that they need a control plane, and they typically disappear for 2-4 weeks trying to write an EDS integration. They generally win the battle (producing a control plane) but lose the war (maintaining the control plane becomes a drag on successfully deploying Envoy everywhere).
The primary motivation for users is EDS integration with either Consul or Kubernetes. Rotor solves this without writing any code, which has been a huge win for a lot of folks.
In addition, the notion of a centralized control plane has been adopted broadly, and Rotor provides some facility for defining listeners and routes in a single place (a config file for Rotor, which is roughly the same format as Envoy LDS). Many users find this operationally preferable to managing and distributing static config files, as it gets them on a path to use xDS over gRPC quickly.
Most of these can be overcome with some work, but I'll put all the cards on the table (to the best of my ability).
Rotor used to be commercially backed. In particular, it has code to connect to the Turbine Labs API, which is no longer functional. It's easy enough to ignore, since that code path is only used when you pass an API key. But perhaps one would want to remove the CLI help text or the code soon after adopting the project.
Rotor's documentation is partially hosted on the Turbine Labs blog. The blog isn't going away (thanks, Medium!), but one might want to fold the intro post or the big update post into the Rotor's README.
Istio's Pilot has some of the same goals. Strategically, it may be more interesting to adopt Pilot and put work towards that. Rotor's primary advantage was always with folks with significant non-Kubernetes deployments, and there are a lot of those people out there. I suspect it will be a while before Pilot can be the blessed solution for typical users, because it's going to be a while before most people are 100% on Kubernetes. Even then, Istio's momentum is real, and having two suggested control planes may not be a win for users.
Rotor is not a particularly robust solution for LDS and RDS. It's just another place to write RDS and LDS configuration. Turbine Labs felt that this was our commercial wheelhouse, so Rotor was never going to, e.g. read routes from Consul tags and set up routes based on that. We've seen users adopt Rotor as a way to configure LDS/RDS with just config file vs. writing code for go-control-plane, but this area of functionality could be much more full-featured.
If you ask me, I think there's value in at least having Rotor brought under the Envoy / CNCF care as a companion project. There's a real need for a more standardized control plane story, whether that's Rotor or some other project.
Happy to discuss anything else in this issue, if there are other questions or concerns about the project!
I have implemented an ADS server for a basic control plane using this package. I can send the data with proper versioning, but for a reason I cannot understand, Envoy keeps closing the gRPC stream, which gets logged as a warning (see below). This might be the expected behavior, but it feels somewhat wrong to me. My understanding of gRPC is to reuse existing connections as much as possible.
Now I have experimented with the KeepaliveEnforcementPolicy and KeepaliveParams on the gRPC server, but it doesn't change anything. As far as I know, everything is pretty much vanilla.
I understand this might be a problem in https://github.com/envoyproxy/envoy too, but I thought to ask here first.
envoy_1 | [2019-01-08 18:46:49.320][000006][info][main] [source/server/server.cc:463] starting main dispatch loop
envoy_1 | [2019-01-08 18:46:49.325][000006][info][upstream] [source/common/upstream/cluster_manager_impl.cc:132] cm init: initializing cds
envoy_1 | [2019-01-08 18:46:49.567][000006][warning][upstream] [source/common/config/grpc_mux_impl.cc:268] gRPC config stream closed: 0,
envoy_1 | [2019-01-08 18:46:50.275][000006][warning][upstream] [source/common/config/grpc_mux_impl.cc:268] gRPC config stream closed: 0,
See https://github.com/DecipherNow/fabric-experiments/tree/master/metrics-test
This has 1 HTTP client and 1 gRPC client, each talking to a 'store' service which talks to a 'bank' service. The clients send simple requests at random intervals. The servers add random amounts of latency.
To facilitate testing dynamic routing I propose to add:
This will prepare for the test platform which will have
Maintaining a cache of responses requires garbage collection of responses for stale nodes.
Since the goal for now is a total cache (e.g. the entire response cache is loaded on demand or a priori), the right strategy seems to be one of, or a combination of:
Blocked on bufbuild/protoc-gen-validate#52
Generated API uses "golang.org/x/net/context".Context; it should use "context".Context.
../github.com/envoyproxy/go-control-plane/pkg/server/server.go:65:41: cannot use server literal (type *server) as type Server in return argument:
*server does not implement Server (wrong type for FetchClusters method)
have FetchClusters("context".Context, "github.com/envoyproxy/go-control-plane/envoy/api/v2".DiscoveryRequest) ("github.com/envoyproxy/go-control-plane/envoy/api/v2".DiscoveryResponse, error)
want FetchClusters("golang.org/x/net/context".Context, "github.com/envoyproxy/go-control-plane/envoy/api/v2".DiscoveryRequest) ("github.com/envoyproxy/go-control-plane/envoy/api/v2".DiscoveryResponse, error)
Compilation finished with exit code 2
Is there interest in switching from goproto to gogoproto?
The generated code looks a lot more pleasant to work with (oneofs are terrible in goproto), plus we may get some performance improvements.
WDYT?
We should make some basic ping-pong style tests for the server. That requires a client. Should we mock Envoy or just use the real thing? If we use a real envoy, we can probably query its admin interface to validate config application.
We've run into an issue when using Envoy with a Snapshot Cache configured for ADS mode and statically defined clusters that are discovered through EDS.
This comes from trying to use EDS to discover the endpoints for the tracing service, for instance, in our Envoy config we have:
tracing:
  http:
    name: envoy.zipkin
    config:
      collector_cluster: trace-collector
      collector_endpoint: /api/v1/spans
And then a static cluster configured for EDS:
static_resources:
  clusters:
    - name: trace-collector
      connect_timeout: 0.25s
      lb_policy: ROUND_ROBIN
      type: EDS
      eds_cluster_config:
        eds_config:
          api_config_source:
            api_type: GRPC
            grpc_services:
              - envoy_grpc:
                  cluster_name: xds_cluster
We've noticed that if we configure the rest of Envoy's CDS and EDS to use the ADS stream, we run into a deadlock on initial Envoy boot.
I've traced the request flow and it goes like this:

1. Envoy requests the trace-collector endpoints through EDS.
2. [...] trace-collector.
3. [...] trace-collector resource, since this was already discovered separately.

To work around this issue, we have to exclude trace-collector from the snapshot used for ADS.
My question is, what is the motivation behind ensuring the requested resources are a superset of the cached snapshot resources? My gut feeling is that the other way around makes more sense: the cached snapshot is a superset of the requested resources.
Hi. Thank you very much for the implementation. It helps me a lot.
I tried this but hit an error at cache.Consistent(). It compares against the number of endpoints, but GetResourceReference (over the cluster items) returns the number of ClusterConfigs, not the number of endpoints. It returns at most one reference for each cluster item.
https://github.com/envoyproxy/go-control-plane/blob/master/pkg/cache/snapshot.go#L74-L75
https://github.com/envoyproxy/go-control-plane/blob/master/pkg/cache/resource.go#L87-L95
How do I run the control plane? I am able to run all the make commands.
I am not sure how to start this control plane; is some more work required?
Update:
Now I am able to run it by: make integration.rest
--> building test binary
env XDS=rest build/integration.sh
INFO[0000] upstream listening HTTP/1.1 port=18080
INFO[0000] access log server listening port=18090
INFO[0000] gateway listening HTTP/1.1 port=18001
INFO[0000] waiting for the first request...
INFO[0000] management server listening port=18000
I see something like this, so some endpoints are up that I can listen to. Still, a reference would be really helpful: how can I run RDS with any platform? Any pointers in that direction so I can go figure it out?
I tried this project to write my own CDS; however, I found that the gRPC interface has changed.
In this project, it's

func (*server) StreamClusters(stream v2.ClusterDiscoveryService_StreamClustersServer) error {
    // ...
}

but in data-plane-api, it changed to

service ClusterDiscoveryService {
    rpc StreamClusters(stream DiscoveryRequest) returns (stream DiscoveryResponse) {
    }
}

So, has it changed?
The xDS protocol was designed with an eventual consistency goal in mind. Under certain circumstances, there is a possibility of traffic loss in the middle of a configuration update. For example, a transition from a listener L1 using cluster C1 to a listener L2 using cluster C2 via two separate CDS and LDS updates would cause a brief loss in the inconsistent state (L1, C2) or (L2, C1). The way around this problem is to sequence updates so that resources are not removed while in use. This requires using ADS to multiplex all resource updates over a single stream, as well as sequencing logic in the management server. In this scenario, the correct sequence is:
We can model this problem as batching atomic updates. The server receives a logical operation to update the desired configuration to {L2, C2}. The server keeps track of {L1, C1} applied remotely at the proxy. The server then creates a sequence of updates as above and streams the sequence of updates to CDS and LDS types through ADS.
For this to happen, the server has to take the logical batched updates to multiple xDS resource types at a time as input. The server also has to maintain the remote state of applied xDS resources to correctly compute the diff and remove stale resources from the remote proxies.
I miss the status endpoint of the lyft/discovery reference SDS implementation. Is there a way to get the same information using the v2 messages?
Assuming the following valid Envoy JSON configuration file (validated by actually running an envoy instance with this):
{
  "admin": {
    "access_log_path": "/dev/null"
  },
  "node": {
    "cluster": "example-service-1",
    "id": "id-1"
  },
  "static_resources": {
    "clusters": [
      {
        "connect_timeout": "2.00s",
        "lb_policy": 1,
        "name": "cluster1"
      }
    ],
    "listeners": [
      {
        "address": {
          "socket_address": {
            "address": "0.0.0.0",
            "port_value": 9090,
            "ipv4_compat": true
          }
        },
        "name": "example-service",
        "filter_chains": [
          {
            "filters": [
              {
                "name": "envoy.http_connection_manager",
                "drain_timeout": "3000s",
                "stat_prefix": "ingress_http",
                "config": {
                  "access_log": [
                    {
                      "config": {
                        "path": "/dev/stdout"
                      },
                      "name": "envoy.file_access_log"
                    }
                  ],
                  "codec_type": "auto",
                  "http_filters": [
                    {
                      "name": "envoy.router"
                    }
                  ],
                  "route_config": {
                    "virtual_hosts": [
                      {
                        "domains": ["*"],
                        "name": "backend",
                        "routes": [
                          {
                            "match": {
                              "prefix": "/"
                            },
                            "route": {
                              "cluster": "cluster1",
                              "timeout": "3000s",
                              "prefix_rewrite": "/"
                            }
                          }
                        ]
                      }
                    ]
                  }
                }
              }
            ]
          }
        ]
      }
    ]
  }
}
Why would the following unmarshaling code fail with this error:
bootstrap := &v2.Bootstrap{}
err := jsonpb.Unmarshal(r, bootstrap)
if err != nil {
    return nil, stacktrace.Propagate(err, "failed in unmarshalling config bytes into bootstrap object")
}
return bootstrap, nil
failed in unmarshalling config bytes into bootstrap object
--- at confutil/config.go:91 (unmarshal) ---
Caused by: unknown field "drain_timeout" in listener.Filter
drain_timeout is a valid param in listener.Filter, unless I am misunderstanding something. For clarification, we are using the package "github.com/gogo/protobuf/jsonpb" to handle the unmarshaling of JSON into the proto structs.
Please use this issue to discuss general setup. I want to agree on a bunch of things before anyone starts pushing code:
Hello,
Can we please have a new tagged release to pick up #102?
If this tag could be in semver form, that would be very helpful in letting me transition my project from dep to Go 1.11's module system.
Thank you.
I want to make some changes to go-control-plane locally. Prior to making any modifications, I decided to run the integration tests before doing anything else. Surprisingly, they failed:
make integration.docker
--> building Linux test binary
--> building test docker image
Sending build context to Docker daemon 182.3MB
Step 1/5 : FROM envoyproxy/envoy:latest
---> cf2442a79328
Step 2/5 : ADD sample /sample
---> Using cache
---> 901328cc36a6
Step 3/5 : ADD build/integration.sh build/integration.sh
---> Using cache
---> e9f5d2931ab0
Step 4/5 : ADD bin/test-linux /bin/test
---> Using cache
---> 21abea4c5a7c
Step 5/5 : ENTRYPOINT build/integration.sh
---> Using cache
---> c430b556d974
Successfully built c430b556d974
Successfully tagged test:latest
docker run -it -e "XDS=ads" test -debug
INFO[0000] upstream listening HTTP/1.1 port=18080
INFO[0000] waiting for the first request...
INFO[0000] gateway listening HTTP/1.1 port=18001
INFO[0000] access log server listening port=18090
INFO[0000] management server listening port=18000
ERRO[0000] http: Server closed
ERRO[0000] http: Server closed
DEBU[0000] stream 1 open for
DEBU[0000] open watch 1 for type.googleapis.com/envoy.api.v2.Cluster[] from nodeID "test-id", version ""
INFO[0000] initial snapshot {Xds:ads Version: UpstreamPort:18080 BasePort:9000 NumClusters:4 NumHTTPListeners:2 NumTCPListeners:2}
INFO[0000] executing sequence requests=5 updates=3
INFO[0000] update snapshot version=v0
DEBU[0000] respond open watch 1[] with new version "v0"
DEBU[0000] respond type.googleapis.com/envoy.api.v2.Cluster[] version "" with version "v0"
DEBU[0000] respond type.googleapis.com/envoy.api.v2.ClusterLoadAssignment[cluster-v0-3 cluster-v0-2 cluster-v0-1 cluster-v0-0] version "" with version "v0"
DEBU[0000] open watch 2 for type.googleapis.com/envoy.api.v2.Cluster[] from nodeID "test-id", version "v0"
INFO[0000] request batch batch=0 failed=4 ok=0 pass=false
DEBU[0000] respond type.googleapis.com/envoy.api.v2.Listener[] version "" with version "v0"
DEBU[0000] open watch 3 for type.googleapis.com/envoy.api.v2.ClusterLoadAssignment[cluster-v0-3 cluster-v0-2 cluster-v0-1 cluster-v0-0] from nodeID "test-id", version "v0"
DEBU[0000] respond type.googleapis.com/envoy.api.v2.RouteConfiguration[route-v0-1 route-v0-0] version "" with version "v0"
DEBU[0000] open watch 4 for type.googleapis.com/envoy.api.v2.Listener[] from nodeID "test-id", version "v0"
DEBU[0000] open watch 5 for type.googleapis.com/envoy.api.v2.RouteConfiguration[route-v0-1 route-v0-0] from nodeID "test-id", version "v0"
INFO[0000] request batch batch=1 failed=4 ok=0 pass=false
INFO[0001] request batch batch=2 failed=4 ok=0 pass=false
INFO[0001] request batch batch=3 failed=4 ok=0 pass=false
INFO[0002] request batch batch=4 failed=4 ok=0 pass=false
INFO[0002] server callbacks fetches=0 requests=8
ERRO[0002] failed all requests in a run 0
Envoy log: envoy.log
Makefile:89: recipe for target 'integration.docker' failed
make: *** [integration.docker] Error 1
Note these lines:
ERRO[0000] http: Server closed
ERRO[0000] http: Server closed
If you look at the code for one of them, you can see:
func RunHTTP(ctx context.Context, upstreamPort uint) {
	log.WithFields(log.Fields{"port": upstreamPort}).Info("upstream listening HTTP/1.1")
	server := &http.Server{Addr: fmt.Sprintf(":%d", upstreamPort), Handler: echo{}}
	go func() {
		if err := server.ListenAndServe(); err != nil {
			log.Error(err)
		}
	}()
	if err := server.Shutdown(ctx); err != nil {
		log.Error(err)
	}
}
It's not waiting for the goroutine to actually finish, so it just shuts the server down immediately. I'm guessing that on my laptop this happens almost always, whereas in CircleCI the timing is different?
I've fixed it locally by blocking on ctx.Done(), like the gRPC server methods do.
I'm using go 1.11.
I tested this against e79e039 and revert_grpc_version_update
I'll submit a PR soon.
Hello,
Is it possible to get a new tagged release of this package including #92 and friends?
Thanks
Dave
/cc @kyessenov
The code in package server imports package context, but the generated code in package api references x/net/context. In addition, the code references a bunch of packages that are missing from the vendor directory.
On a clean checkout, after running make depend.install, both make test and make build fail (on go 1.8.3; Ubuntu). Output below the fold:
$ make test
--> running unit tests
# github.com/envoyproxy/go-control-plane/pkg/server
pkg/server/server.go:68: cannot use server literal (type *server) as type Server in return argument:
*server does not implement Server (wrong type for FetchClusters method)
have FetchClusters("context".Context, *api.DiscoveryRequest) (*api.DiscoveryResponse, error)
want FetchClusters("github.com/envoyproxy/go-control-plane/vendor/golang.org/x/net/context".Context, *api.DiscoveryRequest) (*api.DiscoveryResponse, error)
[... snip successful tests in other packages ...]
and
$ make build
vendor/github.com/lyft/protoc-gen-validate/checker.go:13:2: cannot find package "github.com/lyft/protoc-gen-star" in any of:
$GOPATH/src/github.com/envoyproxy/go-control-plane/vendor/github.com/lyft/protoc-gen-star (vendor tree)
/usr/lib/go-1.8/src/github.com/lyft/protoc-gen-star (from $GOROOT)
$GOPATH/src/github.com/lyft/protoc-gen-star (from $GOPATH)
vendor/github.com/lyft/protoc-gen-validate/tests/harness/executor/harness.go:13:2: cannot find package "github.com/lyft/protoc-gen-validate/tests/harness" in any of:
$GOPATH/src/github.com/envoyproxy/go-control-plane/vendor/github.com/lyft/protoc-gen-validate/tests/harness (vendor tree)
/usr/lib/go-1.8/src/github.com/lyft/protoc-gen-validate/tests/harness (from $GOROOT)
$GOPATH/src/github.com/lyft/protoc-gen-validate/tests/harness (from $GOPATH)
vendor/github.com/lyft/protoc-gen-validate/tests/harness/executor/cases.go:14:2: cannot find package "github.com/lyft/protoc-gen-validate/tests/harness/cases/go" in any of:
$GOPATH/src/github.com/envoyproxy/go-control-plane/vendor/github.com/lyft/protoc-gen-validate/tests/harness/cases/go (vendor tree)
/usr/lib/go-1.8/src/github.com/lyft/protoc-gen-validate/tests/harness/cases/go (from $GOROOT)
$GOPATH/src/github.com/lyft/protoc-gen-validate/tests/harness/cases/go (from $GOPATH)
vendor/golang.org/x/net/http2/h2i/h2i.go:38:2: cannot find package "golang.org/x/crypto/ssh/terminal" in any of:
$GOPATH/src/github.com/envoyproxy/go-control-plane/vendor/golang.org/x/crypto/ssh/terminal (vendor tree)
/usr/lib/go-1.8/src/golang.org/x/crypto/ssh/terminal (from $GOROOT)
$GOPATH/src/golang.org/x/crypto/ssh/terminal (from $GOPATH)
vendor/google.golang.org/grpc/credentials/oauth/oauth.go:28:2: cannot find package "golang.org/x/oauth2" in any of:
$GOPATH/src/github.com/envoyproxy/go-control-plane/vendor/golang.org/x/oauth2 (vendor tree)
/usr/lib/go-1.8/src/golang.org/x/oauth2 (from $GOROOT)
$GOPATH/src/golang.org/x/oauth2 (from $GOPATH)
vendor/google.golang.org/grpc/credentials/oauth/oauth.go:29:2: cannot find package "golang.org/x/oauth2/google" in any of:
$GOPATH/src/github.com/envoyproxy/go-control-plane/vendor/golang.org/x/oauth2/google (vendor tree)
/usr/lib/go-1.8/src/golang.org/x/oauth2/google (from $GOROOT)
$GOPATH/src/golang.org/x/oauth2/google (from $GOPATH)
vendor/google.golang.org/grpc/credentials/oauth/oauth.go:30:2: cannot find package "golang.org/x/oauth2/jwt" in any of:
$GOPATH/src/github.com/envoyproxy/go-control-plane/vendor/golang.org/x/oauth2/jwt (vendor tree)
/usr/lib/go-1.8/src/golang.org/x/oauth2/jwt (from $GOROOT)
$GOPATH/src/golang.org/x/oauth2/jwt (from $GOPATH)
vendor/google.golang.org/grpc/examples/helloworld/mock_helloworld/hw_mock.go:7:2: cannot find package "github.com/golang/mock/gomock" in any of:
$GOPATH/src/github.com/envoyproxy/go-control-plane/vendor/github.com/golang/mock/gomock (vendor tree)
/usr/lib/go-1.8/src/github.com/golang/mock/gomock (from $GOROOT)
$GOPATH/src/github.com/golang/mock/gomock (from $GOPATH)
Makefile:34: recipe for target 'build' failed
make: *** [build] Error 1
Thank you for the package, but listener.Filter.Config (https://godoc.org/github.com/envoyproxy/go-control-plane/envoy/api/v2/listener#Filter) has recently been deprecated in favor of listener.Filter.ConfigType, which wraps a types.Any, where types is "github.com/gogo/protobuf/types". I tried converting an &envoyhcm.HttpConnectionManager{} using types.MarshalAny(), but it doesn't seem to work. My question is: how do I use an instance of HttpConnectionManager so that it is compliant with the new listener.Filter.ConfigType?
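For reference, a pattern that should work is sketched below. This is an untested sketch: the package aliases envoyhcm, envoylistener, and util are assumptions matching the examples above, and the key point is that types.MarshalAny expects a proto.Message, i.e. a pointer to the struct, not the struct value.

```go
// Untested sketch; assumes the v2 API and gogo/protobuf types.
hcm := &envoyhcm.HttpConnectionManager{
	CodecType:  envoyhcm.AUTO,
	StatPrefix: "ingress_http",
	// RouteSpecifier: ...,
}

// MarshalAny takes a proto.Message, so pass the pointer, not the struct value.
anyHCM, err := types.MarshalAny(hcm)
if err != nil {
	panic(err)
}

filter := envoylistener.Filter{
	Name: util.HTTPConnectionManager,
	ConfigType: &envoylistener.Filter_TypedConfig{
		TypedConfig: anyHCM,
	},
}
```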
At the basic level, the management server provides a streaming gRPC server that implements the xDS APIs. This server accepts connections from Envoy instances and streams configuration updates to all connected proxies.
The server has to keep track of (version_info, nonce) for each connected Node, as well as the type of the requested resource (routes, listeners, clusters, or multiplexed through a single ADS stream). A dedicated goroutine will be used for each (Node, resource_type) tuple. These goroutines can run in parallel, and each one maintains its (version_info, nonce) state through request and response cycles.
The job of each routine is to drive the desired configuration into the remote proxy. The routine uses the version and nonce to determine whether the configuration was successfully applied; if it wasn't, the routine retries.
The desired configuration is an input to each routine or the pool of routines. This can be an individual resource instance (EDS, CDS, RDS, or LDS), or a batch of resources to permit staging coordinated updates (TBD in another issue).
As a first cut, we can stand up a server that hosts a pool of go-routines and manages them (by spinning out on connection request, allocating state for each, and establishing channels for communication).
If envoy rejects a config, the snapshot cache will attempt to push the same config immediately after envoy requests it again. If envoy does not limit its requests, the server will drive envoy into a loop of request-receive-reject, potentially causing unnecessary CPU load.