Comments (3)
It looks like the problem is the number of connections Envoy is trying to establish. If we reduce the number of fortio servers (by removing a serve_port), things look better. I guess a "correct" fix is to set sensible limits on max_connections, max_pending_requests, and max_requests.
from heyp-agents.
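As a rough illustration of the limits mentioned above, the sketch below builds the `circuit_breakers` block of an Envoy cluster as a Python dict and prints it as JSON. The field names (`thresholds`, `priority`, `max_connections`, `max_pending_requests`, `max_requests`) are Envoy's circuit breaker settings. Everything else is an assumption made for illustration: the helper names, the budget numbers, the even split of one budget across backend clusters, and the idea that each fortio server maps to its own cluster. None of it is taken from the heyp-agents configuration.

```python
import json


def circuit_breakers(max_connections, max_pending_requests, max_requests):
    """Return the circuit_breakers block for a single Envoy cluster.

    Hypothetical helper: only the keys inside the returned dict are Envoy
    field names; the function itself is illustrative.
    """
    limits = {
        "max_connections": max_connections,
        "max_pending_requests": max_pending_requests,
        "max_requests": max_requests,
    }
    for name, value in limits.items():
        if value < 1:
            raise ValueError(f"{name} must be at least 1, got {value}")
    return {
        "circuit_breakers": {
            "thresholds": [{"priority": "DEFAULT", **limits}],
        }
    }


def per_cluster_limit(total_budget, num_clusters):
    """Split one overall budget evenly across clusters (assumed policy)."""
    if num_clusters < 1:
        raise ValueError("num_clusters must be at least 1")
    return max(1, total_budget // num_clusters)


# Illustrative numbers only; real values would need to come from measurement.
NUM_BACKEND_CLUSTERS = 4
cluster_fragment = circuit_breakers(
    max_connections=per_cluster_limit(1024, NUM_BACKEND_CLUSTERS),
    max_pending_requests=per_cluster_limit(2048, NUM_BACKEND_CLUSTERS),
    max_requests=per_cluster_limit(4096, NUM_BACKEND_CLUSTERS),
)

print(json.dumps(cluster_fragment, indent=2))
```

The printed fragment would be merged into each backend cluster's definition. With an even split like this one, adding more fortio servers lowers the per-cluster limits instead of raising the total number of connections Envoy may open.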
Since the problem is overload between the backends and Envoy, we can mitigate it as noted in the previous comment. This is not impacting any current results.
from heyp-agents.
Following up after seeing this again: the problem appears to be CPU starvation in Envoy. One thing that has helped is enabling http2 communication between clients and Envoy (Envoy already spoke http2 to the backends; now all communication is cleartext http2).
Overall, it seems like we are just barely touching the threshold where Envoy becomes CPU-overloaded, and if we're slightly under it, things are OK.
The best thing would be to use bigger machines for Envoy. Unfortunately, only xl170 machines can use the user-allocatable switches on cloudlab, so to do this we'd need to set up relay xl170 machines that NAT traffic to the actual machines over another network. However, connecting xl170 machines to two networks is blocked by a cloudlab issue.
from heyp-agents.
Related Issues (20)
- No debug logging at host agent with no limits HOT 1
- inc-nl no longer violates approval HOT 2
- Unexplained throughput drop over time HOT 2
- Unexplained zero rate limits HOT 1
- experiments/dc-sim: rate limit error is suspect
- Support downgrade using usage, not demand
- Shared state between FG downgrade selector?
- Retry OS calls in vfortio
- Ensure we reset vfortio state between runs. HOT 1
- Install and use newer ss binary. HOT 1
- Fix state management to use intended QoS for measuring HIPRI / LOPRI usage
- Plug in feedback controller to both cluster controllers HOT 1
- Check retransmission counting HOT 1
- Mark traffic destined for relay with correct cluster HOT 2
- Envoy admin interface port 0
- Invalid frac_lopri crashes cluster-agent
- Check that a stuck host agent doesn't get the cluster agent stuck HOT 1
- Longer LOPRI didn't work in a run
- Set job field for InitSimulatedWan