Comments (8)
Neat! This sounds like it's Go trying to protect programs from hitting up against OS limits. I wonder if you only need to use sysctl to set a limit higher or if it's something we have to work around.
Approximately how often is this happening?
You're running this on FreeBSD, right?
from chihaya.
Hi @jzelinskie
Currently about every 4 days or so. Yes, FreeBSD 12.2.
I can't find a tunable that addresses the issue (so far).
root@explodie:~ # sysctl -a | grep 1048575
kern.maxfiles: 10485750
<last>1048575991</last>
<end>1048575991</end>
debug.nchash: 1048575
None of those apply.
Here is the go source code line we hit :D
https://go.googlesource.com/go/+/68e28998d7f094e70cef7ec0bef9fabfa9e17d07/src/internal/poll/fd_mutex.go#18
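For reference, 1048575 is 2^20 - 1, which is why no sysctl matches: the ceiling appears to come from a 20-bit reference counter packed into the descriptor mutex state in the linked file, not from the kernel. A minimal sketch of that arithmetic follows; the constant names here are illustrative, not the ones used in the Go source.

```go
package main

import "fmt"

// Layout assumed from internal/poll/fd_mutex.go at the linked commit:
// three flag bits, followed by a 20-bit counter of in-flight operations.
// The names below are hypothetical and chosen for this sketch only.
const (
	refBits  = 20               // width of the reference counter
	refShift = 3                // counter sits above the three flag bits
	refMax   = 1<<refBits - 1   // largest count the field can hold
	refMask  = refMax << refShift // bits of the state word used by the counter
)

// maxConcurrentOps returns the per-descriptor ceiling implied by the counter width.
func maxConcurrentOps() int { return refMax }

func main() {
	fmt.Println(maxConcurrentOps()) // prints 1048575
}
```

If that reading is right, the limit is per file descriptor, so a single UDP socket shared by every in-flight announce would be the one place it could be reached.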
This leads me to think there's some kind of bottleneck occurring.
You should have prometheus metrics on the number of goroutines, that'll help us see if we're leaking goroutines or the server is incapable of keeping up with the current traffic.
Yep, current location for that is here: http://184.105.151.166:6880/
I am seeing things that are interesting. One is that the number of OS threads is (apparently) static at 35. The other is that goroutines are oscillating wildly (as expected, I think) but the numbers are much higher than I remember. I remember 10-30k, and I am seeing variations from 70k to 175k goroutines. Sitting here mashing f5 on the prometheus page, I am seeing a (current) floor around 53k goroutines and a peak of 220k or so. Current chihaya (top) SIZE is ~12GB.
OK, checking again, it is down near 500-600 goroutines (which sounds like "idling" LOL). So, yes I am thinking somehow it is getting backed up.
haha, peak at over 200k goroutines :D yeah that's probably not optimal. I do wonder where they're hung -- maybe waiting for storage? Can you calculate the rate of announces/s for reference? The storage implementation (are you using the default one?) is sharded to allow concurrent access -- you could try increasing the number of shards, maybe that helps? Do we have Prometheus instrumentation for the storage? If yes, you could compare the number of storage operations/s to the number of announces/s, but I'm not sure we have it.
Where else could it be stuck... sending packets? Do you have less upload bandwidth than download?
As another solution, we could limit the number of goroutines, but that would also limit performance. If your CPU/memory/network interfaces are not limiting, I don't think we should.
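One way such a cap could look, as a minimal sketch: a buffered channel used as a counting semaphore. The limiter type and its tryGo method are hypothetical names for this example and are not part of chihaya.

```go
package main

import (
	"fmt"
	"sync"
)

// limiter caps how many handlers run at once, using a buffered channel
// as a counting semaphore. Hypothetical type, not taken from chihaya.
type limiter struct {
	slots chan struct{}
	wg    sync.WaitGroup
}

func newLimiter(max int) *limiter {
	return &limiter{slots: make(chan struct{}, max)}
}

// tryGo runs fn in a new goroutine if a slot is free and reports whether
// it did. When every slot is taken, the work is dropped instead of queued.
func (l *limiter) tryGo(fn func()) bool {
	select {
	case l.slots <- struct{}{}: // acquired a slot
	default:
		return false // at capacity: drop
	}
	l.wg.Add(1)
	go func() {
		defer l.wg.Done()
		defer func() { <-l.slots }() // free the slot when fn returns
		fn()
	}()
	return true
}

// wait blocks until every accepted handler has finished.
func (l *limiter) wait() { l.wg.Wait() }

func main() {
	l := newLimiter(2)
	release := make(chan struct{})
	accepted := 0
	for i := 0; i < 5; i++ {
		if l.tryGo(func() { <-release }) {
			accepted++
		}
	}
	close(release)
	l.wait()
	fmt.Println("accepted:", accepted) // prints "accepted: 2"
}
```

Replacing the select with a plain blocking send would stall the read loop instead of dropping, leaving unread packets to queue in the kernel socket buffer. Either way the cap trades throughput under load for a bounded goroutine count.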
Hi @mrd0ll4r,
Yes, using default storage engine with 1024 shards (which I think is the "default"). I do not store any instrumentation, but it is available if someone else wants to profile it that way.
The server has a 1 Gbit Ethernet drop in a datacenter, and I definitely get line speed.
I think a cap would result in ... backing up the announces into kernel buffer, and dumping them if that overflows? That is better than crashing, but QoS is still pretty subpar LOL.
Unsure of the announces per second; none of the normal (seed/peer/hash) numbers are out of whack when this happens. But 175k goroutines...
(I think something is tickling it.)
I'm not too familiar with Go, but I have seen other Go projects like Tyk mention needing to raise a value via ulimit in their docs. It mentioned a hard and soft limit, one of those was important to raise. As that project is an API gateway intended to take on many connections/requests, it may be similar to what you're experiencing here.
I don't know much about ulimit myself atm, but AFAIK it is different than a similar setting I'm used to setting for apps like VSCode/git that required raising default file watchers (inotify) limit much higher. ulimit is similar but for processes instead of files I think?
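For context, the soft and hard limits that ulimit -n reports correspond to RLIMIT_NOFILE, the per-process cap on open file descriptors. That is a separate limit from the per-descriptor ceiling in the Go source linked earlier in the thread. A small sketch for reading both values from inside a Go process on a Unix-like system follows; openFileLimits is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"syscall"
)

// openFileLimits returns the soft and hard RLIMIT_NOFILE values for the
// current process. Hypothetical helper for illustration; Unix-like systems only.
func openFileLimits() (soft, hard uint64, err error) {
	var lim syscall.Rlimit
	if err = syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return 0, 0, err
	}
	// The field types differ between platforms, so convert explicitly.
	return uint64(lim.Cur), uint64(lim.Max), nil
}

func main() {
	soft, hard, err := openFileLimits()
	if err != nil {
		fmt.Println("getrlimit failed:", err)
		return
	}
	fmt.Printf("open files: soft=%d hard=%d\n", soft, hard)
}
```

Raising the soft limit helps when a process runs out of descriptors, which normally surfaces as a "too many open files" error and not as the message discussed here.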