
Comments (8)

jzelinskie commented on May 25, 2024

Neat! This sounds like it's Go trying to protect programs from hitting up against OS limits. I wonder if you only need to use sysctl to set a limit higher or if it's something we have to work around.

Approximately how often is this happening?
You're running this on FreeBSD, right?

Aranjedeath commented on May 25, 2024

Hi @jzelinskie

Currently about every 4 days or so. Yes, FreeBSD 12.2.

I can't find a tunable that addresses the issue (so far).

root@explodie:~ # sysctl -a | grep 1048575
kern.maxfiles: 10485750
        <last>1048575991</last>
            <end>1048575991</end>
debug.nchash: 1048575

None of those apply.

Here is the Go source code line we hit :D
https://go.googlesource.com/go/+/68e28998d7f094e70cef7ec0bef9fabfa9e17d07/src/internal/poll/fd_mutex.go#18
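For reference, that limit is baked into Go's fdMutex: the mutex state packs a 20-bit reference count, so at most 2^20 - 1 = 1048575 operations can be in flight on one fd before the runtime panics. Paraphrasing the relevant constants from the linked file (exact line numbers differ between Go versions):

// Paraphrased from src/internal/poll/fd_mutex.go (see link above).
// fdMutex.state packs a closed bit, read/write lock bits, and a 20-bit
// reference count into a single uint64.
const (
    mutexClosed  = 1 << 0
    mutexRLock   = 1 << 1
    mutexWLock   = 1 << 2
    mutexRef     = 1 << 3
    mutexRefMask = (1<<20 - 1) << 3 // max 1048575 concurrent references per fd
)

const overflowMsg = "too many concurrent operations on a single file or socket (max 1048575)"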

jzelinskie commented on May 25, 2024

This leads me to think there's some kind of bottleneck occurring.
You should have Prometheus metrics on the number of goroutines; that'll help us see whether we're leaking goroutines or the server simply can't keep up with the current traffic.
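For anyone following along: the goroutine count is the standard go_goroutines metric from the Go runtime collector (go_threads covers OS threads). A minimal sketch of exposing it with client_golang, not chihaya's actual wiring and with a placeholder listen address, looks like:

package main

import (
    "log"
    "net/http"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/collectors"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
    // The Go collector exports go_goroutines and go_threads, the two
    // numbers discussed in this thread.
    reg := prometheus.NewRegistry()
    reg.MustRegister(collectors.NewGoCollector())

    http.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{}))
    log.Fatal(http.ListenAndServe(":6880", nil)) // placeholder address
}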

Aranjedeath commented on May 25, 2024

Yep, current location for that is here: http://184.105.151.166:6880/

I am seeing some interesting things. One is that the number of OS threads is (apparently) static at 35. The other is that the goroutine count is oscillating wildly (as expected, I think), but the numbers are much higher than I remember: I used to see 10-30k, and I am now seeing swings from 70k to 175k goroutines. Sitting here mashing F5 on the Prometheus page, I am seeing a current floor around 53k goroutines and a peak of 220k or so. Current chihaya (top) SIZE is ~12GB.

Aranjedeath commented on May 25, 2024

OK, checking again, it is down near 500-600 goroutines (which sounds like "idling" LOL). So, yes, I am thinking it is somehow getting backed up.

mrd0ll4r commented on May 25, 2024

Haha, a peak of over 200k goroutines :D Yeah, that's probably not optimal. I do wonder where they're hung -- maybe waiting for storage? Can you calculate the rate of announces/s for reference? The storage implementation (are you using the default one?) is sharded to allow concurrent access -- you could try increasing the number of shards; maybe that helps. Do we have Prometheus instrumentation for the storage? If so, you could compare the number of storage operations/s to the number of announces/s, but I'm not sure we have it.
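To illustrate the sharding idea (a hypothetical, minimal sketch, not chihaya's actual storage code): the peer map is split across N shards, each behind its own lock, so announces for different infohashes rarely contend on the same lock.

package main

import (
    "fmt"
    "hash/fnv"
    "sync"
)

// shard holds one slice of the peer map behind its own lock.
type shard struct {
    sync.RWMutex
    peers map[string]map[string]struct{} // infohash -> set of peer addresses
}

// shardedStore spreads infohashes over len(shards) independently locked maps.
type shardedStore struct {
    shards []shard
}

func newShardedStore(n int) *shardedStore {
    s := &shardedStore{shards: make([]shard, n)}
    for i := range s.shards {
        s.shards[i].peers = make(map[string]map[string]struct{})
    }
    return s
}

// shardFor hashes the infohash to pick a shard.
func (s *shardedStore) shardFor(infohash string) *shard {
    h := fnv.New32a()
    h.Write([]byte(infohash))
    return &s.shards[h.Sum32()%uint32(len(s.shards))]
}

// putPeer records a peer for an infohash, locking only that peer's shard.
func (s *shardedStore) putPeer(infohash, peer string) {
    sh := s.shardFor(infohash)
    sh.Lock()
    defer sh.Unlock()
    if sh.peers[infohash] == nil {
        sh.peers[infohash] = make(map[string]struct{})
    }
    sh.peers[infohash][peer] = struct{}{}
}

func main() {
    store := newShardedStore(1024) // e.g. 1024 shards
    store.putPeer("infohash-1", "203.0.113.7:6881")
    fmt.Println(len(store.shards), "shards")
}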

Where else could it be stuck... sending packets? Do you have less upload bandwidth than download?

As another solution, we could limit the number of goroutines, but that would also limit performance. If your CPU/memory/network interfaces are not the limiting factor, I don't think we should.
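If we ever did cap it, one common pattern (a sketch only, not something chihaya does today) is a buffered-channel semaphore that bounds the number of in-flight handlers:

package main

import (
    "fmt"
    "sync"
)

func main() {
    // A buffered channel acts as a counting semaphore: sending acquires a
    // slot, receiving releases it. With capacity maxInFlight, at most that
    // many handler goroutines run at once; further work blocks here instead
    // of spawning goroutines without bound.
    const maxInFlight = 4
    sem := make(chan struct{}, maxInFlight)

    var wg sync.WaitGroup
    for i := 0; i < 16; i++ {
        sem <- struct{}{} // blocks once maxInFlight handlers are busy
        wg.Add(1)
        go func(n int) {
            defer wg.Done()
            defer func() { <-sem }() // release the slot
            fmt.Println("handling announce", n)
        }(i)
    }
    wg.Wait()
}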

Aranjedeath commented on May 25, 2024

Hi @mrd0ll4r,

Yes, I'm using the default storage engine with 1024 shards (which I think is the "default"). I do not store any instrumentation, but it is available if someone else wants to profile it that way.

The server has a 1 Gbit Ethernet drop in a datacenter; I definitely get line speed.

I think a cap would result in ... backing up the announces into the kernel buffer, and dropping them if that overflows? That is better than crashing, but the QoS is still pretty subpar LOL.

Unsure of the announces per second; none of the normal (seed/peer/hash) numbers are out of whack when this happens. But 175k goroutines...

(I think something is tickling it.)

polarathene commented on May 25, 2024

I'm not too familiar with Go, but I have seen other Go projects like Tyk mention needing to raise a value via ulimit in their docs. They mentioned a hard and a soft limit; one of those was important to raise. As that project is an API gateway intended to take on many connections/requests, it may be similar to what you're experiencing here.

I don't know much about ulimit myself atm, but AFAIK it is different from a similar setting I'm used to raising for apps like VSCode/git, which required raising the default file watcher (inotify) limit much higher. ulimit is similar, but for processes instead of files, I think?
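For completeness, the hard/soft distinction maps to RLIMIT_NOFILE (what ulimit -n reports): a process can raise its own soft limit up to the hard limit without extra privileges. A small Unix-only sketch of checking and bumping it from Go (whether chihaya actually needs this is a separate question):

//go:build unix

package main

import (
    "fmt"
    "syscall"
)

func main() {
    var rl syscall.Rlimit
    // RLIMIT_NOFILE is the per-process open file descriptor limit that
    // `ulimit -n` reports: Cur is the soft limit, Max the hard limit.
    if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
        panic(err)
    }
    fmt.Printf("soft=%d hard=%d\n", rl.Cur, rl.Max)

    // Raise the soft limit to the hard limit; raising the hard limit
    // itself requires root (or a change in login.conf / limits.conf).
    rl.Cur = rl.Max
    if err := syscall.Setrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
        panic(err)
    }
}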
