Comments (16)

nhooyr avatar nhooyr commented on August 26, 2024

Oh no :(

Can you post a goroutine trace of the leaked goroutines?

from websocket.

marekmartins avatar marekmartins commented on August 26, 2024

Maybe you can give me some pointers on the easiest way to do that, in a form that is most useful for you?
Then I'll try to look into it more in the next few days.


nhooyr avatar nhooyr commented on August 26, 2024

If you send your program a SIGQUIT it should dump a stack of all active goroutines. That would help lots.

If you cannot do that, you can use delve to attach to a running go program and get a trace of all goroutines.
See go-delve/delve#1475

I think net/http/pprof also supports a debug endpoint that prints a trace of all goroutines.

All equally useful options, please pick whichever is most convenient for you and get back to me as soon as possible.
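
For anyone following along, here is a minimal sketch of a fourth, in-process route: the standard library's runtime/pprof package can write the same kind of all-goroutine stack dump to any writer. The helper names `goroutineDump`, `countStacksContaining` and `parked` are made up for this illustration and are not part of any library.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"strings"
)

// goroutineDump returns the stacks of all current goroutines as text.
// Debug level 2 asks for the same format as an unrecovered panic.
func goroutineDump() string {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 2); err != nil {
		return ""
	}
	return buf.String()
}

// countStacksContaining counts the goroutine stacks in dump that mention name.
// Stacks in the debug=2 format are separated by blank lines.
func countStacksContaining(dump, name string) int {
	n := 0
	for _, stack := range strings.Split(dump, "\n\n") {
		if strings.Contains(stack, name) {
			n++
		}
	}
	return n
}

// parked is a hypothetical goroutine body that blocks until stop is closed.
func parked(started chan<- struct{}, stop <-chan struct{}) {
	close(started)
	<-stop
}

func main() {
	started, stop := make(chan struct{}), make(chan struct{})
	go parked(started, stop)
	<-started // make sure the goroutine is running before dumping

	dump := goroutineDump()
	close(stop)

	fmt.Print(dump)
	fmt.Println("stacks mentioning parked:", countStacksContaining(dump, "parked"))
}
```

Counting stacks by function name is only a rough filter, but it is usually enough to see which goroutines are piling up.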


nhooyr avatar nhooyr commented on August 26, 2024

Also just to note we do ensure there are no hanging goroutines after running all tests. So whatever it is, it's something not coming up in the tests.

How do you know it's a goroutine leak actually? Could just be some memory being held somewhere indefinitely somehow.


nhooyr avatar nhooyr commented on August 26, 2024

Actually idk how that could be either. Def need a trace or further info to debug this.


marekmartins avatar marekmartins commented on August 26, 2024

Thanks. I'll try to look into this.

I know this from the Prometheus/Grafana metrics we track.
Each night the connection count drops from 50k to roughly 2k, and the goroutine count and memory related to those connections drop with it.

With the newest patch version this drop did not happen as it should have; the graph pretty much flatlined.

EDIT
I deployed it to our test environment, but the load there is so low that it will take a little time to get anything meaningful.
I will profile the application there in the next few days, take a goroutine profile using pprof, and share the file with you.


marekmartins avatar marekmartins commented on August 26, 2024

I let it run in test for a little bit and I already see a pattern: the goroutine count tends to go up. I profiled it at a moment when there were 8 connections open. We track that via Prometheus metrics and it's very precise.

I added part of the goroutine profile here. I think that for some reason the timeoutLoop goroutine does not always exit when the connection is actually closed, or it somehow gets stuck. It shows 102 there, which seems way off compared to the 8 connections reported by other metrics. Also, this number did not decrease at night.

I'll let it run longer and see whether that number goes up further. On Monday I'll have newer numbers.

image


nhooyr avatar nhooyr commented on August 26, 2024

What kind of goroutines profile is that? What do the percentages mean?

It would help tons if you could get the stacks of those timeoutLoop goroutines. I looked over the code and it isn't clear how it could be getting stuck.


marekmartins avatar marekmartins commented on August 26, 2024

I'm using the net/http/pprof package for that; I have HTTP handlers that return the different profiles.
This particular one is the /debug/pprof/goroutine profile.

https://pkg.go.dev/net/http/pprof

I took a new profile and now it's at 123 with no actual connections.
image

So it's steadily growing in the test environment.

I added the latest profile file as a zip. Unzip it and open it with this command:
go tool pprof -http=:8080 test.cmh1-3.goroutine.prof

Then you can inspect it in a browser. I can also send snapshots of the other profile types, like heap, mutex, etc.: everything the pprof package supports.

Let me know 🙏

Also, next week I will debug timeoutLoop and try to reproduce the situation in which those goroutines do not exit.

test.cmh1-3.goroutine.zip


nhooyr avatar nhooyr commented on August 26, 2024

It's confusing but there are multiple goroutine profiles. Can you go to /debug/pprof/ in a browser and then click full goroutine stack dump? That's the one I need and it's just plaintext. The link should be /debug/pprof/goroutine?debug=2.


nhooyr avatar nhooyr commented on August 26, 2024

One possible scenario in which I can see this behavior arising is if you're not calling c.Close/c.CloseNow on every connection after you're done with it. Previously, if an error occurred on the connection, such as during a read or write, the connection would be automatically closed and cleaned up for you, but this latest version removed that in favor of always requiring c.Close/c.CloseNow.

Though the docs have always read:

Be sure to call Close on the connection when you are finished with it to release associated resources.

So this behavior was never to be relied on. Though I can see how that's confusing with the next line:

On any error from any method, the connection is closed with an appropriate reason.

I'll remove this line from the docs.


nhooyr avatar nhooyr commented on August 26, 2024

The quoted docs are from: https://godocs.io/nhooyr.io/websocket#Conn


marekmartins avatar marekmartins commented on August 26, 2024

You're right. If the connection was closed from the client side and the main read loop ended because of that, we did not explicitly close it on the server side, since it worked without that and released resources on its own.
If that's the case, then sorry for the trouble 😞. I will add Close for this case as well next week and report back here on whether it resolved the issue.

Also, since I may not be the only one, and the previous behavior was unintentional but was still there, would it be wise to mention this in the release notes as well?
To me it seems like an important behavioral change to mention 😅


nhooyr avatar nhooyr commented on August 26, 2024

Awesome! Yup will add to the release notes thanks @marekmartins.


marekmartins avatar marekmartins commented on August 26, 2024

Just letting you know that adding an explicit Close seems to have solved the problem for us! Thank you :) Feel free to close this issue.


nhooyr avatar nhooyr commented on August 26, 2024

Awesome good to hear!
