
Comments (5)

cenkalti commented on May 28, 2024

The health check mechanism exists to find deadlocks in the torrent loop and fail fast.

// checkTorrent pings the torrent run loop periodically and crashes the program if a torrent does not respond in
// specified timeout. This is not a good behavior for a production program but it helps to find deadlocks easily,
// at least while developing.
func (s *Session) checkTorrent(t *torrent) {

The loop in https://github.com/cenkalti/rain/blob/master/torrent/torrent_run.go must never block. All IO operations are done in separate goroutines that send their results over channels to the select block in torrent_run.go.
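To illustrate the pattern described above, here is a minimal sketch of a run loop that only ever waits in a select while blocking work happens in other goroutines. It is not the actual code in torrent_run.go; the names readResult, slowRead, runLoop and processPieces are hypothetical and exist only for this example.

```go
package main

import (
	"fmt"
	"time"
)

// readResult is a hypothetical message carrying the outcome of a blocking operation.
type readResult struct {
	piece int
	err   error
}

// slowRead stands in for any blocking IO call (disk or network).
func slowRead(piece int) readResult {
	time.Sleep(10 * time.Millisecond)
	return readResult{piece: piece}
}

// runLoop owns all state and never performs IO itself; it only waits in select.
func runLoop(requests <-chan int, done chan<- []int) {
	results := make(chan readResult)
	var completed []int
	pending := 0
	for {
		// Stop once the request channel is closed and no work is in flight.
		if requests == nil && pending == 0 {
			done <- completed
			return
		}
		select {
		case piece, ok := <-requests:
			if !ok {
				requests = nil // a nil channel is never selected again
				continue
			}
			pending++
			// The blocking call runs in its own goroutine and reports back over a channel.
			go func() { results <- slowRead(piece) }()
		case res := <-results:
			pending--
			completed = append(completed, res.piece)
		}
	}
}

// processPieces feeds pieces to the loop and returns the pieces it completed.
func processPieces(pieces []int) []int {
	requests := make(chan int)
	done := make(chan []int)
	go runLoop(requests, done)
	for _, p := range pieces {
		requests <- p
	}
	close(requests)
	return <-done
}

func main() {
	fmt.Println(len(processPieces([]int{1, 2, 3})))
}
```

Because the loop spends its time parked in select, a goroutine dump of a healthy loop should show it waiting at the select line.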

When I look at the traceback you provided, I see this:

[air:Desktop]% cat rain-crash.txt| grep torrent_run.go
	/var/lib/jenkins/go/pkg/mod/github.com/bsergean/[email protected]/torrent/torrent_run.go:20 +0x7ca
	/var/lib/jenkins/go/pkg/mod/github.com/bsergean/[email protected]/torrent/torrent_run.go:20 +0x7ca
	/var/lib/jenkins/go/pkg/mod/github.com/bsergean/[email protected]/torrent/torrent_run.go:20 +0x7ca
	/var/lib/jenkins/go/pkg/mod/github.com/bsergean/[email protected]/torrent/torrent_run.go:86 +0xf85
	/var/lib/jenkins/go/pkg/mod/github.com/bsergean/[email protected]/torrent/torrent_run.go:20 +0x7ca
	/var/lib/jenkins/go/pkg/mod/github.com/bsergean/[email protected]/torrent/torrent_run.go:20 +0x7ca
	/var/lib/jenkins/go/pkg/mod/github.com/bsergean/[email protected]/torrent/torrent_run.go:20 +0x7ca
	/var/lib/jenkins/go/pkg/mod/github.com/bsergean/[email protected]/torrent/torrent_run.go:20 +0x7ca
	/var/lib/jenkins/go/pkg/mod/github.com/bsergean/[email protected]/torrent/torrent_run.go:20 +0x7ca

All torrents except one are waiting at torrent_run.go:20, which is the select statement. That means their torrent loops are ready to receive messages from outside and process them. This is how the health check mechanism determines that a torrent is healthy and processing messages.

However, one loop is executing torrent_run.go:86. When I look at the whole traceback of that goroutine, I see that it is in the runnable state and printing a debug log statement. It must have been stuck for 60 seconds while trying to print that log statement. Underneath, the logging library calls the runtime.Caller() function to find the filename to inject into the log statement.
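For context, this is roughly how a logger can use runtime.Caller to tag a message with its call site. It is only an illustration of the standard library call, not the code of the logging library used by rain; logWithLocation is a hypothetical helper.

```go
package main

import (
	"fmt"
	"path/filepath"
	"runtime"
)

// logWithLocation prefixes msg with the file name and line of its caller.
func logWithLocation(msg string) string {
	// skip=1 selects the caller of logWithLocation; skip=0 would be this function itself.
	_, file, line, ok := runtime.Caller(1)
	if !ok {
		return "unknown " + msg
	}
	return fmt.Sprintf("%s:%d %s", filepath.Base(file), line, msg)
}

func main() {
	fmt.Println(logWithLocation("piece downloaded"))
}
```

The lookup walks the calling goroutine's stack on every log call, which is one reason a high volume of debug logging adds overhead.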

My guess is that there is too much load on the server, so this goroutine may never have gotten the chance to run.

Another guess is that the process was suspended for longer than 60 seconds and resumed later. In that case, the health check mechanism would decide that the torrent is not healthy and crash with the stack traces.

Any thoughts?


cenkalti commented on May 28, 2024

If it is frequently the logging case, you can lower your log volume by setting the level to INFO or WARNING to reduce the chance of a crash as a workaround. I don't recommend using the DEBUG level in production, as it produces too much log output and that may slow down the execution of the program.


bsergean commented on May 28, 2024


github-actions commented on May 28, 2024

This issue is stale because it has been open for 30 days with no activity.


github-actions commented on May 28, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
