Giter VIP home page Giter VIP logo

Comments (12)

enobufs avatar enobufs commented on June 3, 2024 1

..it only delays the problem

True, if that is a blocking channel. TCP listener has a finite size of backlog to control the rate of the incoming connection. If there is such a large amount of incoming connection on an acceptor, I would consider dropping some of them. E.g. DoS attack. (imagining pions/sctp alone is used for something else, etc). The unlock/lock solution might be fragile when I know a single DATA chunk with different streanIndetifier could easily block the goroutine calling handleData() with the current implementation, I think (matching TSN, SN might be hard however). So, I am leaning toward:

  • use buffered channel for acceptCh (find the reasonable/sufficient size, or make it configurable)
  • don't block on write (use select-default) on acceptCh
  • If write failed (the channel is full), silently discard the DATA chunk (likely the channel-open) and do not even register the stream with a.streams map.

Thoughts?

from sctp.

backkem avatar backkem commented on June 3, 2024 1

Ok, that could work. Normally, I don't like select-default because it will trigger the default case if nothing is listening on the other channel at the exact same time. However, since we're using a buffered channel that's not really a problem.

from sctp.

hugoArregui avatar hugoArregui commented on June 3, 2024 1

For what is worth, I can confirm that buffering the acceptCh channel fixed the problem for me.

from sctp.

Sean-Der avatar Sean-Der commented on June 3, 2024

I think we need to handle closing here

I will write a test, but if we get a new stream we will deadlock when shutting down.

from sctp.

enobufs avatar enobufs commented on June 3, 2024

Might be related: pion/webrtc#578
I will give a close look tonight.

from sctp.

enobufs avatar enobufs commented on June 3, 2024

I think I found the cause! (@Sean-Der I ended up with the line you already pointed out :) Unlocking before writing to acceptCh is necessary!)

	if accept {
		a.lock.Unlock()
		a.acceptCh <- s
		a.lock.Lock()
	}

from sctp.

enobufs avatar enobufs commented on June 3, 2024

I closed the PR (#34) because the dead-lock still occurred with the modification. I looked into it further, and seeing offerrer side is calling OpenStream() twice, first is on CreateDataChannel and the second is after AcceptStream() returns.

It's hard to read with a bunch of log messages, but you can see two stack traces below printed right before datachannel.Dial() call from same "offerer".

DBG(0xc0005f00e0) datachannel.Dial() !!!
DBG(0xc0004e60e0): AcceptStream() - in
accept-loop: pt1
goroutine 16 [running]:
runtime/debug.Stack(0x29, 0x0, 0x0)
	/Users/ytakeda/go/current/src/runtime/debug/stack.go:24 +0x9d
runtime/debug.PrintStack()
	/Users/ytakeda/go/current/src/runtime/debug/stack.go:16 +0x22
github.com/pions/webrtc.(*DataChannel).open(0xc0004d00c0, 0xc0004a48a0, 0xc000241e38, 0xc0005a88a0)
	/Users/ytakeda/Projects/pions/webrtc/datachannel.go:140 +0x185
github.com/pions/webrtc.(*PeerConnection).openDataChannels(0xc000001b00)
	/Users/ytakeda/Projects/pions/webrtc/peerconnection.go:928 +0xbd
github.com/pions/webrtc.(*PeerConnection).SetRemoteDescription.func2(0xc0004a4801, 0xc000001b00, 0xc00051eaca, 0x10, 0xc0005320c8, 0x20, 0xc00023e88c, 0x7, 0xc00023e894, 0x5f)
	/Users/ytakeda/Projects/pions/webrtc/peerconnection.go:919 +0x51b
created by github.com/pions/webrtc.(*PeerConnection).SetRemoteDescription
	/Users/ytakeda/Projects/pions/webrtc/peerconnection.go:852 +0xd86
DBG(0xc0005f00e0): OpenStream() si=0
DBG(0xc0005f00e0): sendPayloadData() - in
DBG(0xc0005f00e0): AcceptStream() - in
DBG(0xc0005f00e0): sendPayloadData() - out
DBG(0xc0005f00e0) datachannel.Dial() !!!
DBG(0xc0004e60e0): NEW STREAM(0) TO ACCEPT - in
DBG(0xc0004e60e0): NEW STREAM(0) TO ACCEPT - out
goroutine 16 [running]:
runtime/debug.Stack(0x29, 0x0, 0x0)
	/Users/ytakeda/go/current/src/runtime/debug/stack.go:24 +0x9d
runtime/debug.PrintStack()
	/Users/ytakeda/go/current/src/runtime/debug/stack.go:16 +0x22
github.com/pions/webrtc.(*DataChannel).open(0xc0004d0180, 0xc0004a48a0, 0x0, 0x0)
	/Users/ytakeda/Projects/pions/webrtc/datachannel.go:140 +0x185
github.com/pions/webrtc.(*PeerConnection).openDataChannels(0xc000001b00)
	/Users/ytakeda/Projects/pions/webrtc/peerconnection.go:928 +0xbd
github.com/pions/webrtc.(*PeerConnection).SetRemoteDescription.func2(0xc0004a4801, 0xc000001b00, 0xc00051eaca, 0x10, 0xc0005320c8, 0x20, 0xc00023e88c, 0x7, 0xc00023e894, 0x5f)
	/Users/ytakeda/Projects/pions/webrtc/peerconnection.go:919 +0x51b
created by github.com/pions/webrtc.(*PeerConnection).SetRemoteDescription
	/Users/ytakDBG(0xc0004e60e0): AcceptStream() - out, res=true
eda/Projects/pions/webrtc/peerconnection.go:852 +0xd86
accept-loop: pt1.1
accept-loop: pt1.2
accept-loop: pt1.2.1
accept-loop: pt1.2.2
accept-loop: pt1.2.3
accept-loop: pt1.2.4
accept-loop: pt1.2.5
DBG(0xc0004e60e0): sendPayloadData() - in
DBG(0xc0005f00e0): sendPayloadData() - in
DBG(0xc0005f00e0): OpenStream() si=2
DBG(0xc0004e60e0): sendPayloadData() - out
DBG(0xc0005f00e0): sendPayloadData() - in
accept-loop: pt1.2.6
accept-loop: pt1.4
accept-loop: pt2
accept-loop: pt3
accept-loop: pt4
accept-loop: pt5
accept-loop: pt1
DBG(0xc0004e60e0): AcceptStream() - in
DBG(0xc0005f00e0): sendPayloadData() - out
DBG(0xc0005f00e0): sendPayloadData() - out
DBG(0xc0004e60e0): sendPayloadData() - in
DBG(0xc0004e60e0): NEW STREAM(2) TO ACCEPT - in
DBG(0xc0004e60e0): NEW STREAM(2) TO ACCEPT - out
DBG(0xc0004e60e0): AcceptStream() - out, res=true

Above log was the successful case. When dead lock heppens, I can see the the sctp's goroutine (the one calling handleData()) is block on accessCh (to write), while the data channel's goroutine (the one calls AcceptStream and creates a new Server) is blocked on writeDataChannelAck(), which calls sctp.WriteSCTP().

I will look into this more.

from sctp.

backkem avatar backkem commented on June 3, 2024

I still feel this holds based on what you said (from PR):

  • There is a new stream incoming.
  • Nothing is listening on acceptCh since the caller of AcceptStream is still processing the previous incoming stream (it tries to WriteSCTP) .
  • Processing of the previous incoming stream is blocked because createStream holds the SCTP lock while sending on acceptCh.

Regarding the current solution: This does have the disadvantage that it breaks up the atomicity of handleChunk. Maybe this can lead to unexpected behavior?

Possible alternatives:

  • Send on acceptCh asynchronously. This does mean the order of opening channels is lost. That doesn't seem like a big loss thought.
  • Buffer the incoming streams and notify AcceptStream asynchronously, similar to readNotifier.

from sctp.

enobufs avatar enobufs commented on June 3, 2024

@backkem I said earlier, someone was listening on AcceptStream() when dead lock occurred, but above log is the case I saw earlier I think. None is waiting on acceptCh in the above case. There seem to be more than one pattern.

from sctp.

enobufs avatar enobufs commented on June 3, 2024

I think we have to find out why OpenStream() is called twice. (I am too tired now. will do more tomorrow)

from sctp.

enobufs avatar enobufs commented on June 3, 2024

Thanks for your notes by the way. Yes, your observation is correct I think. We could use buffered channel also - just like TCP's listening socket's "backlog"...

from sctp.

backkem avatar backkem commented on June 3, 2024

The only problem I have with a buffered channel is that it only delays the problem. If multiple new streams build up there becomes a point where the exact same deadlock problem could occur. With my first suggestion you basically create an 'unlimited' buffer (due to spawning new goroutines). In the second case we can explicitly handle buffer overflow in the way we want.
Looking into duplicate OpenStream calls could be interesting as well. Potentially the stream is being closed right before a lagging packet arrives that tries to re-create the stream again.

from sctp.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.