pion / sctp

A Go implementation of SCTP
Home Page: https://pion.ly/
License: MIT License
Luke pointed out on slack:
unrelated, but if anybody wants an ez SCTP performance optimization: https://github.com/pion/sctp/blob/master/packet.go#L147
should be creating the crc32 Hash and calling `Write` over ranges, instead of allocating/copying every single packet we send/receive (edited)
Seems to be an easy fix/improvement.
Used the pion/webrtc library for the proxy side of our Snowflake anti-censorship system, which reads from a WebRTC connection to the client. I used keroserene/go-webrtc as the library on the client side and found data being dropped despite using a reliable channel.
I expected to receive all of the data that I sent
Large chunks of data were missing.
It turns out Stream is not returning io.ErrShortBuffer
errors from reassemblyQueue.read. Instead the error is being overwritten and the data is lost. Stream.ReadSCTP should instead return the error.
sctp DEBUG: 18:33:56.319559 association.go:2070: [0xc0000b6000] T3-rtx timed out: nRtos=58 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:34:56.320670 association.go:2070: [0xc0000b6000] T3-rtx timed out: nRtos=59 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:35:56.318587 association.go:2070: [0xc0000b6000] T3-rtx timed out: nRtos=60 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:36:56.321205 association.go:2070: [0xc0000b6000] T3-rtx timed out: nRtos=61 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:37:56.320873 association.go:2070: [0xc0000b6000] T3-rtx timed out: nRtos=62 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:38:56.318899 association.go:2070: [0xc0000b6000] T3-rtx timed out: nRtos=63 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:38:56.318946 association.go:2070: [0xc0000b6000] T3-rtx timed out: nRtos=64 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:38:56.319016 association.go:2070: [0xc0000b6000] T3-rtx timed out: nRtos=65 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:38:56.319042 association.go:2070: [0xc0000b6000] T3-rtx timed out: nRtos=66 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:38:56.319121 association.go:2070: [0xc0000b6000] T3-rtx timed out: nRtos=67 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:38:56.319203 association.go:2070: [0xc0000b6000] T3-rtx timed out: nRtos=68 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:38:56.319282 association.go:2070: [0xc0000b6000] T3-rtx timed out: nRtos=69 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:38:56.319356 association.go:2070: [0xc0000b6000] T3-rtx timed out: nRtos=70 cwnd=1228 ssthresh=4912
Use a data channel, then close a receiver abruptly (ctrl-c, etc) and leave the sender running for about an hour.
The sender should keep retransmitting on T3 timeout at the maxRTO = 60 sec. Or, ICE should detect the disconnection and the sender should also disconnect. (The former is the bug to address; the latter is a bug in my app.)
After the 63rd timeout, the retransmission interval becomes 0, causing 100% CPU usage.
I found the cause. The T3 timer interval doubles on every timeout until it hits the max RTO, which is 60 sec. The current code correctly caps the interval at 60 sec, but internally it keeps doubling the value using the shift-left operator, so the result wraps to 0 once it exceeds the 64-bit width, leaving the interval at 0 from the 64th timeout onward.
See mail thread.
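The wrap-around can be demonstrated in a few lines (buggyInterval and fixedInterval are illustrative names, not the actual pion/sctp code): the buggy variant keeps shifting and silently wraps to zero, while the fixed variant stops doubling once the cap is reached.

```go
package main

import "fmt"

const maxRTO = 60000.0 // ms

// buggyInterval doubles the base interval with a left shift on every
// timeout; once the shift count reaches the 64-bit width the value
// wraps to 0, so the "capped" result becomes 0 and the timer fires in
// a tight loop (100% CPU).
func buggyInterval(baseMs uint64, nRtos uint) float64 {
	interval := float64(baseMs * (1 << nRtos)) // overflows silently
	if interval > maxRTO {
		return maxRTO
	}
	return interval
}

// fixedInterval stops doubling as soon as the cap is reached, so the
// value can never wrap around.
func fixedInterval(baseMs float64, nRtos uint) float64 {
	interval := baseMs
	for i := uint(0); i < nRtos; i++ {
		interval *= 2
		if interval >= maxRTO {
			return maxRTO
		}
	}
	return interval
}

func main() {
	fmt.Println(buggyInterval(1000, 64)) // 0: the shift wrapped to zero
	fmt.Println(fixedInterval(1000, 64)) // 60000: capped safely
}
```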
handleChunk iterates over each chunk and calls c.check; this function returns whether we should abort or not.
Right now we ignore the return value and just print. In the future we need to handle it properly and send an ABORT instead of just logging.
Haven't really dug into this yet, but it looks like we hold a lock while waiting for accept (so we can deadlock the association this way).
Full build https://travis-ci.org/pions/webrtc/builds/513830206?utm_source=github_status&utm_medium=notification
0x46a6ee sync.runtime_notifyListWait+0xce /home/travis/go/src/runtime/sema.go:510
0x48db5d sync.(*Cond).Wait+0x8d /home/travis/go/src/sync/cond.go:56
0x74bac1 github.com/pions/sctp.(*Stream).ReadSCTP+0x181 /home/travis/gopath/pkg/mod/github.com/pions/[email protected]/stream.go:109
0x751a1a github.com/pions/datachannel.(*DataChannel).ReadDataChannel+0x9a /home/travis/gopath/pkg/mod/github.com/pions/[email protected]/datachannel.go:132
0x9eb069 github.com/pions/webrtc.(*DataChannel).readLoop+0xe9 /home/travis/build/pions/webrtc/datachannel.go:262
0x729e32 github.com/pions/sctp.(*Association).createStream+0x1272 /home/travis/gopath/pkg/mod/github.com/pions/[email protected]/association.go:598
0x729a28 github.com/pions/sctp.(*Association).getOrCreateStream+0xe68 /home/travis/gopath/pkg/mod/github.com/pions/[email protected]/association.go:610
0x72995d github.com/pions/sctp.(*Association).handleData+0xd9d /home/travis/gopath/pkg/mod/github.com/pions/[email protected]/association.go:522
0x730aa4 github.com/pions/sctp.(*Association).handleChunk+0x1df4 /home/travis/gopath/pkg/mod/github.com/pions/[email protected]/association.go:1147
0x7271a5 github.com/pions/sctp.(*Association).handleInbound+0x245 /home/travis/gopath/pkg/mod/github.com/pions/[email protected]/association.go:347
0x726ce7 github.com/pions/sctp.(*Association).readLoop+0x117 /home/travis/gopath/pkg/mod/github.com/pions/[email protected]/association.go:317
0x6de226 github.com/pions/dtls.(*Conn).Read+0x86 /home/travis/gopath/pkg/mod/github.com/pions/[email protected]/conn.go:178
0x726c9b github.com/pions/sctp.(*Association).readLoop+0xcb /home/travis/gopath/pkg/mod/github.com/pions/[email protected]/association.go:312
0x75122e github.com/pions/sctp.(*Association).AcceptStream+0x6e /home/travis/gopath/pkg/mod/github.com/pions/[email protected]/association.go:579
0x7511ef github.com/pions/datachannel.Accept+0x2f /home/travis/gopath/pkg/mod/github.com/pions/[email protected]/datachannel.go:73
0xa12a1c github.com/pions/webrtc.(*SCTPTransport).acceptDataChannels+0x10c /home/travis/build/pions/webrtc/sctptransport.go:133
0x4694bc sync.runtime_SemacquireMutex+0x3c /home/travis/go/src/runtime/sema.go:71
0x48f468 sync.(*Mutex).Lock+0x148 /home/travis/go/src/sync/mutex.go:134
0x490699 sync.(*RWMutex).Lock+0x49 /home/travis/go/src/sync/rwmutex.go:93
0x72df82 github.com/pions/sctp.(*Association).sendPayloadData+0x52 /home/travis/gopath/pkg/mod/github.com/pions/[email protected]/association.go:910
0x74c045 github.com/pions/sctp.(*Stream).WriteSCTP+0x195 /home/travis/gopath/pkg/mod/github.com/pions/[email protected]/stream.go:168
0x7523eb github.com/pions/datachannel.(*DataChannel).writeDataChannelAck+0xab /home/travis/gopath/pkg/mod/github.com/pions/[email protected]/datachannel.go:227
0x751619 github.com/pions/datachannel.Server+0x2c9 /home/travis/gopath/pkg/mod/github.com/pions/[email protected]/datachannel.go:115
0x75126f github.com/pions/datachannel.Accept+0xaf /home/travis/gopath/pkg/mod/github.com/pions/[email protected]/datachannel.go:80
0xa12a1c github.com/pions/webrtc.(*SCTPTransport).acceptDataChannels+0x10c /home/travis/build/pions/webrtc/sctptransport.go:133
0xa23c7b github.com/pions/webrtc.TestPeerConnection_Close+0x2db /home/travis/build/pions/webrtc/peerconnection_close_test.go:46
0x545143 testing.tRunner+0x163 /home/travis/go/src/testing/testing.go:865
I found it is difficult to use pion/sctp alone because sctp.Association.Client and sctp.Association.Server take a "connected" UDP. These are equivalent to a child (server) socket and a client socket of TCP. What's missing in pion/sctp is a "listening socket".
In the context of WebRTC, we use client-client simultaneous open, which is not practical in non-WebRTC cases (no ICE, etc.). Client-server is possible, but it would be difficult to establish a connection if there is a NAT in between.
Introduce the ability to use pion/sctp standalone in a TCP style.
I have written a tool that enables a TCP-like listening socket capability. However, it cannot take advantage of the "cookie echo" mechanism SCTP offers (resembling TCP SYN cookies), which is a measure against DoS attacks (like spoofed SYN floods in TCP), because the internal chunk parsers are not exposed via the API.
We can use go-rudp as a reference.
It appears that SCTP reads incoming data slowly under high bandwidth usage, causing a buffer (packetio) at the ICE layer to fill up with incoming data (adding a lot of latency before it reaches the SCTP layer) - pion/ice#12.
(Excerpt from the discussion in the PR #39)
syscall.pdf
Looking at the trace output (go tool trace), it appears that what blocks the association's readLoop is not the mutex of the stream layer, but the syscall "sendto". (The underlying (UDP) socket is a blocking socket.)
To solve the "slow-reader" problem, we'd probably need a drastic change in SCTP, such as no longer sending data (a reply) in handleInbound() and using another goroutine for the immediate replies (control packets), which raises another issue - how much can the SCTP layer buffer those immediate replies?
Related issue: pion/ice#12 and #32
Proposed change: introduce a writeLoop in the association layer. handleInbound() should hand immediate replies (control packets) off to the writeLoop and return immediately; the writeLoop sends them and exits the routine as soon as possible (similar to delayed-ack, but immediate).
TODO: evaluate if the above proposal is viable first.
This issue is a split from #11.
Data channel connectivity issues have become very critical; I'd like to roll out T1-init/cookie timers earlier than the T3-rtx/congestion control features.
Retransmission of RECONFIG chunk has not been implemented yet.
This is crucial when the application expects to receive EOF at the end, over a lossy connection.
A workaround to this is to implement end-of-file signaling at the application level.
Relates to pion/webrtc#652
Copy the configuration from pions/webrtc.
Copy the style from pions/webrtc including readme, license, ... .
This may not be a problem in sctp; maybe there is a bug in the example or I'm missing something. But since I have a reproducible case, I think it's worth documenting.
The code is in here: https://github.com/hugoArregui/testsctp/tree/topic-ordered-problem
I see this on the receiver end:
2019/09/26 12:43:31 Received Mbps: 19.459, totalBytesReceived: 2551268
sctp DEBUG: 12:43:31.348204 association.go:967: [0xc0000a5380] receive buffer full. dropping DATA with tsn=3502769449
2019/09/26 12:43:32 Received Mbps: 9.736, totalBytesReceived: 2552496
2019/09/26 12:43:33 Received Mbps: 6.491, totalBytesReceived: 2552496
sctp DEBUG: 12:43:33.348266 association.go:967: [0xc0000a5380] receive buffer full. dropping DATA with tsn=3502769449
2019/09/26 12:43:34 Received Mbps: 4.870, totalBytesReceived: 2553724
2019/09/26 12:43:35 Received Mbps: 3.896, totalBytesReceived: 2553724
2019/09/26 12:43:36 Received Mbps: 3.247, totalBytesReceived: 2553724
2019/09/26 12:43:37 Received Mbps: 2.783, totalBytesReceived: 2553724
sctp DEBUG: 12:43:37.348694 association.go:967: [0xc0000a5380] receive buffer full. dropping DATA with tsn=3502769449
2019/09/26 12:43:38 Received Mbps: 2.437, totalBytesReceived: 2554952
2019/09/26 12:43:39 Received Mbps: 2.166, totalBytesReceived: 2554952
And this on the sender end:
2019/09/26 12:43:31 Sent Mbps: 19.461, totalBytesSent: 2551268, bufferedAmout: 557056
sctp DEBUG: 12:43:31.347734 association.go:2011: [0xc00009f380] T3-rtx timed out: nRtos=1 cwnd=1228 ssthresh=41486
sctp DEBUG: 12:43:31.347962 association.go:518: [0xc00009f380] retransmitting 1228 bytes
2019/09/26 12:43:32 Sent Mbps: 9.736, totalBytesSent: 2552496, bufferedAmout: 557056
2019/09/26 12:43:33 Sent Mbps: 6.491, totalBytesSent: 2552496, bufferedAmout: 557056
sctp DEBUG: 12:43:33.348043 association.go:2011: [0xc00009f380] T3-rtx timed out: nRtos=2 cwnd=1228 ssthresh=4912
sctp DEBUG: 12:43:33.348124 association.go:518: [0xc00009f380] retransmitting 1228 bytes
2019/09/26 12:43:34 Sent Mbps: 4.871, totalBytesSent: 2553724, bufferedAmout: 557056
2019/09/26 12:43:35 Sent Mbps: 3.897, totalBytesSent: 2553724, bufferedAmout: 557056
2019/09/26 12:43:36 Sent Mbps: 3.247, totalBytesSent: 2553724, bufferedAmout: 557056
2019/09/26 12:43:37 Sent Mbps: 2.783, totalBytesSent: 2553724, bufferedAmout: 557056
sctp DEBUG: 12:43:37.348328 association.go:2011: [0xc00009f380] T3-rtx timed out: nRtos=3 cwnd=1228 ssthresh=4912
sctp DEBUG: 12:43:37.348477 association.go:518: [0xc00009f380] retransmitting 1228 bytes
I think for some reason the reassembly queue cannot be read (a missing packet, or a bug somewhere), which causes the buffer to be always full, which in turn causes the sender to halt.
https://tools.ietf.org/html/rfc6525
Required by pion/datachannel#4.
Am I reading RFC 4960 wrong?
diff --git a/association.go b/association.go
index a65188d..b77daff 100644
--- a/association.go
+++ b/association.go
@@ -729,7 +729,8 @@ func (a *Association) handleChunk(p *packet, c chunk) ([]*packet, error) {
a.setState(cookieEchoed)
return pack(r), nil
default:
- return nil, errors.Errorf("TODO Handle Init acks when in state %s", a.state.String())
+ // RFC 4960 Section 5.2.3
+ return nil, nil
}
case *chunkAbort:
@@ -778,7 +779,8 @@ func (a *Association) handleChunk(p *packet, c chunk) ([]*packet, error) {
close(a.handshakeCompletedCh)
return nil, nil
default:
- return nil, errors.Errorf("TODO Handle Init acks when in state %s", a.state.String())
+ // RFC 4960 Section 5.2.5
+ return nil, nil
}
// TODO Abort
Currently if we receive an ABORT
chunk we print it and continue.
We need to properly tear down the association and notify the user.
A debug log is broken:
sctp DEBUG: 15:18:18.348686 association.go:434: [0xc00053a340] readLoop exited EOF %!s(MISSING)
WebRTC's datachannel.Write() does not block, as we follow the JavaScript WebRTC API. When sending a large amount of data, it is the application's responsibility to check the buffered amount (in the SCTP layer, for sending).
This is pretty standard in JavaScript land, but it really does not align with typical Go semantics (i.e. net.Conn). Also, implementing flow control at the user level is tedious, not trivial to get right, and error-prone.
I believe we should keep the current behavior with the pion/webrtc API (maintaining JavaScript API semantics as a policy), but we could make some exceptions in the following cases:
In these cases, a blocking Write() method can be the default behavior, and blocking can be turned off when used with non-detached pion/webrtc, etc.
What if we add a default implementation to SCTP itself? E.g. default threshold and if you pass it stream.Write starts blocking.
If you overwrite, you're on your own.
...
Yea so, if you use SCTP or DataChannel directly -> Default blocking implementation (blocking by default seems rather idiomatic).
If you use WebRTC -> We overwrite the OnBufferedAmountLow, which disables the default implementation and otherwise conforms to the WebRTC spec.
that default implementation will suit 99% of people's needs
"block if exceed buffer size"
I totally agree with the above comments, and I think SCTP layer should take care of this.
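A minimal sketch of what such a default blocking Write could look like (blockingWriter, its watermark, and onBytesAcked are hypothetical names, not the pion API): Write blocks while the buffered amount exceeds a threshold, and acknowledged bytes wake blocked writers.

```go
package main

import (
	"fmt"
	"sync"
)

// blockingWriter blocks Write while the send buffer is above a
// watermark; a (simulated) SACK handler releases bytes and wakes
// waiters via a condition variable.
type blockingWriter struct {
	mu             sync.Mutex
	cond           *sync.Cond
	bufferedAmount int
	maxBuffered    int // Write blocks above this watermark
}

func newBlockingWriter(maxBuffered int) *blockingWriter {
	w := &blockingWriter{maxBuffered: maxBuffered}
	w.cond = sync.NewCond(&w.mu)
	return w
}

// Write admits the payload into the send buffer, blocking while the
// buffer exceeds the watermark.
func (w *blockingWriter) Write(p []byte) (int, error) {
	w.mu.Lock()
	defer w.mu.Unlock()
	for w.bufferedAmount > w.maxBuffered {
		w.cond.Wait()
	}
	w.bufferedAmount += len(p)
	return len(p), nil
}

// onBytesAcked simulates SACK processing freeing buffered bytes and
// waking any blocked writers.
func (w *blockingWriter) onBytesAcked(n int) {
	w.mu.Lock()
	w.bufferedAmount -= n
	w.mu.Unlock()
	w.cond.Broadcast()
}

func main() {
	w := newBlockingWriter(4)
	done := make(chan struct{})
	go func() {
		w.Write([]byte("aaaa")) // buffered: 4
		w.Write([]byte("bbbb")) // buffered: 8 (still admitted)
		w.Write([]byte("cccc")) // blocks until acks drain the buffer
		close(done)
	}()
	// Simulate SACKs arriving and freeing the buffer.
	for i := 0; i < 3; i++ {
		w.onBytesAcked(4)
	}
	<-done
	fmt.Println("all writes completed")
}
```

An application that never reads acks simply sees backpressure in Write, which matches net.Conn expectations.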
Sending many 32GB messages over data channel with unordered and max retransmits set to 0. Forcibly dropping packets at 4% ratio.
Pion to continue receiving messages despite some message loss.
Pion stop receiving, sending SACK with a_rwnd=0.
During the work on #104, I learned that when a message is sent with the U (unordered) bit set to 1 and later abandoned, a Forward TSN will be sent. Current Pion expects that the Forward TSN chunk will include the stream ID, but @tuexen pointed out that the stream ID wouldn't be included. I have confirmed that both Chrome and Firefox do not include the stream ID and SSN in the Forward TSN chunk.
Current pion/sctp relies on the stream ID to purge incomplete and abandoned fragments. Consequently, the fragments are left in the reassemblyQueue forever, causing receive buffer exhaustion.
I have repro'd the situation with both Chrome and Firefox. I will fix this ASAP.
Build failed.
# github.com/pion/sctp [github.com/pion/sctp.test]
./chunk_test.go:38:107: constant 3899461680 overflows int
@hugoArregui found this.
When bufferedAmountLowThreshold is set to 0, the OnBufferedAmountLow callback will never be made.
In that case, if the bufferedAmount is > 0 and then reaches 0, the callback should be made; but with the current code, it wouldn't be. See https://github.com/pion/sctp/blob/v1.6.9/stream.go#L286
The < should be <=. Also, we will need to make sure nBytesReleased is a positive value.
W3C WebRTC API says:
When the bufferedAmount decreases from above this threshold to equal or below it, the bufferedamountlow event fires. The bufferedAmountLowThreshold is initially zero on each new RTCDataChannel, but the application may change its value at any time.
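The corrected crossing check can be captured in a tiny predicate (shouldFireBufferedAmountLow is an illustrative helper, not the actual stream.go code): fire when the buffered amount crosses from above the threshold to equal or below it, so a threshold of 0 still fires on reaching 0.

```go
package main

import "fmt"

// shouldFireBufferedAmountLow implements the W3C wording: the event
// fires when bufferedAmount decreases from above the threshold to
// equal or below it. With a strict '<' a threshold of 0 never fires.
func shouldFireBufferedAmountLow(before, after, threshold uint64) bool {
	return before > threshold && after <= threshold
}

func main() {
	// With threshold 0, the buggy strict '<' comparison never fires;
	// '<=' fires exactly when the buffer drains to zero.
	fmt.Println(shouldFireBufferedAmountLow(512, 0, 0))  // true
	fmt.Println(shouldFireBufferedAmountLow(512, 10, 0)) // false
}
```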
I'm hitting the race below trying to upgrade backkem/go-libp2p-webrtc-direct to the latest pions/webrtc. The problem seems to be that the chunkPayloadData buffers are passed around to multiple goroutines and accessed concurrently. I haven't quite figured out the details or a solution yet, though.
==================
WARNING: DATA RACE
Write at 0x00c0005822c0 by goroutine 47:
encoding/binary.PutUvarint()
encoding/binary/varint.go:48 +0xb5
github.com/libp2p/go-mplex.(*Multiplex).sendMsg()
github.com/libp2p/go-mplex/multiplex.go:155 +0x2bf
github.com/libp2p/go-mplex.(*Stream).Close()
github.com/libp2p/go-mplex/stream.go:180 +0x11c
github.com/libp2p/go-libp2p-transport/test.SubtestPingPong.func1.1()
github.com/libp2p/go-libp2p-transport/test/transport.go:193 +0x310
Previous read at 0x00c0005822c0 by goroutine 146:
runtime.slicecopy()
runtime/slice.go:221 +0x0
github.com/pions/sctp.(*chunkPayloadData).marshal()
github.com/pions/sctp/chunk_payload_data.go:134 +0x4dc
github.com/pions/sctp.(*packet).marshal()
github.com/pions/sctp/packet.go:129 +0x3f4
github.com/pions/sctp.(*Association).send()
github.com/pions/sctp/association.go:930 +0x59
github.com/pions/sctp.(*Association).handleInbound()
github.com/pions/sctp/association.go:261 +0x160
github.com/pions/sctp.(*Association).readLoop()
github.com/pions/sctp/association.go:226 +0x135
Goroutine 47 (running) created at:
github.com/libp2p/go-libp2p-transport/test.SubtestPingPong.func1()
github.com/libp2p/go-libp2p-transport/test/transport.go:168 +0x20f
Goroutine 146 (running) created at:
github.com/pions/sctp.Client()
github.com/pions/sctp/association.go:141 +0x86
github.com/pions/webrtc.(*SCTPTransport).Start()
github.com/pions/webrtc/sctptransport.go:88 +0x150
github.com/pions/webrtc.(*PeerConnection).SetRemoteDescription.func2()
github.com/pions/webrtc/peerconnection.go:843 +0x557
==================
I'm using pions/webrtc data channels in a project to do direct file sharing.
The file transfer is stable, but the speed drops heavily over time.
I'm using a datachannel initialized like this:
ordered := true
maxPacketLifeTime := uint16(10000)
dataChannel, err := s.peerConnection.CreateDataChannel("data", &webrtc.DataChannelInit{
Ordered: &ordered,
MaxPacketLifeTime: &maxPacketLifeTime,
})
Transmission speed shouldn't decrease that much over time.
The file transfer is stable, but the speed decreases over time, making it almost unusable for long file transfers.
After cpu/mem profiling the issue, I'm pretty confident the issue is that abandoned packets are never removed from the orderedPackets (payloadQueue, payload_queue.go). My program spends ~80% of its time in the Association.sendPayloadData method (association.go), and more specifically ~60% of its time sorting the orderedPackets array.
I'm not sure if this behavior is due to a bug or due to an RFC feature not being implemented yet.
Here is the profiling data of a 50 MB file transfer:
cpu-profiling.pdf
mem-profiling.pdf
I was writing test with sctp stream configured for unordered delivery.
Sending messages: although the order could differ, each message should be identical to the original message.
Instead, messages are corrupted whenever they are larger than the max segment size, 1200.
I know the cause.
The design assumption was wrong. I introduced pendingQueue, which has ordered and unordered queues, assuming that chunks in the unordered queue can be completely unordered. But 'unordered' should only apply to user messages, not at the chunk level.
When a large message is fragmented into multiple chunks with the current code, the message can be corrupted. (This has been repro'ed in my local test. PR is incoming.)
Create a simple example by sending ping/pong messages with sequence numbers. The example should be using UDP for simplicity.
SCTP (data channel) performance is perceived as very low, particularly over a real network with latency and limited bandwidth. No one appears to have properly measured performance. We should identify the underlying problems causing the slowness with correct measurement, then tackle them to improve it.
Related to #51, other errors (than io.ErrShortBuffer) could be returned in the future. Also, the handling of error returned by ReadSCTP is not tested. For better detection of regression, we should add some tests for ReadSCTP.
Added two PeerConnections and established Opus streams between them
No logs.
Pion printf logged "RTX Failure: T1-init". The audio stream works fine.
Should this happen normally or is it a bug?
It would be great to have a tool to measure general performance. Developing something similar to iperf would enable evaluating the performance in countless scenarios (loopback over memory only, loopback over IP, specific delay, packet loss, packet duplication, packet reordering, ...).
Compare performance against other stacks and determine spots for optimisation.
It may be a good idea to split the tool up for usage over raw sockets and for loopback usage over memory only. The former allows for interop and network performance testing while the latter can be used to test CPU usage.
The test protocol should be as simple as possible and specifics are up for discussion.
Currently Dial/Server will block until the association is established.
If the association never finishes and Close is called, everything will deadlock.
I just got a panic.
Failed to accept data channel: The association is closed
panic: close of closed channel
goroutine 148 [running]:
github.com/pions/sctp.(*Association).handleChunk(0xc0000a68c0, 0xc000173f40, 0x93ee40, 0xc00012afc0, 0x0, 0x0, 0x0, 0x0, 0x0)
/home/jch/go/src/github.com/pions/sctp/association.go:769 +0x427
github.com/pions/sctp.(*Association).handleInbound(0xc0000a68c0, 0xc000396000, 0x30, 0x2000, 0x30, 0x0)
/home/jch/go/src/github.com/pions/sctp/association.go:239 +0x1c4
github.com/pions/sctp.(*Association).readLoop(0xc0000a68c0)
/home/jch/go/src/github.com/pions/sctp/association.go:210 +0xf6
created by github.com/pions/sctp.Client
/home/jch/go/src/github.com/pions/sctp/association.go:135 +0x5b
Running SCTP v1.6.0 as part of the libp2p / go-libp2p-webrtc-direct CI (logs) still seems to cause some problems that are not limited to ICE:
SubtestPingPong is failing as well (disabled in the CI logs linked above). These tests expect the Read & Write methods to operate with io.ReadWriter-like semantics, e.g. return io.EOF, etc. My feeling is that these semantics may have changed in v1.6.0. This needs more digging, though.

Created a minimal client / server example as shown here: https://gist.github.com/richp10/b5afc98353e548533385af55f587da63
Expected handshake to complete and be able to communicate across the stream.
CLIENT:
sctp DEBUG: 08:36:56.509457 association.go:712: [0xc00008c1a0] state change: 'Closed' => 'CookieWait'
sctp DEBUG: 08:36:56.512453 association.go:305: [0xc00008c1a0] sending INIT
sctp DEBUG: 08:36:56.509457 association.go:395: [0xc00008c1a0] readLoop entered
sctp DEBUG: 08:36:56.509457 association.go:418: [0xc00008c1a0] writeLoop entered
sctp DEBUG: 08:36:59.512983 association.go:305: [0xc00008c1a0] sending INIT
sctp DEBUG: 08:37:05.513388 association.go:305: [0xc00008c1a0] sending INIT
sctp DEBUG: 08:37:17.513699 association.go:305: [0xc00008c1a0] sending INIT
SERVER:
sctp DEBUG: 08:36:47.848844 association.go:395: [0xc0000761a0] readLoop entered
sctp DEBUG: 08:36:56.514451 association.go:747: [0xc0000761a0] chunkInit received in state 'Closed'
sctp WARNING: 2019/09/24 08:36:56 [0xc0000761a0] failed to write packets on netConn: write udp [::]:10001: wsasend: A request
to send or receive data was disallowed because the socket is not connected and (when sending on a datagram socket using a se
ndto call) no address was supplied.
sctp DEBUG: 08:36:56.515451 association.go:430: [0xc0000761a0] writeLoop ended
sctp DEBUG: 08:36:56.515451 association.go:446: [0xc0000761a0] writeLoop exited
sctp DEBUG: 08:36:59.512983 association.go:747: [0xc0000761a0] chunkInit received in state 'Closed'
sctp DEBUG: 08:37:05.513388 association.go:747: [0xc0000761a0] chunkInit received in state 'Closed'
sctp DEBUG: 08:37:17.513699 association.go:747: [0xc0000761a0] chunkInit received in state 'Closed'
This is more likely a misunderstanding on my part than a bug in the library, but I hope you can help in either event!
On CI environment,
=== RUN TestStats
--- PASS: TestStats (0.00s)
==================
WARNING: DATA RACE
Write at 0x00c0000b2098 by goroutine 83:
github.com/pion/sctp.(*fakeEchoConn).Read()
/home/travis/gopath/src/github.com/pion/sctp/association_test.go:2216 +0x180
github.com/pion/sctp.(*Association).readLoop()
/home/travis/gopath/src/github.com/pion/sctp/association.go:421 +0x271
Previous read at 0x00c0000b2098 by goroutine 27:
github.com/pion/sctp.TestStats()
/home/travis/gopath/src/github.com/pion/sctp/association_test.go:2314 +0x38c
testing.tRunner()
/home/travis/.gimme/versions/go1.13.6.linux.amd64/src/testing/testing.go:909 +0x199
Goroutine 83 (running) created at:
github.com/pion/sctp.(*Association).init()
/home/travis/gopath/src/github.com/pion/sctp/association.go:298 +0xdb
github.com/pion/sctp.Client()
/home/travis/gopath/src/github.com/pion/sctp/association.go:218 +0x9a
github.com/pion/sctp.TestStats()
/home/travis/gopath/src/github.com/pion/sctp/association_test.go:2307 +0x255
testing.tRunner()
/home/travis/.gimme/versions/go1.13.6.linux.amd64/src/testing/testing.go:909 +0x199
Goroutine 27 (running) created at:
testing.(*T).Run()
/home/travis/.gimme/versions/go1.13.6.linux.amd64/src/testing/testing.go:960 +0x651
testing.runTests.func1()
/home/travis/.gimme/versions/go1.13.6.linux.amd64/src/testing/testing.go:1202 +0xa6
testing.tRunner()
/home/travis/.gimme/versions/go1.13.6.linux.amd64/src/testing/testing.go:909 +0x199
testing.runTests()
/home/travis/.gimme/versions/go1.13.6.linux.amd64/src/testing/testing.go:1200 +0x521
testing.(*M).Run()
/home/travis/.gimme/versions/go1.13.6.linux.amd64/src/testing/testing.go:1117 +0x2ff
main.main()
_testmain.go:294 +0x347
==================
In order to pass the libp2p test cases used in backkem/go-libp2p-webrtc-direct, we should be able to pass the TestAssocStressDuplex test for about 15,000,000 messages. Right now, I'm able to make it work with around 1500 messages; more will break the current implementation and cause the test to hang forever.
As discussed before this is likely caused by a combination of SACK storm (reader far behind), packet loss and no re-transmission timer to fall back on.
This can probably be fixed by a combination of T3 timer (#11) and a simple congestion window (#14).
https://github.com/pion/dtls/blob/master/pkg/protocol/recordlayer/fuzz_test.go is a good place to start
In my test tool using SCTP, an attempt to close a stream did not unblock Stream.ReadSCTP() when the association with the remote was already gone.
ReadSCTP() should be unblocked.
There is an error with this repository's Renovate configuration that needs to be fixed. As a precaution, Renovate will stop PRs until it is resolved.
Error type: undefined. Note: this is a nested preset so please contact the preset author if you are unable to fix it yourself.
The client sends a cookie, but it is ignored by the server because:
if state != cookieWait {
// RFC 4960
// 5.2.3. Unexpected INIT ACK
// If an INIT ACK is received by an endpoint in any state other than the
// COOKIE-WAIT state, the endpoint should discard the INIT ACK chunk.
// An unexpected INIT ACK usually indicates the processing of an old or
// duplicated INIT chunk.
return nil
}
Calculate the maximum MTU size that can travel the entire path machine1<->machine2.
The easiest way is probably to craft heartbeat messages starting at the maximum value until they are successfully returned. Potentially listen for ICMP PacketTooBig.
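One way to sketch the probe, assuming a caller-supplied send-and-wait function (probePathMTU is a hypothetical helper; the issue suggests stepping down from the maximum, but a binary search over the candidate sizes converges faster):

```go
package main

import "fmt"

// probePathMTU finds the largest datagram size that survives the path
// by binary-searching between a floor and a ceiling. The probe callback
// would, in a real implementation, send a padded HEARTBEAT of the given
// size and report whether the HEARTBEAT ACK came back.
func probePathMTU(lo, hi int, probe func(size int) bool) int {
	best := lo
	for lo <= hi {
		mid := (lo + hi) / 2
		if probe(mid) {
			best = mid // mid fits; try larger
			lo = mid + 1
		} else {
			hi = mid - 1 // mid was dropped; try smaller
		}
	}
	return best
}

func main() {
	// Simulated network: anything above 1472 bytes is silently dropped.
	pathMTU := 1472
	got := probePathMTU(1200, 9000, func(size int) bool {
		return size <= pathMTU
	})
	fmt.Println(got) // 1472
}
```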
There are more details in the ticket, but we found our application going into a CPU-intensive infinite loop. Profiling pointed us towards the markAllToRetransmit function in payload_queue.go: https://github.com/pion/sctp/blob/master/payload_queue.go#L163
I'm not sure whether the infinite loop behaviour we saw was due to this race condition but it seems plausible.
For the DCEP layer, we will need to add a method SetReliabilityParams() to the SCTP layer.
Until the DCEP layer's handshake has completed (DATA_CHANNEL_ACK received), all other messages containing user data and belonging to this data channel MUST be sent ordered, no matter whether the data channel is ordered or not, according to the DCEP spec.
We could also add a new argument to the sendPayloadData() method, but I think SetReliabilityParams() would be cleaner.
Related discussion: pion/datachannel#9
I'd consider this as a post-v2 work item (no API change required)
The support of congestion control addressed in #11 has drastically improved the performance, but it is not optimal yet. To improve it further, we will need to implement delayed ack to reduce the number of packets per sec.
Say you are handling 20 MB/s of traffic: as the current MTU size (for SCTP) is set to 1200 bytes, the SCTP sender is handling 17500 outgoing DATA chunk packets and the same number of incoming SACK packets per second. That's a lot of CPU usage. We can drastically reduce the number of SACK chunks by implementing delayed ack (the handleSelectiveAck routine is the most complex/expensive routine in sctp, and we could reduce 17500 SACK chunks to just 5 in the 20 Mbps / 200 msec delayed-ack case!). We should also piggyback the ack on outgoing DATA chunks, as recommended by RFC 4960.
No alternative I can think of right now.
SCTP should be in its own repo, and we should get some good testing like dtls
Some of the tests are here https://github.com/pions/webrtc/pull/294/files#diff-2d3fd3b7fe32b830d790a48d57c911f4R23
Sent 100 MB of data over a data channel, using MaxRetransmits set to 0 or 1, then the transmission stops due to an issue reported as pion/ice#12.
All 100 MB of data transfer should complete successfully.
The issue pion/ice#12 is a separate problem, but obviously, SCTP is pushing more than 1MB of data even though the receiver window is set to 128KB.
"1MB" is the maximum number of bytes ICE would buffer at most.
I confirmed that when this happens, the receiver is still reporting a_rwnd (advertised receiver window size) is 128KB (=fully available).
I noticed that the current receiver window size is calculated only from the payloadQueue size. This is because received DATA chunks are almost immediately handed off to the stream layer, and DATA chunks in the payloadQueue are removed as soon as a.peerLastTSN advances, while those DATA chunks are still in the reassemblyQueue waiting to be read by the application.
The rwnd advertised to the sender should include the number of bytes stored in the reassemblyQueue in its calculation, so that the sender stops sending once it reaches the amount of data the application has not yet read.
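The proposed calculation can be sketched as follows (effectiveRwnd and its parameters are illustrative names): unread bytes in the reassembly queue count against the advertised window, not just what sits in the payload queue.

```go
package main

import "fmt"

// effectiveRwnd subtracts everything still buffered for the
// application (reassembly queue) from the advertised window, instead
// of counting only the payload queue.
func effectiveRwnd(maxReceiveBufferSize, payloadQueueBytes, reassemblyQueueBytes uint32) uint32 {
	used := payloadQueueBytes + reassemblyQueueBytes
	if used >= maxReceiveBufferSize {
		return 0 // window closed: the application has not read enough
	}
	return maxReceiveBufferSize - used
}

func main() {
	// Chunks were handed off to the reassembly queue but not yet read;
	// counting only the payloadQueue would wrongly advertise a full window.
	fmt.Println(effectiveRwnd(128*1024, 0, 0))         // 131072
	fmt.Println(effectiveRwnd(128*1024, 1200, 120000)) // 9872
}
```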
comcast --device=lo --latency=150 --packet-loss=10% --target-proto=udp
I get an ABORT after ~5 seconds.
Reass 30000007,CI:ffffffff,TSN=1e5e52d3,SID=0002,FSN=1e5e52d3,SSN:0000
This abort is generated here in usrsctp. I am not sure why yet, though; I need to read more about SCTP to fully understand.
No one else worry about fixing this! I will open a PR for it
WARNING: DATA RACE
Read at 0x00c00045a301 by goroutine 64:
github.com/pions/sctp.(*Association).getPayloadDataToSend()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/sctp/association.go:960 +0x553
github.com/pions/sctp.(*Association).handleSack()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/sctp/association.go:683 +0x886
github.com/pions/sctp.(*Association).handleChunk()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/sctp/association.go:1150 +0xd59
github.com/pions/sctp.(*Association).handleInbound()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/sctp/association.go:347 +0x245
github.com/pions/sctp.(*Association).readLoop()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/sctp/association.go:317 +0x117
Previous write at 0x00c00045a301 by goroutine 52:
github.com/pions/sctp.(*Stream).SetReliabilityParams()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/sctp/stream.go:73 +0x9a
github.com/pions/datachannel.newDataChannel()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/datachannel/datachannel.go:30 +0x248
github.com/pions/datachannel.Client()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/datachannel/datachannel.go:84 +0x454
github.com/pions/datachannel.Dial()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/datachannel/datachannel.go:55 +0x79
github.com/pions/webrtc.(*DataChannel).open()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/webrtc/datachannel.go:138 +0x2b6
github.com/pions/webrtc.(*PeerConnection).openDataChannels()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/webrtc/peerconnection.go:926 +0x161
github.com/pions/webrtc.(*PeerConnection).SetRemoteDescription.func2()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/webrtc/peerconnection.go:917 +0x88e
Goroutine 64 (running) created at:
github.com/pions/sctp.Client()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/sctp/association.go:160 +0x7f
github.com/pions/webrtc.(*SCTPTransport).Start()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/webrtc/sctptransport.go:90 +0x123
github.com/pions/webrtc.(*PeerConnection).SetRemoteDescription.func2()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/webrtc/peerconnection.go:907 +0x735
Goroutine 52 (running) created at:
github.com/pions/webrtc.(*PeerConnection).SetRemoteDescription()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/webrtc/peerconnection.go:850 +0x16a3
github.com/pions/webrtc.signalPair()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/webrtc/peerconnection_test.go:72 +0x43f
github.com/pions/webrtc.closeReliabilityParamTest()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/webrtc/datachannel_test.go:50 +0x46
github.com/pions/webrtc.TestDataChannelParamters_Go.func1()
/home/sean/Documents/Programming/Go/Code/src/github.com/pions/webrtc/datachannel_go_test.go:175 +0x36c
testing.tRunner()
/usr/lib/go-1.12/src/testing/testing.go:865 +0x163