Comments (10)
Bandaid patch v3 has been merged and will be part of the upcoming 2.6_rc1 version. With that, @schwabe and I can no longer break TCP servers this way.
from openvpn.
Possibly related: a few minutes after restart we have an openvpn process pegged at 100% CPU. strace shows it looping on:
recvfrom(-1, 0x55b337b832b8, 2330, 0, 0x55b337b66810, [28]) = -1 EBADF (Bad file descriptor)
recvmsg(-1, {msg_namelen=128}, MSG_ERRQUEUE) = -1 EBADF (Bad file descriptor)
write(1, "9b24ef5845c205e04e757b933ffa70e6"..., 109) = 109
recvfrom(-1, 0x55b337b832b8, 2330, 0, 0x55b337b66810, [28]) = -1 EBADF (Bad file descriptor)
recvmsg(-1, {msg_namelen=128}, MSG_ERRQUEUE) = -1 EBADF (Bad file descriptor)
write(1, "9b24ef5845c205e04e757b933ffa70e6"..., 109) = 109
recvfrom(-1, 0x55b337b832b8, 2330, 0, 0x55b337b66810, [28]) = -1 EBADF (Bad file descriptor)
recvmsg(-1, {msg_namelen=128}, MSG_ERRQUEUE) = -1 EBADF (Bad file descriptor)
write(1, "9b24ef5845c205e04e757b933ffa70e6"..., 109) = 109
recvfrom(-1, 0x55b337b832b8, 2330, 0, 0x55b337b66810, [28]) = -1 EBADF (Bad file descriptor)
recvmsg(-1, {msg_namelen=128}, MSG_ERRQUEUE) = -1 EBADF (Bad file descriptor)
write(1, "9b24ef5845c205e04e757b933ffa70e6"..., 109) = 109
Unfortunately no. For the first error, this is all I got in the log:
Dec 7 12:21:12 eduvpn-n09 openvpn[147602]: d400821cdfd1c0294d1ec1b8bd15b768/2001:9e8:xxx MULTI: Learn: 2001:4ca0:2fff:2:3:0:7:1005 -> d400821cdfd1c0294d1ec1b8bd15b768/2001:9e8:xxx
Dec 7 12:21:12 eduvpn-n09 openvpn[147602]: d400821cdfd1c0294d1ec1b8bd15b768/2001:9e8:xxx MULTI: primary virtual IPv6 for d400821cdfd1c0294d1ec1b8bd15b768/2001:9e8:xxx: 2001:4ca0:2fff:2:3:0:7:1005
Dec 7 12:21:12 eduvpn-n09 openvpn[147602]: d400821cdfd1c0294d1ec1b8bd15b768/2001:9e8:xxx read TCPv6_SERVER []: Bad file descriptor (fd=-1,code=9)
Dec 7 12:21:12 eduvpn-n09 openvpn[147602]: d400821cdfd1c0294d1ec1b8bd15b768/2001:9e8:xxx read TCPv6_SERVER []: Bad file descriptor (fd=-1,code=9)
In the second case logging was disabled, so I attached to the running openvpn process using strace.
So, looking a bit more closely into our code (socket.c and mtu.c) - the combination of recvfrom() and recvmsg() is only ever done for UDP, so it seems the second log might be a different issue. We currently have no idea why the file descriptor might change to "-1" without (especially in the UDP case) the process just ending - there are no races with "pass socket to kernel, userland must no longer use it" anymore...
0001-WIP-ASSERT-if-sock-fd-passed-to-recv-is-not-0-GH-iss.txt
I have attached a patch that adds ASSERT(sock && sock->fd > 0) to every place where we recv() or recvfrom() etc. from a file descriptor - one for TCP, two for UDP (ignoring the one in mtu.c). This is not a bugfix, but if you could run a server instance with verb 6 and DCO enabled, and this problem happens again, the server will stop and the log file should hopefully give us some more hints how we managed to break things...
JFTR, I managed to reproduce the crash for a TCP-based server, with the ASSERT() patch above:
Dec 18 12:30:26 ubuntu2004 tun-tcp-p2mp-username-cn[1515354]: gremlin52251/2001:608:0:814::f000:21 dco_install_key: peer_id=106 keyid=0, currently 0 keys installed
Dec 18 12:30:26 ubuntu2004 tun-tcp-p2mp-username-cn[1515354]: gremlin52251/2001:608:0:814::f000:21 dco_new_key: slot 0, key-id 0, peer-id 106, cipher AES-256-GCM
Dec 18 12:30:26 ubuntu2004 tun-tcp-p2mp-username-cn[1515354]: gremlin52251/2001:608:0:814::f000:21 SENT CONTROL [gremlin52251]: 'PUSH_REPLY,route 10.220.0.0 255.255.255.0,route 10.220.128.0 255.255.128.0,route-ipv6 fd00:abcd:220::/48,tun-ipv6,route-gateway 10.220.112.1,topology subnet,ping 10,ping-restart 30,ifconfig-ipv6 fd00:abcd:220:112::11d0/64 fd00:abcd:220:112::1,ifconfig 10.220.113.210 255.255.252.0,peer-id 106,cipher AES-256-GCM,protocol-flags cc-exit tls-ekm' (status=1)
Dec 18 12:30:26 ubuntu2004 tun-tcp-p2mp-username-cn[1515354]: gremlin52251/2001:608:0:814::f000:21 Assertion failed at socket.c:3361 (sock && sock->sd >= 0)
Dec 18 12:30:26 ubuntu2004 tun-tcp-p2mp-username-cn[1515354]: gremlin52251/2001:608:0:814::f000:21 Exiting due to fatal error
(won't help someone who needs a working server :-) - but I hope this gives us some logging to understand what weird flow of events made us arrive there - there's nothing in the code that would ever set sock->sd to -1...)
I have not been able to make a UDP server crash, but brute-forcing ("5,000 client connects in 90 minutes") uncovered some other interesting misbehaviours...
https://patchwork.openvpn.net/project/openvpn2/patch/[email protected]/
I do have a patch that bandaid-fixes the issue for me - that is, I can reproduce the TCP server crash, and with the fix, it will just kill the "broken" client instance.
Dec 22 10:49:25 ubuntu2004 tun-tcp-p2mp-username-cn[1659541]: gremlin50083/2001:608:0:814::f000:21 dco_install_key: peer_id=258 keyid=0, currently 0 keys installed
Dec 22 10:49:25 ubuntu2004 tun-tcp-p2mp-username-cn[1659541]: gremlin50083/2001:608:0:814::f000:21 dco_new_key: slot 0, key-id 0, peer-id 258, cipher AES-256-GCM
Dec 22 10:49:25 ubuntu2004 tun-tcp-p2mp-username-cn[1659541]: gremlin50083/2001:608:0:814::f000:21 SENT CONTROL [gremlin50083]: 'PUSH_REPLY,route 10.220.0.0 255.255.255.0,route 10.220.128.0 255.255.128.0,route-ipv6 fd00:abcd:220::/48,tun-ipv6,route-gateway 10.220.112.1,topology subnet,ping 10,ping-restart 30,ifconfig-ipv6 fd00:abcd:220:112::110c/64 fd00:abcd:220:112::1,ifconfig 10.220.113.14 255.255.252.0,peer-id 258,cipher AES-256-GCM,protocol-flags cc-exit' (status=1)
Dec 22 10:49:25 ubuntu2004 tun-tcp-p2mp-username-cn[1659541]: gremlin50083/2001:608:0:814::f000:21 BUG: link_socket_read_tcp(): sock->sd==-1, reset client instance
Dec 22 10:49:25 ubuntu2004 tun-tcp-p2mp-username-cn[1659541]: gremlin50083/2001:608:0:814::f000:21 Connection reset, restarting [0]
Dec 22 10:49:25 ubuntu2004 tun-tcp-p2mp-username-cn[1659541]: gremlin50083/2001:608:0:814::f000:21 SIGUSR1[soft,connection-reset] received, client-instance restarting
Dec 22 10:49:25 ubuntu2004 tun-tcp-p2mp-username-cn[1659541]: dco_del_peer: peer-id 258
The current theory about the underlying issue is "an incoming TCP session close at just the wrong moment", so that client's session would be broken anyway. I consider this a bandaid because it would be preferable never to be in this situation in the first place, but fixing the underlying issue might take longer. So, for 2.6.0 with DCO, this should get the job done.
@bernhardschmidt please see if you can still break it :-)
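The shape of such a bandaid, as opposed to the ASSERT variant, might look like the sketch below. It is an assumption-laden illustration, not the merged patch: `struct demo_socket`, `demo_read_tcp` and the `DEMO_READ_*` codes are hypothetical names, and the real code signals the reset through OpenVPN's own mechanisms rather than a return code.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the real socket structure. */
struct demo_socket {
    int sd; /* socket descriptor, -1 when invalid */
};

enum demo_read_status {
    DEMO_READ_OK = 0,    /* recv() was attempted; its result is in *nread */
    DEMO_READ_RESET = -1 /* descriptor was invalid; caller should reset this client */
};

/* Instead of aborting the whole server, report the broken descriptor and
 * let the caller tear down only the affected client instance. */
static int demo_read_tcp(const struct demo_socket *sock, void *buf,
                         size_t len, ssize_t *nread)
{
    assert(sock != NULL && buf != NULL && nread != NULL);
    if (sock->sd == -1) {
        fprintf(stderr, "BUG: demo_read_tcp(): sock->sd==-1, reset client instance\n");
        *nread = -1;
        return DEMO_READ_RESET;
    }
    /* recv() result, including any error value, is passed through unchanged. */
    *nread = recv(sock->sd, buf, len, MSG_DONTWAIT);
    return DEMO_READ_OK;
}
```

The design choice here is to contain the damage: one client is dropped and can reconnect, while all other sessions on the server keep running.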
We currently do not run DCO due to other bugs, but I remember this being fixed with rc1 (hit us before within minutes).
Closing as suggested by cron2, thanks for the fix.