Giter VIP home page Giter VIP logo

Comments (49)

schwabe avatar schwabe commented on May 27, 2024

I cannot reproduce this. Server side:

Server side:

/src//openvpn/openvpn --server 10.33.0.0 255.255.255.0 --server-ipv6 fd00:f00f::1/64  --topology subnet --cert ~/bclient.pem --key ~/bclient.pem --dev tun --verb 4  --data-ciphers chacha20-poly1305:AES-256-GCM:AES-128-GCM:AES-192-GCM --tun-mtu 1400 --dh none --keepalive 10 25 --config ~/fp --push "explicit-exit-notify 3"  --verb 4  --explicit-exit-notify 1
[...]
2022-12-08 12:07:23 us=551539 DCO device tun0 opened
[...]
^C2022-12-08 12:07:29 us=150785 event_wait : Interrupted system call (fd=-1,code=4)
2022-12-08 12:07:29 us=150854 SENT CONTROL [test-client]: 'RESTART' (status=1)
^C2022-12-08 12:07:30 us=795453 event_wait : Interrupted system call (fd=-1,code=4)

Client side:

2022-12-08 13:07:25 Initialization Sequence Completed
2022-12-08 13:07:29 Connection reset command was pushed by server ('')
2022-12-08 13:07:29 Closing TUN/TAP interface

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

Not only this, also this(reneg 15 secs for debug):

TLS: soft reset sec=16/15 bytes=0/-1 pkts=0/0
TLS key negotiation failed to occur within 5 seconds (check your network connectivity)
TLS Error: TLS handshake failed
TLS: move_session: dest=TM_LAME_DUCK src=TM_ACTIVE reinit_src=1
TLS Error: TLS key negotiation failed to occur within 5 seconds (check your network connectivity)
TLS Error: TLS handshake failed
TLS Error: TLS key negotiation failed to occur within 5 seconds (check your network connectivity)
TLS Error: TLS handshake failed
...

Something is wrong with the control channel.

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

I'm using dh and tls-crypt, try add those two? data-ciphers is AES-128-GCM

from openvpn.

schwabe avatar schwabe commented on May 27, 2024

reneg-sec 15 is extremely low value and you seem have even a 5s timeout there, so I am not surprised that your control breaks with those very aggresive timings. So please try to reproduce the issue with sensible timings. Convoluting this issue with agressive timing that may cause other problems is not helpful here.

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

My network is super fast/low latency w/ effectively 0 packet loss. And yes, this happened on default 3600. I even never used reneg-*/hand-window configs before.

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

add dh and tls-crypt to your config to see if you can reproduce, add cannotreproduce is too early

PS: my network ping -i 0 -c 1000

--- 10.0.0.1 ping statistics ---
1000 packets transmitted, 1000 received, 0% packet loss, time 16792ms
rtt min/avg/max/mdev = 67.754/69.313/71.813/0.329 ms, pipe 5, ipg/ewma 16.808/69.555 ms

So if TLS didn't complete even in one sec I can safely assume it's failed(a test by SIGUSR1):

45:58 us=957848 Restart pause, 1 second(s)
45:59 us=958289 Re-using SSL/TLS context
46:00 us=267117 Initialization Sequence Completed

I'll moving on without reneg it doesn't seem to matter with AES-128.

from openvpn.

schwabe avatar schwabe commented on May 27, 2024

@Originalimoc I cannot reproduce this. You need to provide proper instructions to reproduce this as I cannot reproduce this. dh and tls-crypt does not make a difference here. At least I posted my whole configuration instead of you who only posted some fragements.

Please try to also reproduce that with a minimal configuration without all kind of extra/unusual options to figure out if that is related to one of the options that you seem to be using. Try removing tls-crypt and see if you can reproduce it and so on.

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

full both side config: #191 (comment)

from openvpn.

cron2 avatar cron2 commented on May 27, 2024

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

It does happen on default. I already stated reneg 15 secs for debug from the very beginning

from openvpn.

cron2 avatar cron2 commented on May 27, 2024

from openvpn.

cron2 avatar cron2 commented on May 27, 2024

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

To me both look like something is wrong with the control channel, so I post them both here. And full config is given: #191 (comment)

from openvpn.

schwabe avatar schwabe commented on May 27, 2024

@Originalimoc I did not happen for me with default setting as I already posted in my comment. So currently this not reproducible. YOu are adding all kind of questionable non default settings like hand-window 5, sndbuf , replay-window 5000 3 to your configs. I have to assume that the rest of your setup is just as sketchy as your configuration.

To repeat myself, unless you give a proper way to reproduce this issue, I consider this issue as non-reproducible. You have not even provided full logs from server and client yet.

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

After disable server side dco, both problem(EEN/RENEG) goes away. And hand-window 5/reneg 15s work perfectly even it's "aggressive".

Nah...

from openvpn.

cron2 avatar cron2 commented on May 27, 2024

from openvpn.

cron2 avatar cron2 commented on May 27, 2024

from openvpn.

schwabe avatar schwabe commented on May 27, 2024

@Originalimoc let me tell you as someone that actually understands openvpn that these timing setting can cause problems that you do not even understand. But if you think you are more knowledgable of OpenVPN than I am, then please fix the issue yourself and psot a patch instead of questioning our knowlege of OpenVPN.

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

In case you didn't read above. I already said:
1\ This happened on default 3600s reneg, reneg 15 secs just for debug
2\ I even never used reneg-*/hand-window configs before, like 5+ years.
3\ My network can highly reliably make a full TLS handshake in 0.3 sec as #189 (comment) shows.

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

It's not my job either. I can move on without reneg necessary. It's all under a watchdog anyway I can just auto SIGUSR1 restart it all within 0.5 sec. I report, hoping for an improve, you patch if can, great, not really enter arguing mode, having a bad day huh @swb?

from openvpn.

schwabe avatar schwabe commented on May 27, 2024

@Originalimoc if you are not going to help us to figure out why it is not working on your side, I am just going to close the ticket.

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

You're not even use the same config to say "it's not reproducible"

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

@cron2 hand-window 5, reneg-sec, tran-window IS NOT THERE WHEN PROBLEM FIRST APPEAR. Can you read FGS?

from openvpn.

cron2 avatar cron2 commented on May 27, 2024

@cron2 hand-window 5, reneg-sec, tran-window IS NOT THERE WHEN PROBLEM FIRST APPEAR. Can you read FGS?

Whatever "FGS" is supposed to stand for, but yes, I can read. And I read that you are bringing up new and unrelated stuff all the time, which eats my time.

from openvpn.

cron2 avatar cron2 commented on May 27, 2024

For the original problem, it is caused by your aggressive reconnect timers.

In DCO mode (EEN in control channel) the server waits for 3 seconds after sending the EEN so all clients have time to ACK the control-channel message. Your client connects so fast that it hits the server "going down", and then the server goes down "for real" and will not send another EEN.

If you look properly into your client log files, you see that it receives the RESTART, reconnects, and then the server disappears (without another RESTART). We're going to fix this server-side by not allowing reconnect in this 2-3 second time frame.

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

ForGodSake.
OK.
No.
You still did't read.
client log files, you see that it receives the RESTART
No.
but client receives nothing/log nothing, need a manual SIGUSR1 on client to reestablish connection.

original problem? Let's turn off reneg(0) for now. So you mean connect-retry?
From document EEN do send and run, don't need an ACK?
So messy right now.

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

We better start clean, a new issue post with all info packed into one.

from openvpn.

cron2 avatar cron2 commented on May 27, 2024

EEN with DCO needs to be sent over the control channel, and control-channel packets always need an ACK. This is the way the OpenVPN protocol works (we could make the wait time shorter if all ACKs are in, but this would be a much larger code change, which nobody had time and interest to implement).

Old-style EEN are sent as part of the data channel (OCC messages), which does not work with DCO.

from openvpn.

cron2 avatar cron2 commented on May 27, 2024

We better start clean, a new issue post with all info packed into one.

Do not randomly open new issues. One issue per problem, relevant content in there, non-relevant content left out. If you feel part of your contributions have not been relevant, delete them.

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

DONE, NOT A BUG?
Kernel module with data channel continues to work, but ov process itself entered a bad state that can't get RESTART already, yes, it can only get once per process lifecycle.

Can the wait be removed between SENT CONTROL [Client]: 'RESTART' (status=1) and SIGUSR1[hard,] received, process restarting, It takes 2 seconds, then another 3 sec restart pause. So at least 5s wait is needed on the client side. That's way too long, enough to take a round trip to the moon.

Better man page needed, warn regarding --connect-retry.

from openvpn.

schwabe avatar schwabe commented on May 27, 2024

@Originalimoc either change your attitude or go away. Telling us something like "for god's sake" and questioning our ability toi read is something I perceive as an insult. I do not like working with people that insult me.

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

I don't like talking to you from beginning either thanks. Getting angry afterwards. c2 is clearly more actually knowledgeable on the subject.

from openvpn.

cron2 avatar cron2 commented on May 27, 2024

This patch (from @schwabe, as the subject matter expert on this) will sidestep the problem somewhat - when "in the process of shutting down", the server will no longer accept new connections from clients. So the race condition between "the client reconnects very quickly, and the server is not yet restarted" will no longer lead to clients thinking they have a valid connection, which isn't true anymore (= client has to wait for ping timeout).

Yes, we should probably rework the server side code, to take less time to actually shut down - but the openvpn event loop is complex, and not very good in doing things "really quick now!". So this will not happen in the 2.6 beta cycle, but we might revisit this in the "refactor big parts in master, after a release has been done" phase

from openvpn.

schwabe avatar schwabe commented on May 27, 2024

And I am not sure that the extra logic/code for doing a shutdown really quick is worth adding. Aside from this ticket, I never heard that the shutdown time of a server that waits 5s is causing problems. So adding 30+ lines of codes to fix such an obscure problem is not worth the complexity it adds.

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

"Something stupid like --connect-retry 1 3" LOL.
No, the inability to maintain a good internal state and to instant reconnect in case of endpoint restart is stupid instead. And adding some arbitrary 60s/5s/3s/2s timeout without explain-able good reason and can break until a manual restart if peer "misbehave" is another extra stupid point. My upper tunnel can behave stateless and never fatally break if either side does anything configurable, at most wait a few sec it behave as new.

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

It can/need to be a resource concern but not an architectural one. Unless you're EXPECTING server/client not working anymore of a session after an unexpected incoming connection, this is a serious bug.

from openvpn.

cron2 avatar cron2 commented on May 27, 2024

"Something stupid like --connect-retry 1 3" LOL. No, the inability to maintain a good internal state and to instant reconnect in case of endpoint restart is stupid instead. And adding some arbitrary 60s/5s/3s/2s timeout without explain-able good reason and can break if peer "misbehave" is another extra stupid point.

This is not "arbitrary". It's called exponential backoff, and is standard practice for well-behaving software. If you have your first system go down because a software retried in milliseconds and logged every attempt, filling all disk space in the process, you'll understand.

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

Not the point.
The starting point is off. Why do you need 60s timeout, are you talking to Mars or something. And why server need a restart pause, its initial state connects to nothing.

from openvpn.

cron2 avatar cron2 commented on May 27, 2024

It can/need to be a resource concern but not an architectural one. Unless you're EXPECTING server/client not working anymore of a session after an unexpected incoming connection, this is a serious bug.

No, we did not expect this. Control-channel EEN is new, and you triggered an unexpected behaviour by using more aggressive timers than our test framework does. So @schwabe fixed it.

Was this an annoying side effect? I'm sure it was. Was it "a serious bug"? Certainly not. No side crashed, no security impact.

from openvpn.

cron2 avatar cron2 commented on May 27, 2024

Not the point. The starting point is off. Why do you need 60s timeout, are you talking to Mars or something. And why server need a restart pause, its initial state connects to nothing.

The starting point is 5s, first retry 5s, then 10, 20,... - and if a remote server is down hard, retrying every 60 seconds is a very reasonable compromise between "reconnecting quickly" and "needlessly burning resources"

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

The EEN part essentially crashed because on next session peer receive nothing.

from openvpn.

schwabe avatar schwabe commented on May 27, 2024

And besides the defaults are chosen to be a compromise that works for as many people as possible. Your setup might be able to tolerate a 3s retry from clients but that is not the case for everyone. I implmeented this exponentional backup since mobile phones that did loose server connection for an hour or two would quickly burn their battery with the default of 5s.

from openvpn.

schwabe avatar schwabe commented on May 27, 2024

The EEN part essentially crashed because on next session peer receive nothing.

I don't think you understand what crashed means. "crashed" means the program is not able to recover. And your "big problem" is only caused because you tweak all kind of obscure settings in even more obscure ways. So you walking besides the established and well-tested paths. So that your setup has more problem than the ones that run with well-tested defaults is not surprising at all.

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

Hand window is 60.
Mobile side is better with persist key and no ping and float.

from openvpn.

Originalimoc avatar Originalimoc commented on May 27, 2024

Nah the state maintenance still sucks. I now configured to bypassed these. Not mean there's no bug though, way too much legacy debt.

from openvpn.

cron2 avatar cron2 commented on May 27, 2024

Hand window is 60. Mobile side is better with persist key and no ping and float.

And what, exactly, does this have to do with the issue here?

Nah the state maintenance still sucks. I now configured to bypassed these. Not mean there's no bug though.

I will now block you. The way you refuse to listen, and keep being offensive ("sucks") is really a waste of our time.

from openvpn.

schwabe avatar schwabe commented on May 27, 2024

Hand window is 60. Mobile side is better with persist key and no ping and float.

You are just trolling at this point. I don't think there is any use in continuing the coversation with you.

from openvpn.

OrimoB avatar OrimoB commented on May 27, 2024

Huh blocking, that's a good move. I'm here spending time to praise you or what? Coding from ground up is better than deal with all these legacy bad decision without proper state management mess and that's is exactly what I'll do next. You are the one that refused to listen and insisting "this is good and well tested", live in your old dream. What sucks, remains sucks.

from openvpn.

cron2 avatar cron2 commented on May 27, 2024

Huh blocking, that's a good move. I'm here spending time to praise you or what? Coding from ground up is better than deal with all these legacy bad decision without proper state management mess and that's is exactly what I'll do next. You are the one that refused to listen and insisting "this is good and well tested", live in your old dream. What sucks, remains sucks.

If you pay for software and support, feel free to be as unfriendly as you want. If you get some piece of software for free, and get support for free as well, we expect you to spend some of your time on "be friendly and provide the information that is asked from you" in return.

You prefer to insult us, which is okay, but we are free to just not listen to you. So, yes, block.

from openvpn.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.