Connection delay after openvpn server stop and start #689

Open
ayappanec opened this issue Feb 13, 2025 · 10 comments

Comments

@ayappanec

We are seeing a connection delay of around 15 minutes between the server and the client when we stop the server and start it again. The server and client are both AIX machines running OpenVPN 2.6.10. The client gets its IP address immediately after the server starts, but pings only start working after around 15 minutes.
Any input would be really helpful.

server.conf
########
port 443
proto tcp
dev tap

ca /openvpn-keys/ca.crt
cert /openvpn-keys/server.crt
dh /openvpn-keys/dh2048.pem

server 10.8.0.0 255.255.255.0
ifconfig-pool-persist ipp.txt
client-to-client
keepalive 10 120
cipher AES-256-CBC
status openvpn-status.log
verb 3
explicit-exit-notify 1

########
client.conf
########
client
dev tap
proto tcp

remote X.X.X.X 443
resolv-retry infinite

nobind

ca /openvpn-keys/ca.crt
cert /openvpn-keys/client1.crt
key /openvpn-keys/client1.key
remote-cert-tls server
tls-auth /openvpn-keys/ta.key 1
cipher AES-256-CBC
verb 3

@cron2
Contributor

cron2 commented Feb 13, 2025

Log files or it didn't happen...

@ayappanec
Author

I raised the verbosity to level 9 and see the output below (I replaced the hostname and IP address with XXXXX in the log).

2025-02-13 04:19:38 us=875832 XXXXX/XXXXX:63778 GET INST BY VIRT: 00:bd:28:9c:bb:08@0 [failed]
2025-02-13 04:19:38 us=875839 STREAM: SET NEXT, buf=[136,0] next=[136,1800] len=-1 maxlen=1800
2025-02-13 04:19:38 us=875844 MULTI TCP: multi_tcp_post TA_SOCKET_READ -> TA_TUN_WRITE
2025-02-13 04:19:38 us=875850 MULTI TCP: multi_tcp_action a=TA_TUN_WRITE p=1
2025-02-13 04:19:38 us=875855 MULTI TCP: multi_tcp_wait_lite a=TA_TUN_WRITE mi=0x110077f50
2025-02-13 04:19:38 us=875861 PO_CTL rwflags=0x0000 ev=9 arg=0x110003870
2025-02-13 04:19:38 us=875866 PO_CTL rwflags=0x0002 ev=7 arg=0x1100013a4
2025-02-13 04:19:38 us=875873 I/O WAIT Tr|TW|Sr|Sw [1/0]
2025-02-13 04:19:38 us=875884 PO_WAIT[1,0] fd=7 rev=0x00000002 rwflags=0x0002 arg=0x1100013a4
2025-02-13 04:19:38 us=875890 event_wait returned 1
2025-02-13 04:19:38 us=875895 I/O WAIT status=0x0008
2025-02-13 04:19:38 us=875900 MULTI TCP: multi_tcp_dispatch a=TA_TUN_WRITE mi=0x110077f50
2025-02-13 04:19:38 us=875907 XXXXX/XXXXX:63778 TUN WRITE [98]
2025-02-13 04:19:38 us=875915 XXXXX/XXXXX:63778 write to TUN/TAP returned 98
2025-02-13 04:19:38 us=875921 STREAM: SET NEXT, buf=[136,0] next=[136,1800] len=-1 maxlen=1800
2025-02-13 04:19:38 us=875927 MULTI TCP: multi_tcp_post TA_TUN_WRITE -> TA_UNDEF

@cron2
Contributor

cron2 commented Feb 13, 2025

verb 9 was not asked for and is not helpful either - it includes way too much information which is not relevant when diagnosing problems that are not related to internal key handling etc.

A verb 3 logfile from client and server, showing the initial connection and the reconnect after server restart would be more useful.

That said, I guess it's likely that you are hitting an ARP issue. AIX only supports TAP mode, TAP presents itself as a virtual ethernet, and ethernet has ARP tables for mapping IP to Ethernet addresses, which are cached(!)... When the client reconnects, each side might assign itself a new (random) MAC address, and the stale entry for the old MAC then needs to age out on the other side.

So, if you can reproduce the issue, please check whether running arp -d $ip on both sides (clearing the ARP entry for the IP address of the other end) helps. arp -an will show you the tables, and netstat -in should show the MAC address currently assigned to the TAP interface.
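To make the comparison between the cached MAC and the peer's current MAC easier, here is a minimal sketch of a helper that picks one IP's entry out of `arp -an` output. The function name `mac_for_ip` is hypothetical, and the sketch assumes the BSD-style line layout `? (IP) at MAC ...`; check it against the actual output on your system.

```shell
#!/bin/sh
# mac_for_ip IP: read `arp -an` style text on stdin and print the MAC
# address cached for IP. Prints nothing if there is no entry.
# Assumption: entries look like "? (10.8.0.1) at 0:bd:28:9c:bb:8 ...".
mac_for_ip() {
    awk -v ip="($1)" '{
        # find the "(IP)" field followed by the literal word "at"
        for (i = 1; i < NF; i++)
            if ($i == ip && $(i + 1) == "at") { print $(i + 2); exit }
    }'
}
```

Usage would be `arp -an | mac_for_ip 10.8.0.1` on one side, compared by eye with the address that `netstat -in` reports for the TAP interface on the other side.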

On non-AIX systems, one could fix the TAP interface MAC using lladdr m:a:c:a:dd:rr in the OpenVPN config, but this has not (yet) been implemented for AIX. So you could script the arp -d call using an --up script in your config... ugly? yes...
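A minimal sketch of such an --up script for the client side follows. It is an illustration, not a tested AIX solution: the function name `flush_peer_arp` and the `ARP_CMD` override are hypothetical, and it assumes that OpenVPN exports `route_vpn_gateway` (here the server's VPN address, pushed as route-gateway) before the script runs. The config would also need `script-security 2` and an `up` line pointing at wherever the script is stored.

```shell
#!/bin/sh
# Hypothetical --up helper: drop cached ARP entries for VPN peers so that
# a changed TAP MAC address is re-learned immediately.

# flush_peer_arp IP...: delete the ARP entry for each address given.
# ARP_CMD is an illustrative override for dry runs; default is arp.
flush_peer_arp() {
    for ip in "$@"; do
        # arp -d fails when no entry exists; that is harmless here
        ${ARP_CMD:-arp} -d "$ip" || true
    done
}

# Only act when OpenVPN provided a gateway address (assumption: this is
# the server's address inside the VPN).
if [ -n "${route_vpn_gateway:-}" ]; then
    flush_peer_arp "$route_vpn_gateway"
fi
```

On the server the --up script runs once at startup rather than per client, so clearing entries for individual clients there would need a different hook or a fixed list of addresses.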

There is a "tun mode" emulation for AIX in the works, but that has been stalled because of lack of time (and nobody seems to be using OpenVPN on AIX)

@ayappanec
Author

@cron2 Thanks for the detailed explanation. I was not getting any logs on the server when pinging from the client side (while the connection issue was present), hence I increased the verbosity to 9.
You are right. It's indeed an ARP cache problem. Running arp -d $ip fixed the issue.

You mentioned that the "tun mode" emulation for AIX is stalled. We would like to understand more about that and how we can help here (for example, how much more work is required and what technical skills are needed).
We are from the AIX Toolbox team. We provide a lot of open source software to AIX users as rpm packages through this platform (openvpn is one of them) --> https://www.ibm.com/support/pages/aix-toolbox-open-source-software-downloads-alpha

@cron2
Contributor

cron2 commented Feb 13, 2025

Hi. Glad to hear that it was indeed an ARP problem and there is a manual solution, at least.

Since you mentioned you're from the Toolbox team - maybe you have better documentation than I have :-) - I'd like to implement lladdr support in OpenVPN, so we could set the TAP adapter's MAC address to a configured value, avoiding the ARP cache problem. On other OSes, there is usually "something ifconfig" to set the hardware / ether address, but my AIX's "man ifconfig" does not mention anything. Is there a way?

On the TUN emulation layer - what I have works for IPv4, but IPv6 support is missing. Also it needs rebasing to the latest master, last time I worked on it was in early 2023... - so it mostly needs time. But since you voiced interest, I'll see if I can find a bit of time and polish it for inclusion in 2.7... (not promising anything). WRT "what skills are needed" - well, understanding IPv6 neighbour discovery, and understanding OpenVPN C code...

@ayappanec
Author

ayappanec commented Feb 17, 2025

Thanks again @cron2 .

In AIX, the chdev command can be used to set or change the hardware address. Something like "chdev -l entX -a alt_addr=<new MAC>". https://www.ibm.com/docs/en/aix/7.2?topic=cards-adapter-management-configuration

@cron2
Contributor

cron2 commented Feb 18, 2025

Thanks for the link. I have tried it, but either I'm doing something wrong, or it is plainly not supported on tap interfaces...

m-gd@aix$ sudo /usr/sbin/chdev -l tap1 -a alt_addr=0X10005A4F1B7F
Method error (/usr/lib/methods/chgif):
        0514-018 The values specified for the following attributes 
                 are not valid:
0821-228 chgif: Bad attribute(s) or attribute value(s): alt_addr 

m-gd@aix$ sudo /usr/sbin/chdev -l tap1 -a use_alt_addr=yes       
Method error (/usr/lib/methods/chgif):
        0514-018 The values specified for the following attributes 
                 are not valid:
0821-228 chgif: Bad attribute(s) or attribute value(s): use_alt_addr 

this is on a 7.3 machine, and the 7.3 documentation still explains this (but only for ethernet and token ring) :-(

@ayappanec
Author

alt_addr is an attribute of the underlying adapter device, in this case tapent1, not of the tap1 interface.
lsdev lists all the devices, and lsattr -El <dev> lists the attributes of a device.
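For scripting around this, a minimal sketch of reading one attribute out of `lsattr -El <dev>` output could look as follows. The helper name `attr_value` is hypothetical, and the sketch assumes the usual column layout of attribute name, current value, description and user-settable flag.

```shell
#!/bin/sh
# attr_value NAME: read `lsattr -El <dev>` style text on stdin and print
# the current value (second column) of the attribute called NAME.
# Prints nothing if the device has no such attribute.
attr_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}
```

For example, `lsattr -El tapent1 | attr_value use_alt_addr` would print the current setting of that attribute for the adapter.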

@cron2
Contributor

cron2 commented Feb 18, 2025

Okay, so lsattr -El tapent1 indeed shows these attributes:

m-gd@aix$ lsattr -El tapent1
alt_addr     0x000000000000 Alternate EtherChannel Address        True
jumbo_frames no             Enable Gigabit Ethernet Jumbo Frames  True
use_alt_addr no             Enable Alternate EtherChannel Address True

it won't permit me to actually use them, though...

m-gd@aix$ sudo /usr/sbin/chdev -l tapent1 -a alt_addr=0X10005A4F1B7F
Method error (/usr/lib/methods/chgent):
        0514-062 Cannot perform the requested function because the
                 specified device is busy.

(This is with a tap1 interface which was created with ifconfig tap1 create and not configured further. Whether the interface is UP or brought down with ifconfig tap1 down does not make a difference, it's always busy...)

Do you have a working and complete example of creating and changing the MAC address for a given tap interface? I can then integrate that into openvpn lladdr.c - but at this point, I do not really see how this is supposed to work...

@ayappanec
Author

Sure. Let me check and come back with a working example.
