Posted: Fri Sep 09, 2022 1:51 Post subject: Possible causes for WG bandwidth only 65% of OpenVPN?
I replaced two WNDR3700s providing a site-to-site routed OpenVPN (OVPN) tunnel with two R7800s providing a site-to-site WireGuard (WG) tunnel, expecting increased bandwidth. Instead, the WG tunnel provides significantly less. Because published tests almost universally show WG delivering substantially more bandwidth than OpenVPN, I believe something is wrong and am looking for suggestions as to what it might be.
The sites are about 100 miles apart with different ISPs. For comparison testing, I added an OVPN tunnel to the R7800s, with site #1 running the server and site #2 running the client. Testing with iperf3 shows WG bandwidth from site #1 to site #2 is only about 65% of OVPN bandwidth and WG bandwidth from site #2 to site #1 is about 93% of OVPN bandwidth. (WG was tested with the openvpnserver service stopped and OVPN was tested with the WG tunnel stopped.)
Here is a summary of the test results (Mbps):
Code:
        Baseline (Speedtest.net)    OVPN                   WG
Site    up      down                to S#2    from S#2     to S#2    from S#2
#1      46.6    49.5                44.6      7.3          29.7      6.8
#2      8.0     150+
tracert shows the expected same four hops for OVPN and WG (source pc > switch gateway > S#1 router > tunnel S#2 [OVPN | WG ] endpoint > destination pc), but the RTT for the tunnel S#2 endpoint is consistently much higher for WG than OVPN (~32 ms vs. ~19 ms).
Wireshark does not show any dropped packets, retransmissions, or other communication issues during the iperf tests. Both tunnels add 54 bytes of encapsulation per packet: OVPN packets carry 1285 bytes of data in 1339 total bytes (~4.0% overhead) and WG packets carry 1400 bytes of data in 1454 total bytes (~3.7% overhead), so per-packet overhead is essentially the same for both.
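As a sanity check (my arithmetic, not from the captures themselves), the per-packet overhead fractions can be recomputed from the observed sizes:

```shell
# Encapsulation overhead = (wire bytes - payload bytes) / wire bytes.
# Packet sizes taken from the Wireshark captures described above.
ovpn=$(awk 'BEGIN{printf "%.2f", (1339-1285)/1339*100}')   # OVPN: 54 bytes of framing
wg=$(awk 'BEGIN{printf "%.2f", (1454-1400)/1454*100}')     # WG: also 54 bytes of framing
echo "OVPN overhead: ${ovpn}%"
echo "WG overhead:   ${wg}%"
```

Both tunnels add 54 bytes per packet here, so encapsulation overhead alone cannot account for a ~35% throughput gap.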
I do not know if WG's greater latency and overhead are expected or anomalous, or whether they explain WG's lower bandwidth.
WG is configured with defaults following egc's helpful set-up guides, as follows:
For both peers:
CVE-2019-14899 Mitigation: Disable
NAT via Tunnel: Disable
Listen Port: 51820
MTU: 1440
Local Public Key: XXXXX
DNS Servers via Tunnel:
Firewall Inbound:
Kill Switch:
Local Private Key: XXXXX
Route up Script:
Route down Script:
Firewall Mark:
Bypass LAN Same-Origin Policy:
Source Routing: Route All Sources via VPN
Destination Routing: Route All Destinations via Default Route
Failover Member: Disable
Watchdog: Disable
For site #1 peer:
1. Peer 2
Endpoint address: <site #2 FQDN>:51820
Allowed IPs: <site #2 subnet/24>,<tunnel subnet site #2 endpoint/32>
Route Allowed IPs via Tunnel: Enable
Persistent Keepalive: 20
Peer Public Key: XXXX
Use Pre-Shared Key: Enable
Pre-Shared Key: XXXXX
IP Addresses/Netmask (CIDR): <tunnel subnet site #1 endpoint/24>
For site #2 peer:
1. Peer 1
Endpoint address: <site #1 static ip>:51820
Allowed IPs: <site #1 subnet/24>,<tunnel subnet/24>
Route Allowed IPs via Tunnel: Enable
Persistent Keepalive: 20
Peer Public Key: XXXX
Use Pre-Shared Key: Enable
Pre-Shared Key: XXXXX
IP Addresses/Netmask (CIDR): <tunnel subnet site #2 endpoint/24>
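For readers more familiar with stock WireGuard than the DD-WRT GUI, the site #1 settings above map roughly onto a wg-quick-style config like the following (a sketch with placeholders throughout; the file DD-WRT actually generates may differ):

```
# wg0.conf on the site #1 router (sketch; keys, subnets, and FQDN are placeholders)
[Interface]
PrivateKey = <site #1 private key>
Address    = <tunnel subnet site #1 endpoint>/24
ListenPort = 51820
MTU        = 1440

[Peer]
PublicKey           = <site #2 public key>
PresharedKey        = <pre-shared key>
Endpoint            = <site #2 FQDN>:51820
AllowedIPs          = <site #2 subnet>/24, <tunnel subnet site #2 endpoint>/32
PersistentKeepalive = 20
```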
Any suggestions for why the WG bandwidth is less, and not more, than the OVPN bandwidth would be much appreciated.
Neither ISP connection uses PPPoE. Reducing the MTU on both sides made no material difference.
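For context on why 1440 is the default (my arithmetic, assuming an IPv4 outer packet): WireGuard adds 60 bytes of framing per packet, so a 1500-byte WAN MTU leaves exactly 1440 bytes for the tunnel:

```shell
# WireGuard-over-IPv4 per-packet framing:
#   20 (outer IPv4) + 8 (UDP) + 16 (WG message header) + 16 (Poly1305 auth tag) = 60 bytes
awk 'BEGIN{print 1500 - (20 + 8 + 16 + 16)}'   # largest tunnel MTU that fits a 1500-byte WAN MTU
```

So with a 1500-byte WAN MTU the default of 1440 should already avoid fragmentation, which is consistent with lower MTUs making no difference.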
With MTU set at the default 1440, iperf directly between the routers (previous tests were between PCs on the subnets connected to the routers) showed 27.0 Mbps. iperfs with MTUs of 1420 and 1280 were almost the same as with MTU of 1440, 26.9 Mbps. Using tcpdump and Wireshark to see the iperf traffic, with MTU set to 1440, and S1 acting as the iperf client, the packets from S1 to S2 were 1514 bytes with 1440 bytes of data and there were occasional 138 byte packets from S2 to S1 carrying 64 bytes of data. With MTU set to 1280, the S1>S2 packets were 1354 bytes with 1280 bytes of data and the occasional S2>S1 packets were 138 bytes with 64 bytes of data.
I tested the tunnels with ping directly between the routers. First was OVPN. With MTU set to 1500, the maximum size permitted was 1419 bytes and pings at that size or below had an average RTT in the range of 16.5 - 17.5 ms. Next was WG. WG apparently does not include a don't fragment flag because ping worked for all sizes tested, ranging from 512 bytes to 5120 bytes (and Wireshark confirmed the don't fragment flag is not set). The average RTT for each of the sizes tested was about 33.5 ms.
Testing with ping between PCs on the subnets at either end of the tunnel, and using Windows ping's don't fragment flag, showed the maximum size for WG to be 28 bytes less than the tunnel MTU setting, and gave the same RTT as with ping between the routers.
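The 28-byte difference is expected, for what it's worth: it is just the ICMP framing inside the tunnel, a 20-byte inner IPv4 header plus an 8-byte ICMP header, so the largest DF ping payload that fits is MTU minus 28:

```shell
# Largest ICMP payload that fits in the tunnel without fragmentation:
#   tunnel MTU - 20 (inner IPv4 header) - 8 (ICMP header)
awk 'BEGIN{print 1440 - 20 - 8}'   # with the default WG MTU of 1440
```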
In sum, reducing the WG MTU did not improve the WG bandwidth and neither reducing the WG MTU nor capping the packet size below the MTU improved the WG RTT.
Any recommendations as to other adjustments to try or tests to perform?
Joined: 18 Mar 2014 Posts: 12814 Location: Netherlands
Posted: Sun Sep 11, 2022 6:50 Post subject:
One other thing that came to mind is the testing method: if you run iperf3 on the router itself, the results do not reflect the real achievable throughput.
My first post shows the results for iperf running on PCs connected to the routers. My second post shows the results for iperf running directly on the routers. The results were consistent. In both cases, the OVPN RTT was roughly 50% of the WG RTT and the WG bit rate was about 60 - 65% of the OVPN bit rate.
I tried changing the WG listen port to a more common one (e.g., 2001), and the results were the same.
I am out of ideas for further testing/troubleshooting. Any suggestions?
If possible, I would try using the router's WG client w/ a commercial VPN provider just to see if the performance improves significantly when connected to something other than your own WG server. If things do improve, then that would suggest it's something specific to the site-to-site config.