Posted: Tue Jan 24, 2012 21:44 Post subject: Temporary connection failures with Asus WL500G + build 14929
Hi all,
My setup: Asus WL500GP v2, DD-WRT build 14929 (v24-sp2 mini), connected to Unitymedia cable Internet via DHCP.
At least every hour (DHCP lease time = 1 hour), and occasionally more often, I get a temporary disconnect and the following messages appear in my /var/log/messages:
Quote:
Jan 24 22:15:37 router user.info syslog: DDNS : inadyn daemon successfully started
Jan 24 22:15:39 router user.emerg kernel: ip_nat_pptp version 1.5 unloaded
Jan 24 22:15:39 router user.emerg kernel: ip_conntrack_pptp version 1.9 unloaded
Jan 24 22:15:39 router user.info syslog: vpn modules : vpn modules successfully unloaded
Jan 24 22:15:39 router user.info syslog: vpn modules : ip_conntrack_proto_gre successfully loaded
Jan 24 22:15:39 router user.info syslog: vpn modules : ip_nat_proto_gre successfully loaded
Jan 24 22:15:39 router user.emerg kernel: ip_conntrack_pptp version 1.9 loaded
Jan 24 22:15:39 router user.info syslog: vpn modules : ip_conntrack_pptp successfully loaded
Jan 24 22:15:39 router user.emerg kernel: ip_nat_pptp version 1.5 loaded
Jan 24 22:15:39 router user.info syslog: vpn modules : ip_nat_pptp successfully loaded
I'm using Smokeping to monitor the connection to the router, the modem, and two Internet hosts. With ICMP pings (1 req/sec), I see ~3% packet loss towards the modem and the two Internet hosts, although there's no other traffic and the router's memory isn't an issue. When I connect the Smokeping box directly to the cable modem (model SA EPC2203), omitting the router, I have *no* packet loss. I tried flashing some more recent builds on the router, but all had the same issue.
I found some potentially related problems in this forum [1][2], and have some questions:
* Why does renewing DHCP information negatively affect my connection?
* Where is the modem log stored that "kill -USR1 $(head -n 1 /var/run/udhcpc.pid)" triggers?
* What may be the reason for the packet loss at the router, given that connecting my Smokeping box directly to the modem works pretty well?
I found out the timeouts occur almost exactly every half an hour, i.e. at half the DHCP lease time, as the following log shows:
Code:
Jan 25 06:29:07 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 06:59:14 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 07:29:21 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 07:59:28 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 08:29:35 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 08:59:42 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 09:29:51 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 09:59:58 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 10:30:09 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 11:00:15 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 11:30:25 router user.info syslog: WAN is up. IP: 95.223.XX.YY
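To check the spacing of these events without reading timestamps by eye, here is a minimal Python sketch that computes the gaps between consecutive "WAN is up" lines. It assumes the syslog format shown above; the year 2012 is taken from the post date because syslog lines carry no year, the function names are made up for illustration, and `%b` parsing assumes an English/C locale.

```python
from datetime import datetime

# Log excerpt copied from the post above.
LOG = """\
Jan 25 06:29:07 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 06:59:14 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 07:29:21 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 07:59:28 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 08:29:35 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 08:59:42 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 09:29:51 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 09:59:58 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 10:30:09 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 11:00:15 router user.info syslog: WAN is up. IP: 95.223.XX.YY
Jan 25 11:30:25 router user.info syslog: WAN is up. IP: 95.223.XX.YY
"""


def wan_up_times(log_text, year=2012):
    """Return the timestamps of all 'WAN is up' lines (year is an assumption)."""
    times = []
    for line in log_text.splitlines():
        if "WAN is up" not in line:
            continue
        # The first three whitespace-separated fields are month, day, time.
        stamp = " ".join(line.split()[:3])
        times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return times


def intervals_seconds(times):
    """Return the gaps between consecutive timestamps, in whole seconds."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    gaps = intervals_seconds(wan_up_times(LOG))
    print(gaps)
    print("mean gap (s):", sum(gaps) / len(gaps))
```

On this excerpt every gap comes out a few seconds over 1800 s, which fits a renewal attempt at half of the one-hour lease followed by a short rediscovery.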
I think I found the issue: when ACKing my client's request, the DHCP server A.A.A.A specifies an invalid DHCP Server Identifier B.B.B.B. Upon lease renewal, the client then asks B.B.B.B for a lease extension, but B.B.B.B refuses (NAK). The client then sends another DISCOVER to find DHCP servers, A.A.A.A responds, and the cycle starts over.
Now comes the nifty difference between my Fedora box and DD-WRT: while Fedora keeps using the previously assigned IP address when DISCOVERing a new one (it still has half of its lease time left!), DD-WRT's DHCP client removes the IP address from the interface as soon as the NAK is received and waits for the DISCOVER to succeed. As this typically takes ~3 seconds, there's no route for outgoing packets during that time. This also explains why *inbound* traffic still makes it into my network.
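The renew, NAK, DISCOVER loop described above can be sketched as a toy simulation. This is not the actual code of either DHCP client: `ToyClient`, `keep_address_on_nak`, and `simulate` are hypothetical names, the 3-second rediscovery time is the rough estimate from this post, and the renewal timer is assumed to be half the lease time (the RFC 2131 default for T1).

```python
from dataclasses import dataclass

LEASE_SECONDS = 3600            # lease time reported in this thread
RENEW_AT = LEASE_SECONDS // 2   # assumed renewal timer T1 = half the lease
REDISCOVER_SECONDS = 3          # "~3 seconds" estimate from this post


@dataclass
class ToyClient:
    # True: keep the old address while rediscovering (Fedora-like behaviour).
    # False: drop the address as soon as the NAK arrives (DD-WRT-like behaviour).
    keep_address_on_nak: bool

    def outage_per_cycle(self) -> int:
        """Seconds without a usable address in one renew -> NAK -> DISCOVER cycle."""
        return 0 if self.keep_address_on_nak else REDISCOVER_SECONDS


def simulate(client, duration_seconds):
    """Return (start, end) windows, in seconds, during which no address is set."""
    outages = []
    t = 0
    while True:
        t += RENEW_AT                  # wait until the renewal timer fires
        if t >= duration_seconds:
            break
        # The renewal goes to the bogus server identifier and is NAKed,
        # so the client has to run a fresh DISCOVER every cycle.
        gap = client.outage_per_cycle()
        if gap:
            outages.append((t, t + gap))
        t += gap                       # the new lease starts after rediscovery
    return outages


if __name__ == "__main__":
    print("drops address:", simulate(ToyClient(keep_address_on_nak=False), 7200))
    print("keeps address:", simulate(ToyClient(keep_address_on_nak=True), 7200))
```

Under these assumptions, the client that drops its address gets a short outage window roughly every 30 minutes, while the client that keeps its address gets none, which is the difference observed between the two boxes.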
Who is able to debug this problem? Can we fix this? Where is the code for the DHCP client? How can I turn on logging?