New Kong's Test Build v3.0-r36070M kongac (31/05/2018)

Corsair1274
DD-WRT User


Joined: 08 Jul 2012
Posts: 53

Posted: Sun Jun 10, 2018 21:31
Okay, let me change the settings and enable that, then I will report back the results. Thank you guys for the assistance. I jumped on the internet upgrade without even thinking as I thought these routers were compatible.
Corsair1274
DD-WRT User


Joined: 08 Jul 2012
Posts: 53

Posted: Sun Jun 10, 2018 22:54
I was able to get over 500 Mbps (508 at best) after enabling SFE on all routers.

As for a router upgrade, I definitely want something that I will not have to upgrade again for at least a few years.

I'll research anything else I can do to make improvements. Once again, thanks for the assistance!
RebootsDaMachina
DD-WRT Novice


Joined: 09 Oct 2016
Posts: 1

Posted: Mon Jun 11, 2018 0:32    Post subject: Netgear R6300v2 Router Report - DD-WRT v3.0-r36070M kongac
Router Model: Netgear R6300V2
Firmware Version: DD-WRT v3.0-r36070M kongac (05/31/18)
Kernel Version: Linux 4.4.134 #568 SMP Thu May 31 11:02:32 CEST 2018 armv7l
Status: Came up just fine. OpenVPN (PIA) is fine. Speed is as expected.
Reset: No
Errors: None

Upgraded from: ddup --flash-latest using KiTTY Portable
Router Model: Netgear R6300V2
Firmware Version: DD-WRT v3.0-r35550M kongac (03/28/18)
Kernel Version: Linux 4.4.124 #548 SMP Wed Mar 28 09:52:34 CEST 2018 armv7l
CPU Model: Broadcom BCM4708
CPU Cores: 2
CPU Features: EDSP
CPU Clock: 800 MHz
Load Average: 2% 0.06, 0.05, 0.00
Temperatures: CPU 64.1 °C / WL0 45.1 °C
DHCP Server: Enabled - Running
Samba: Disabled
WRT-radauth: Disabled
WRT-rflow: Disabled
MAC-upd: Disabled
CIFS Automount: Disabled
USB Support: Disabled
WL0 Radio: Radio is On
Mode: AP
Network: Mixed
Channel: 1
TX Power: Auto
Rate: 78 Mbps
Encryption - Interface wl0: Enabled, WPA2 Personal
WL1: Disabled

All is well
LiskoFINAL
DD-WRT Novice


Joined: 03 Jul 2016
Posts: 16

Posted: Mon Jun 11, 2018 9:27
I have an R7000, and with this build the IPsec strongSwan server is broken (again, after waiting a long time for a fix). Clients connect successfully but get no access to LAN or WAN. This time it doesn't seem to be a DNS problem, because it doesn't work even when entering an IP address. It works fine on the previous build, though. I also noticed that sometimes, for example after adding custom port forwards, the firewall rules that accept IPsec connections from outside are discarded, and I have to press Apply Settings on the VPN services page or manually open UDP ports 500/4500.
kiva113
DD-WRT Novice


Joined: 11 Feb 2018
Posts: 1

Posted: Tue Jun 12, 2018 1:33    Post subject: Re: New Kong's Test Build v3.0-r36070M kongac (31/05/2018)
<Kong> wrote:
unixpunk wrote:
Router: Netgear R6400 v1(.0.31)

Firmware: New Kong's Test Build v3.0-r36070M kongac (31/05/2018)

Kernel: Linux 4.4.134 #568 SMP Thu May 31 11:02:32 CEST 2018 armv7l DD-WRT

Status: Freezes and somehow brings down entire network, even across multiple switches, really amazing actually considering my machine saw no extra/out of the ordinary packets in promisc w/wireshark yet the lan light was flashing at seizure-levels...

Reset: Yes

Errors: #1 - After a random amount of time httpd runs out of memory somehow and brings down the entire system and network. This has happened to me consistently on EVERY dd-wrt build I've tried so far. Have serial access, happy to debug/reproduce, etc. Happens with http and/or https enabled, only option I haven't tried yet is neither. Amount of time it takes is always random and different.

[Edit, added] #2 - Router always attempts tftp boot (boot wait) even if option is disabled, this adds 30 seconds to boot up time, easy.


******Serial output from boot
CFE for Foxconn Router R6400 version: v1.0.31
Build Date: Tue Apr 14 17:28:19 CST 2015
Init Arena
Init Devs.
Boot up from NAND flash...
Bootcode Boot partition size = 524288(0x80000)
DDR Clock: 533 MHz
Info: DDR frequency set from clkfreq=800,*533*
et0: Broadcom BCM47XX 10/100/1000 Mbps Ethernet Controller 6.37.15.1 (r407936)
et1: Broadcom BCM47XX 10/100/1000 Mbps Ethernet Controller 6.37.15.1 (r407936)
CPU type 0x0: 800MHz
Tot mem: 262144 KBytes


****************backtrace from serial console (repeats)
[ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
[ 578] 0 578 188 93 4 0 0 0 hotplug2
[ 583] 0 583 201 111 3 0 0 0 mstpd
[ 853] 0 853 228 106 3 0 0 0 dropbear
[ 1094] 0 1094 320 110 3 0 0 0 inadyn
[ 1121] 0 1121 376 144 4 0 0 0 process_monitor
[ 1165] 0 1165 313 110 3 0 0 0 resetbutton
[ 1336] 0 1336 245 161 3 0 0 0 dropbear
[ 1337] 0 1337 298 135 3 0 0 0 sh
[ 6882] 0 6882 182 102 4 0 0 0 cron
[ 6964] 0 6964 360 165 3 0 0 0 wland
[ 7397] 0 7397 435 168 4 0 0 0 dnsmasq
[ 7398] 0 7398 435 127 3 0 0 0 dnsmasq
[ 7446] 0 7446 819 127 4 0 0 0 httpd
[ 7452] 0 7452 1074 789 5 0 0 0 httpd
[ 7571] 0 7571 594 210 4 0 0 0 radio_timer
[ 7581] 0 7581 374 154 4 0 0 0 nas
[ 7740] 0 7740 302 187 3 0 0 0 top
[23134] 0 23134 298 217 3 0 0 0 sh
Out of memory: Kill process 7452 (httpd) score 12 or sacrifice child
Killed process 23134 (sh) total-vm:1192kB, anon-rss:32kB, file-rss:836kB
httpd: page allocation failure: order:6, mode:0x2284020
CPU: 0 PID: 7452 Comm: httpd Tainted: P 4.4.134 #568
Hardware name: Northstar Prototype
Backtrace:
[<8001bae8>] (dump_backtrace) from [<8001bd70>] (show_stack+0x18/0x1c)
r7:00000006 r6:60000093 r5:00000000 r4:8051fa60
[<8001bd58>] (show_stack) from [<80169d9c>] (dump_stack+0x8c/0xa0)
[<80169d10>] (dump_stack) from [<800a0d78>] (warn_alloc_failed+0x110/0x120)
r7:00000006 r6:00000000 r5:00000006 r4:02284020
[<800a0c6c>] (warn_alloc_failed) from [<800a2cd4>] (__alloc_pages_nodemask+0x6ac/0x724)
r3:00000000 r2:00000000
r6:00000000 r5:00000000 r4:00000030
[<800a2628>] (__alloc_pages_nodemask) from [<800c8a74>] (cache_alloc_refill+0x2a8/0x4f0)
r10:804fe378 r9:00000000 r8:00000000 r7:878006c0 r6:00000000 r5:878013c0
r4:02080020
[<800c87cc>] (cache_alloc_refill) from [<800c8f7c>] (__kmalloc+0x84/0xe0)
r10:00000000 r9:00020000 r8:76b6b010 r7:87bc5f78 r6:a0000013 r5:02080020
r4:878006c0
[<800c8ef8>] (__kmalloc) from [<8002c76c>] (MMALLOC+0x1c/0x30)
r7:87bc5f78 r6:76b6b010 r5:8002d270 r4:00020001
[<8002c750>] (MMALLOC) from [<8002d298>] (dev_nvram_read+0x28/0x19c)
r5:8002d270 r4:00020000
[<8002d270>] (dev_nvram_read) from [<800cc31c>] (__vfs_read+0x34/0xcc)
r8:76b6b010 r7:87bc5f78 r6:00020000 r5:8002d270 r4:87126400
[<800cc2e8>] (__vfs_read) from [<800ccb08>] (vfs_read+0x80/0x100)
r9:00020000 r8:76b6b010 r7:00020000 r6:87bc5f78 r5:76b6b010 r4:87126400
[<800cca88>] (vfs_read) from [<800cd35c>] (SyS_read+0x44/0x84)
r9:00020000 r8:76b6b010 r7:00000000 r6:00000000 r5:87126400 r4:87126400
[<800cd318>] (SyS_read) from [<80009540>] (ret_fast_syscall+0x0/0x40)
r9:87bc4000 r8:80009704 r7:00000003 r6:06aae756 r5:00000000 r4:00000000
Mem-Info:
active_anon:531 inactive_anon:0 isolated_anon:0
active_file:927 inactive_file:353 isolated_file:0
unevictable:34 dirty:0 writeback:0 unstable:0
slab_reclaimable:161 slab_unreclaimable:1844
mapped:948 shmem:0 pagetables:67 bounce:0
free:31271 free_pcp:122 free_cma:0
Normal free:1464kB min:1400kB low:1748kB high:2100kB active_anon:360kB inactive_anon:0kB active_filo
lowmem_reserve[]: 0 1024 1024
HighMem free:123620kB min:128kB low:500kB high:872kB active_anon:1764kB inactive_anon:0kB active_fio
lowmem_reserve[]: 0 0 0
Normal: 53*4kB (UE) 77*8kB (UE) 41*16kB (UME) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048B
HighMem: 22*4kB (UM) 22*8kB (UM) 14*16kB (UM) 30*32kB (UM) 17*64kB (UM) 4*128kB (UM) 1*256kB (M) 3*B
1317 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
65536 pages RAM
32768 pages HighMem/MovableOnly
1924 pages reserved
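
Editor's note on reading the dump above: the failing request is "order:6", which means 2^6 = 64 physically contiguous pages. Assuming 4 kB pages (the log's "65536 pages RAM" against "Tot mem: 262144 KBytes" suggests this), that is a single 256 kB block. Kernel kmalloc requests like the one in the backtrace are normally served from the Normal zone, and the Normal free list shows nothing larger than 16 kB, so the request can fail even though HighMem still has over 120 MB free. The sketch below only redoes that arithmetic on the values printed in the log. It is not a diagnosis of the root cause, and the last free-list entry is left out because the serial capture cut it off.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages, inferred from 65536 pages = 262144 kB

# Normal-zone free list copied from the dump above. The final entry is
# truncated in the serial capture, so it is deliberately omitted.
NORMAL_ZONE = ("53*4kB (UE) 77*8kB (UE) 41*16kB (UME) 0*32kB 0*64kB "
               "0*128kB 0*256kB 0*512kB 0*1024kB")


def parse_free_list(line):
    """Return {block_size_kB: free_block_count} from a buddy free-list line."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


def can_satisfy(free_list, order):
    """True if some free block is big enough for a 2**order-page request."""
    needed_kb = PAGE_KB * (2 ** order)
    return any(count > 0 and size >= needed_kb
               for size, count in free_list.items())


free_list = parse_free_list(NORMAL_ZONE)
total_free_kb = sum(size * count for size, count in free_list.items())
largest_kb = max((s for s, c in free_list.items() if c > 0), default=0)

print("free in listed blocks:", total_free_kb, "kB")
print("largest free block:", largest_kb, "kB")
print("order-6 request needs:", PAGE_KB * 2 ** 6, "kB")
print("order-6 satisfiable:", can_satisfy(free_list, 6))
```

In other words, the zone has free memory but only in small fragments, which fits an "out of memory" report on a box that top shows as half empty.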


*********top from when the crash happened
Mem: 127116K used, 127332K free, 0K shrd, 48K buff, 3808K cached
CPU: 1.4% usr 48.8% sys 0.0% nic 49.7% idle 0.0% io 0.0% irq 0.0% sirq
Load average: 0.19 0.12 0.04 4/72 23154
PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND
6882 1 root S 728 0.2 0 9.4 cron
197 2 root SW 0 0.0 1 6.1 [kswapd0]
7740 1337 root R 1208 0.4 1 2.8 top -d1
1097 2 root SW 0 0.0 0 1.4 [kworker/0:2]
152 2 root SW 0 0.0 1 0.9 [kworker/1:1]
1336 853 root S 980 0.3 0 0.4 dropbear -b /tmp/loginprompt -r /tmp/root/.ssh/ssh_hos
7 2 root SW 0 0.0 1 0.4 [rcu_sched]
7446 1 root S 3276 1.2 0 0.0 httpd -p 80
23151 1 root S 1504 0.5 0 0.0 /sbin/check_ps
1121 1 root S 1504 0.5 1 0.0 process_monitor
7581 1 root S 1496 0.5 1 0.0 nas -P /tmp/nas.wl0lan.pid -H 34954 -l br0 -i eth1 -A
6964 1 root S 1440 0.5 1 0.0 wland
1094 1 root S 1280 0.5 0 0.0 inadyn -u <REMOVED> -p <REMMOVED>
1165 1 root S 1252 0.4 0 0.0 resetbutton
23154 23151 root R 1192 0.4 0 0.0 [sh]
1337 1336 root S 1192 0.4 1 0.0 -sh
1 0 root S 1040 0.4 1 0.0 /sbin/init
853 1 root S 912 0.3 1 0.0 dropbear -b /tmp/loginprompt -r /tmp/root/.ssh/ssh_hos
583 1 root S 804 0.3 0 0.0 /sbin/mstpd
578 1 root S 752 0.2 1 0.0 /sbin/hotplug2 --set-rules-file /etc/hotplug2.rules --
23149 6882 root S 728 0.2 0 0.0 {cron} CRON
3 2 root RW 0 0.0 0 0.0 [ksoftirqd/0]
11 2 root SW 0 0.0 1 0.0 [ksoftirqd/1]
Killed 2 root SW 0 0.0 1 0.0 [migration/1]


This is not an httpd problem; this is a nas (network authentication daemon) problem. I have seen this behavior once on my own router: nas will broadcast like crazy, which can also bring down other devices with bridges.
The httpd memory printout is most likely just a result of the overload that these broadcasts cause.

I have no idea how this is triggered, as I was never able to reproduce it. Just turning the router off and on fixed it, and the same config has been running for months without the issue.

I recommend you fully clear the config with nvram erase, if you did not do so before configuring the router.


I had been struggling for weeks to figure out a problem almost identical to this on my R8500.

It started back in March (not sure if it was a firmware change) and I can get it to trigger consistently within 4-24 hours. Once it starts, it seems to trigger more frequently. I also occasionally noticed a high-pitched noise from the R8500 after it had crashed and rebooted on its own (with no power cycle).

Rebuilt it from scratch after erasing the nvram, shut off all unnecessary services, and the OOM issue continued.

I noticed a flood of DHCP discover/offers, sometimes within a second of the OOM getting triggered. These came from my Logitech Harmony Hub device primarily.
After setting a static entry in the DNS table that was outside of my normal DHCP range with a lease time, the DHCP floods went away.

However the OOM crashes continued.

After reading Kong's comment about the NAS daemon, I focused on the wireless network.

I had an AP/bridge (different SSID, using wl2 as the backhaul). I removed that from the network and it made no difference.

Finally, I moved my Harmony Hub (and Nest) to the wireless network on the other side of my bridge, and it has been stable ever since (around 32 hours so far).

I found an old note in a fork of Merlin 374.43 that said:
"fixed NETWORKMAP hang with attached Logitech Harmony Hub" https://www.snbforums.com/threads/fork-asuswrt-merlin-374-43-lts-releases-v32e4.18914/
Not sure if it is related... I may just have a Logitech hardware issue.

My suggestion: check whether a specific wireless device is causing the issues.
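
Editor's note: one way to spot a device that floods DHCP discovers, as described above, is to count DHCPDISCOVER lines per MAC address in the router's syslog. The sketch below assumes the log lines follow dnsmasq's usual style, with the message type followed by the interface and the client MAC. The sample lines and MAC addresses are made up for illustration and are not taken from this thread.

```python
import re
from collections import Counter

# Matches a colon-separated MAC address anywhere in a log line.
MAC_RE = re.compile(r"(?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}")


def count_discovers(lines):
    """Count DHCPDISCOVER messages per client MAC (lower-cased)."""
    counts = Counter()
    for line in lines:
        if "DHCPDISCOVER" not in line:
            continue
        match = MAC_RE.search(line)
        if match:
            counts[match.group(0).lower()] += 1
    return counts


# Hypothetical sample data; real lines would come from the router's syslog.
SAMPLE = [
    "dnsmasq-dhcp[7397]: DHCPDISCOVER(br0) 00:11:22:33:44:55",
    "dnsmasq-dhcp[7397]: DHCPOFFER(br0) 192.168.1.50 00:11:22:33:44:55",
    "dnsmasq-dhcp[7397]: DHCPDISCOVER(br0) 00:11:22:33:44:55",
    "dnsmasq-dhcp[7397]: DHCPDISCOVER(br0) 66:77:88:99:AA:BB",
    "dnsmasq-dhcp[7397]: DHCPDISCOVER(br0) 00:11:22:33:44:55",
]

counts = count_discovers(SAMPLE)
for mac, n in counts.most_common():
    print(mac, n)
```

A client that sits far above the rest in this count is a candidate for moving to another AP or giving a static lease, which is what was tried above.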
unixpunk
DD-WRT Novice


Joined: 07 Jun 2018
Posts: 13

Posted: Tue Jun 12, 2018 2:47    Post subject: Re: New Kong's Test Build v3.0-r36070M kongac (31/05/2018)
Thanks for the info! I mostly have my wifi scheduled off (using cron and 'wl radio off') and I get the crash even then, maybe because nas is still running at that point. Interesting find on the Logitech. I don't have that device, but I will research what network protocols/apps it supports (e.g., Avahi) and see whether I have a comparable device here.

I've also tried several different things short of disabling all wifi altogether: 1-hour DHCP leases, turning off various services, etc. I'll see tonight; I expect it to barf again.
Fobio
DD-WRT Novice


Joined: 19 Sep 2012
Posts: 45

Posted: Tue Jun 12, 2018 10:17
Router: D-Link DIR-890L
Firmware: DD-WRT v3.0-r36070M kongac (05/31/2018)

WDS is not working. I had to revert to r35550.
unixpunk
DD-WRT Novice


Joined: 07 Jun 2018
Posts: 13

Posted: Tue Jun 12, 2018 14:26    Post subject: Re: New Kong's Test Build v3.0-r36070M kongac (31/05/2018)
Update: it seems this currently only happens when the radio is off via the 'wl radio off' command. I only use 2.4 GHz (WPA2 Personal, AES), so 5 GHz is disabled in the UI. I use cron to turn wifi on and off because I need different schedules for different days, and last I checked the radio scheduler doesn't offer that.

So if anyone is willing to test, run 'wl radio off', wait up to 2 hours or so, and see if things are hosed.

I think I might be able to find a workaround where I kill the nas process when I turn off the radio and restart it when turning the radio back on. Or maybe there's a better way to turn off the wifi that also kills nas automatically?
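
Editor's note: the workaround idea above (stop nas whenever the radio goes off, start it again when the radio comes back on) could be expressed as per-day cron entries. The helper below is a rough sketch that only generates the lines. The times and day groups are hypothetical, and three things are assumptions to verify on your own build: that the cron field expects a user column ("root"), that 'killall nas' is an acceptable way to stop the daemon, and that 'startservice nas' brings it back with the right arguments. Only 'wl radio off' comes from the post itself.

```python
# Standard cron day-of-week numbers (0 = Sunday).
DAY_NUMBERS = {"sun": 0, "mon": 1, "tue": 2, "wed": 3,
               "thu": 4, "fri": 5, "sat": 6}

# Assumed commands, not confirmed in this thread; check them on your build.
RADIO_OFF = "wl radio off; killall nas"
RADIO_ON = "wl radio on; startservice nas"


def cron_line(time_hhmm, days, command, user="root"):
    """Build one cron entry: minute hour * * days user command."""
    hour_text, minute_text = time_hhmm.split(":")
    hour, minute = int(hour_text), int(minute_text)
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError(f"invalid time: {time_hhmm}")
    day_field = ",".join(str(DAY_NUMBERS[day]) for day in days)
    return f"{minute} {hour} * * {day_field} {user} {command}"


WEEKDAYS = ["mon", "tue", "wed", "thu", "fri"]
WEEKEND = ["sat", "sun"]

# Hypothetical schedule: different off/on times on weekdays and weekends.
SCHEDULE = [
    ("23:00", WEEKDAYS, RADIO_OFF),
    ("06:00", WEEKDAYS, RADIO_ON),
    ("01:00", WEEKEND, RADIO_OFF),
    ("08:00", WEEKEND, RADIO_ON),
]

for time_hhmm, days, command in SCHEDULE:
    print(cron_line(time_hhmm, days, command))
```

Keeping the radio command and the nas command in the same entry means they cannot drift apart if a schedule is edited later.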
realbbb
DD-WRT Novice


Joined: 25 May 2014
Posts: 36

Posted: Tue Jun 12, 2018 22:35
Router Model: Asus RT-AC5300
Firmware Version: DD-WRT v3.0-r36070M kongac (05/31/2018)
Kernel Version: Linux 4.4.134 #568 SMP Thu May 31 11:02:32 CEST 2018 armv7l
Reset: Yes
Status: Lasted about two days before radio issues. Back to DD-WRT v3.0-r35030M kongac (02/19/18). For my setup, no other recent versions have worked as well.

wl0 - 2.4ghz - ap
wl1 - 5ghz - ap
wl2 - 5ghz - client to hotspot

Kong and BrainSlayer, thank you!

UPDATE - Gave it another go. Saw too much packet loss after a couple of days last time. This time with Bluetooth coexistence enabled. Seems to be working. Will continue to test.
UPDATE#2 - Have gone back to 2/19/2018.


­¡BBB!
Health is Wealth

_________________
Asus RT-AC5300, Netgear R8500, Netgear R8000, Netgear R6100, Linksys E2500, Linksys E2000, Netgear R7000, Netgear WNDR4500, Netgear WNDR3800, Linksys WRT54Gv4, Linksys WRT54Gv1.1


Last edited by realbbb on Thu Jul 19, 2018 17:56; edited 3 times in total
unixpunk
DD-WRT Novice


Joined: 07 Jun 2018
Posts: 13

PostPosted: Wed Jun 13, 2018 3:16    Post subject: Re: New Kong's Test Build v3.0-r36070M kongac (31/05/2018) Reply with quote
unixpunk wrote:
kiva113 wrote:
unixpunk wrote:
<Kong> wrote:
unixpunk wrote:
Router: Netgear R6400 v1(.0.31)

Firmware: New Kong's Test Build v3.0-r36070M kongac (31/05/2018)

Kernel: Linux 4.4.134 #568 SMP Thu May 31 11:02:32 CEST 2018 armv7l DD-WRT

Status: Freezes and somehow brings down entire network, even across multiple switches, really amazing actually considering my machine saw no extra/out of the ordinary packets in promisc w/wireshark yet the lan light was flashing at seizure-levels...

Reset: Yes

Errors: #1 - After a random amount of time httpd runs out of memory somehow and brings down the entire system and network. This has happened to me consistently on EVERY dd-wrt build I've tried so far. Have serial access, happy to debug/reproduce, etc. Happens with http and/or https enabled, only option I haven't tried yet is neither. Amount of time it takes is always random and different.

[Edit, added] #2 - Router always attempts tftp boot (boot wait) even if option is disabled, this adds 30 seconds to boot up time, easy.


******Serial output from boot
CFE for Foxconn Router R6400 version: v1.0.31
Build Date: Tue Apr 14 17:28:19 CST 2015
Init Arena
Init Devs.
Boot up from NAND flash...
Bootcode Boot partition size = 524288(0x80000)
DDR Clock: 533 MHz
Info: DDR frequency set from clkfreq=800,*533*
et0: Broadcom BCM47XX 10/100/1000 Mbps Ethernet Controller 6.37.15.1 (r407936)
et1: Broadcom BCM47XX 10/100/1000 Mbps Ethernet Controller 6.37.15.1 (r407936)
CPU type 0x0: 800MHz
Tot mem: 262144 KBytes


****************backtrace from serial console (repeats)
[ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
[ 578] 0 578 188 93 4 0 0 0 hotplug2
[ 583] 0 583 201 111 3 0 0 0 mstpd
[ 853] 0 853 228 106 3 0 0 0 dropbear
[ 1094] 0 1094 320 110 3 0 0 0 inadyn
[ 1121] 0 1121 376 144 4 0 0 0 process_monitor
[ 1165] 0 1165 313 110 3 0 0 0 resetbutton
[ 1336] 0 1336 245 161 3 0 0 0 dropbear
[ 1337] 0 1337 298 135 3 0 0 0 sh
[ 6882] 0 6882 182 102 4 0 0 0 cron
[ 6964] 0 6964 360 165 3 0 0 0 wland
[ 7397] 0 7397 435 168 4 0 0 0 dnsmasq
[ 7398] 0 7398 435 127 3 0 0 0 dnsmasq
[ 7446] 0 7446 819 127 4 0 0 0 httpd
[ 7452] 0 7452 1074 789 5 0 0 0 httpd
[ 7571] 0 7571 594 210 4 0 0 0 radio_timer
[ 7581] 0 7581 374 154 4 0 0 0 nas
[ 7740] 0 7740 302 187 3 0 0 0 top
[23134] 0 23134 298 217 3 0 0 0 sh
Out of memory: Kill process 7452 (httpd) score 12 or sacrifice child
Killed process 23134 (sh) total-vm:1192kB, anon-rss:32kB, file-rss:836kB
httpd: page allocation failure: order:6, mode:0x2284020
CPU: 0 PID: 7452 Comm: httpd Tainted: P 4.4.134 #568
Hardware name: Northstar Prototype
Backtrace:
[<8001bae8>] (dump_backtrace) from [<8001bd70>] (show_stack+0x18/0x1c)
r7:00000006 r6:60000093 r5:00000000 r4:8051fa60
[<8001bd58>] (show_stack) from [<80169d9c>] (dump_stack+0x8c/0xa0)
[<80169d10>] (dump_stack) from [<800a0d78>] (warn_alloc_failed+0x110/0x120)
r7:00000006 r6:00000000 r5:00000006 r4:02284020
[<800a0c6c>] (warn_alloc_failed) from [<800a2cd4>] (__alloc_pages_nodemask+0x6ac/0x724)
r3:00000000 r2:00000000
r6:00000000 r5:00000000 r4:00000030
[<800a2628>] (__alloc_pages_nodemask) from [<800c8a74>] (cache_alloc_refill+0x2a8/0x4f0)
r10:804fe378 r9:00000000 r8:00000000 r7:878006c0 r6:00000000 r5:878013c0
r4:02080020
[<800c87cc>] (cache_alloc_refill) from [<800c8f7c>] (__kmalloc+0x84/0xe0)
r10:00000000 r9:00020000 r8:76b6b010 r7:87bc5f78 r6:a0000013 r5:02080020
r4:878006c0
[<800c8ef8>] (__kmalloc) from [<8002c76c>] (MMALLOC+0x1c/0x30)
r7:87bc5f78 r6:76b6b010 r5:8002d270 r4:00020001
[<8002c750>] (MMALLOC) from [<8002d298>] (dev_nvram_read+0x28/0x19c)
r5:8002d270 r4:00020000
[<8002d270>] (dev_nvram_read) from [<800cc31c>] (__vfs_read+0x34/0xcc)
r8:76b6b010 r7:87bc5f78 r6:00020000 r5:8002d270 r4:87126400
[<800cc2e8>] (__vfs_read) from [<800ccb08>] (vfs_read+0x80/0x100)
r9:00020000 r8:76b6b010 r7:00020000 r6:87bc5f78 r5:76b6b010 r4:87126400
[<800cca88>] (vfs_read) from [<800cd35c>] (SyS_read+0x44/0x84)
r9:00020000 r8:76b6b010 r7:00000000 r6:00000000 r5:87126400 r4:87126400
[<800cd318>] (SyS_read) from [<80009540>] (ret_fast_syscall+0x0/0x40)
r9:87bc4000 r8:80009704 r7:00000003 r6:06aae756 r5:00000000 r4:00000000
Mem-Info:
active_anon:531 inactive_anon:0 isolated_anon:0
active_file:927 inactive_file:353 isolated_file:0
unevictable:34 dirty:0 writeback:0 unstable:0
slab_reclaimable:161 slab_unreclaimable:1844
mapped:948 shmem:0 pagetables:67 bounce:0
free:31271 free_pcp:122 free_cma:0
Normal free:1464kB min:1400kB low:1748kB high:2100kB active_anon:360kB inactive_anon:0kB active_filo
lowmem_reserve[]: 0 1024 1024
HighMem free:123620kB min:128kB low:500kB high:872kB active_anon:1764kB inactive_anon:0kB active_fio
lowmem_reserve[]: 0 0 0
Normal: 53*4kB (UE) 77*8kB (UE) 41*16kB (UME) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048B
HighMem: 22*4kB (UM) 22*8kB (UM) 14*16kB (UM) 30*32kB (UM) 17*64kB (UM) 4*128kB (UM) 1*256kB (M) 3*B
1317 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
65536 pages RAM
32768 pages HighMem/MovableOnly
1924 pages reserved
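
Side note on reading the dump above (a rough sketch, not a diagnosis): order:6 means the kernel wanted 2^6 = 64 contiguous pages. With 4 kB pages (65536 pages RAM matches the 262144 KBytes reported at boot), that is a single 256 kB block. The Normal zone line shows nothing free above 16 kB, and a kmalloc like the one in the backtrace cannot be served from HighMem, so the request can fail even though top reports roughly 127 MB free. The helper names below (order_kb, largest_free_kb) are made up for illustration.

```shell
#!/bin/sh
# Size in kB of an order-N page allocation, assuming 4 kB pages.
order_kb() {
    echo $(( (1 << $1) * 4 ))
}

# Largest block size (kB) with a non-zero count in a buddy-allocator line
# such as "53*4kB (UE) 77*8kB (UE) 41*16kB (UME) 0*32kB".
largest_free_kb() {
    echo "$1" | tr ' ' '\n' | awk -F'[*k]' '
        /^[0-9]+\*[0-9]+kB$/ { if ($1 > 0 && $2 > max) max = $2 }
        END { print max + 0 }'
}

# Normal zone entries copied from the dump (truncated tail left out).
normal='53*4kB (UE) 77*8kB (UE) 41*16kB (UME) 0*32kB 0*64kB 0*128kB 0*256kB'

echo "order:6 needs $(order_kb 6) kB contiguous"
echo "largest free Normal block: $(largest_free_kb "$normal") kB"
```

Read this way, the failure looks like low-memory fragmentation in the Normal zone rather than the router being out of RAM overall, which would fit with it being a symptom of something else hammering the system.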


*********top from when the crash happened
Mem: 127116K used, 127332K free, 0K shrd, 48K buff, 3808K cached
CPU: 1.4% usr 48.8% sys 0.0% nic 49.7% idle 0.0% io 0.0% irq 0.0% sirq
Load average: 0.19 0.12 0.04 4/72 23154
PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND
6882 1 root S 728 0.2 0 9.4 cron
197 2 root SW 0 0.0 1 6.1 [kswapd0]
7740 1337 root R 1208 0.4 1 2.8 top -d1
1097 2 root SW 0 0.0 0 1.4 [kworker/0:2]
152 2 root SW 0 0.0 1 0.9 [kworker/1:1]
1336 853 root S 980 0.3 0 0.4 dropbear -b /tmp/loginprompt -r /tmp/root/.ssh/ssh_hos
7 2 root SW 0 0.0 1 0.4 [rcu_sched]
7446 1 root S 3276 1.2 0 0.0 httpd -p 80
23151 1 root S 1504 0.5 0 0.0 /sbin/check_ps
1121 1 root S 1504 0.5 1 0.0 process_monitor
7581 1 root S 1496 0.5 1 0.0 nas -P /tmp/nas.wl0lan.pid -H 34954 -l br0 -i eth1 -A
6964 1 root S 1440 0.5 1 0.0 wland
1094 1 root S 1280 0.5 0 0.0 inadyn -u <REMOVED> -p <REMMOVED>
1165 1 root S 1252 0.4 0 0.0 resetbutton
23154 23151 root R 1192 0.4 0 0.0 [sh]
1337 1336 root S 1192 0.4 1 0.0 -sh
1 0 root S 1040 0.4 1 0.0 /sbin/init
853 1 root S 912 0.3 1 0.0 dropbear -b /tmp/loginprompt -r /tmp/root/.ssh/ssh_hos
583 1 root S 804 0.3 0 0.0 /sbin/mstpd
578 1 root S 752 0.2 1 0.0 /sbin/hotplug2 --set-rules-file /etc/hotplug2.rules --
23149 6882 root S 728 0.2 0 0.0 {cron} CRON
3 2 root RW 0 0.0 0 0.0 [ksoftirqd/0]
11 2 root SW 0 0.0 1 0.0 [ksoftirqd/1]
Killed 2 root SW 0 0.0 1 0.0 [migration/1]


This is not an httpd problem, this is a nas (network authentication daemon) problem. I have seen this behavior once on my own router: nas will broadcast like crazy, which can also bring down other devices with bridges.
The httpd memory printout is most likely just a result of the overload that these broadcasts cause.

I have no idea how this is triggered as I was never able to reproduce it; just turning the router off and on fixed it, and the same config has been running for months without the issue.

I recommend you fully clear the config with nvram erase, if not done before configuring the router.


I had been struggling for weeks to figure out a problem almost identical to this on my R8500.

It started back in March (not sure if it was a firmware change) and I can get it to trigger consistently within 4-24 hours. Once it starts, it seems to trigger more frequently. I also occasionally noticed a high-pitched noise from the R8500 after it had crashed and rebooted on its own (with no power cycle).

Rebuilt it from scratch after erasing the nvram, shut off all unnecessary services, and the OOM issue continued.

I noticed a flood of DHCP discover/offers, sometimes within a second of the OOM getting triggered. These came from my Logitech Harmony Hub device primarily.
After setting a static entry in the DNS table that was outside of my normal DHCP range with a lease time, the DHCP floods went away.

However the OOM crashes continued.

After reading Kong's comment about the NAS daemon, I focused on the wireless network.

I had an AP/Bridge (different ssid using wl2 as the backhaul); I removed that from the network and it made no difference.

Finally, I moved my Harmony hub (& Nest) to the wireless network on the other side of my bridge and it has been stable ever since (so far around 32 hours)

I found an old note in a fork of Merlin 374.43 that said:
"fixed NETWORKMAP hang with attached Logitech Harmony Hub" https://www.snbforums.com/threads/fork-asuswrt-merlin-374-43-lts-releases-v32e4.18914/
Not sure if it is related... I just may have a Logitech hardware issue.

My suggestion, look to see if you have a specific wireless device causing the issues.


Thanks for the info! I mostly have my wifi scheduled off (using cron and 'wl radio off') and I get the crash even then. Maybe because nas is still running at that point. Interesting find on the Logitech. I don't have that device, but I will research which network protocols/apps it supports and maybe find some comparable device here, e.g., something using Avahi.

I've also tried several different things short of disabling all wifi altogether. 1 hour dhcp leases, turning off various services, etc. I'll see tonight, I expect it to barf again.


Update, it seems that this is really only currently happening when the radio is off via 'wl radio off' command. I only use 2.4ghz (WPA2 Personal,AES), so 5ghz is disabled in the UI. I use cron to schedule turning wifi on and off because I need different schedules for different days and last I checked the radio scheduler doesn't offer that.

So if anyone is willing to test, run the command 'wl radio off' and wait up to 2 hours or so and see if things are hosed up.

I think I might be able to find a workaround where I just kill the nas process when I turn off the radio and then restart it when turning the radio back on. Or maybe there's a better way to turn off the wifi which kills nas automatically as well?


Here is my attempted workaround. I'll know by morning if it helped or not. This is a first attempt; there is probably a more elegant way I have yet to discover. Instead of just wl radio on or wl radio off:

/usr/sbin/wl radio off && ps | grep nas | grep -v grep | awk '{ $1=$2=$3=$4=""; print $0 }' | sed 's/ //' >/tmp/nas.save && chmod 600 /tmp/nas.save && kill `cat /tmp/nas.wl*lan.pid`

/usr/sbin/wl radio on && sh /tmp/nas.save && rm -f /tmp/nas.save

Thanks all for the eyes on this!
unixpunk
DD-WRT Novice


Joined: 07 Jun 2018
Posts: 13

PostPosted: Wed Jun 13, 2018 3:40    Post subject: Re: New Kong's Test Build v3.0-r36070M kongac (31/05/2018) Reply with quote
I checked /tmp/cron.d/cron_jobs and it looks like the web UI cron field can't properly parse commands containing single quotes: it replaces them with backslashes, so the workaround won't work from there as written, and escaping with \ doesn't help either. I ended up putting the crontab into a file and then using nvram set cron_jobs="`cat /tmp/crontabs`" [edit added] and updating /tmp/cron.d/cron_jobs. We'll see in the morning.
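
For anyone hitting the same quoting problem, here is a minimal sketch of the file-based approach, assuming a busybox-style sh. load_cron_jobs and the NVRAM override are hypothetical names used only for illustration; nvram set and the cron_jobs variable are the ones mentioned above.

```shell
#!/bin/sh
# Command used to talk to nvram; override for a dry run, e.g. NVRAM=echo.
NVRAM="${NVRAM:-nvram}"

# Hypothetical helper: load a crontab file into the cron_jobs nvram variable.
load_cron_jobs() {
    file="$1"
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    # Double quotes around $(...) keep newlines and single quotes
    # from the file intact, so nothing has to pass through the web UI.
    "$NVRAM" set cron_jobs="$(cat "$file")"
}
```

On the router this would be called as load_cron_jobs /tmp/crontabs. As far as I know, nvram commit is still needed afterwards if the setting should survive a reboot.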
limerick_fr
DD-WRT User


Joined: 07 Nov 2012
Posts: 111

PostPosted: Wed Jun 13, 2018 6:35    Post subject: Reply with quote
12 days running on my R8000 with no Extender (EX2700) issue which usually arises after 7-8 days.

So I will tick this one as a one to keep as a backup for next upgrades;)
unixpunk
DD-WRT Novice


Joined: 07 Jun 2018
Posts: 13

PostPosted: Wed Jun 13, 2018 13:50    Post subject: Re: New Kong's Test Build v3.0-r36070M kongac (31/05/2018) Reply with quote
No change here even with nas process killed...are we sure this is related to nas or am I killing the wrong process here?

Any additional advice/direction is greatly appreciated! Thanks all!
<Kong>
DD-WRT Guru


Joined: 15 Dec 2010
Posts: 4339
Location: Germany

PostPosted: Wed Jun 13, 2018 14:04    Post subject: Reply with quote
Do not use wl radio on/off. radio off does not take care of stopping the related services and interfaces; depending on config and usage, this causes memory to be consumed without ever being freed.

To disable the radios you have to use:

First radio:

startservice radio_off_0 -f
startservice radio_off_0 -f

Second radio:

startservice radio_off_1 -f
startservice radio_off_1 -f
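
A sketch of how these commands could be scheduled per day of the week from cron. Assumptions, not confirmed in this thread: that radio_on_0 exists as the counterpart of radio_off_0, that startservice is on cron's PATH, and that cron lines need the root user field. radio_cron_line is a hypothetical helper that only prints the lines; the times are example values.

```shell
#!/bin/sh
# Print one cron line that runs a radio service at a given time.
# $1 = on|off, $2 = radio index, $3 = hour, $4 = minute,
# $5 = days of the week in cron syntax (e.g. 1-5 or 0,6)
radio_cron_line() {
    echo "$4 $3 * * $5 root startservice radio_${1}_${2} -f"
}

# Example schedule for the first radio:
# weekdays off at 23:00 and on at 06:00, weekends off at 01:00 and on at 08:00.
radio_cron_line off 0 23 0 1-5
radio_cron_line on  0 6  0 1-5
radio_cron_line off 0 1  0 0,6
radio_cron_line on  0 8  0 0,6
```

The printed lines can go into the cron jobs field or a crontab file; none of them contain quotes.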

_________________
KONG PB's: http://www.desipro.de/ddwrt/
KONG Info: http://tips.desipro.de/
unixpunk
DD-WRT Novice


Joined: 07 Jun 2018
Posts: 13

PostPosted: Wed Jun 13, 2018 14:22    Post subject: Reply with quote
<Kong> wrote:
Do not use wl radio on/off. radio off does not take care of stopping the related services and interfaces; depending on config and usage, this causes memory to be consumed without ever being freed.

To disable the radios you have to use:

First radio:

startservice radio_off_0 -f
startservice radio_off_0 -f

Second radio:

startservice radio_off_1 -f
startservice radio_off_1 -f


Thanks for the advice! I assume the duplicated startservice line was accidental and that one of them was meant to be stopservice, since there should be no need to run it twice. I will get this tested soon and report back. [edit] Or maybe I'm wrong and I need to replace off with on to do the opposite; I will test either way.

I'll look around for a feature request section to post in, but if radio scheduling supported a grid/matrix with not only hours of the day but also days of the week, this would eliminate my need to use cron for this. :) Thanks for all your work on this! (BrainSlayer too!) I've been running dd-wrt for about a decade on WRT54Gs and am only now upgrading because they can't keep up with my internet speed anymore. :)