Posted: Sun Apr 26, 2020 19:01 Post subject: R8000 - Kernel panic: Oops - BUG: 80000007 [#1] SMP ARM
Dear DD-WRT Community,
I've been using DD-WRT for 12 years now, and most of the time it works well (OK, there are bugs, but where aren't there? It's beta). Thanks to the project maintainers and the community!
Since I shifted from KONG builds to BS builds (I'm not trying to offend anybody, just trying to pinpoint the reasons), the Netgear R8000 has not been running as stably as before.
Previous builds:
- a lot...
- dd-wrt.v24-K3_AC_ARM_R8000_2019-07-11_40270M
-> shift to BS builds
- netgear-r8000-webflash_2020-02-14_r42336 (reboots)
- netgear-r8000-webflash_2020-04-06_r42847 (reboots)
-> complete reset - back to stock - back to dd-wrt
- netgear-r8000-webflash_2020-04-20-r42954 (reboots)
I've also experienced the channel selection problems as mentioned here and I applied the suggested workaround:
Hi, I have 2 R8000s and have noticed the same frequent reboots as you ever since I switched from Kong's last build. My routers generally reboot after about 6-12 hours on average. I was trying to figure out why, so I set up syslog to send to the papertrail website and noticed that prior to some reboots I would get an OpenVPN error saying my router could not connect to the VPN (my routers are in different cities and they VPN to each other). It made me think that my router somehow lost its connection to the internet before a reboot.
I have the WDS Watchdog enabled under Administration->Keep Alive and was thinking that my router losing internet connectivity might be causing the WDS Watchdog to force a reboot (despite nothing in the log stating this), so I disabled it this morning and so far my router has been up 11 hours.
Another thing I noticed on both routers is a very high Active IP Connections count on the Status->Router page: around 4,000-5,000 active IP connections all the time, which is a very high number! Does yours have an unusually high Active IP Connections count? I decided to change a few things to try to fix it. I went to Administration->Management and, under IP Filter Settings, set: Westwood, Max Ports 4096, TCP Timeout 600, UDP Timeout 90. Now my active IP connections are down to 50-70 on each router.
Not sure if the changes I made above will fix the problem, but so far my router is running all right.
I don't know how to access the kernel logs. If you tell me, I'll see if mine is having the same errors that yours had. I turned on klogd under Services->Services, but I don't see my kernel log on papertrail so not sure how to access it.
Hope we find a solution to our rebooting routers soon!
buffpatel wrote:
I have the WDS Watchdog enabled under Administration->Keep Alive and was thinking that my router losing internet connectivity might be causing the WDS Watchdog to force a reboot (despite nothing in the log stating this), so I disabled it this morning and so far my router has been up 11 hours.
I'm not using WDS Watchdog.
buffpatel wrote:
Another thing I noticed on both routers is a very high Active IP Connections count on the Status->Router page: around 4,000-5,000 active IP connections all the time, which is a very high number! Does yours have an unusually high Active IP Connections count?
No, the average IP connections count is 100-200.
buffpatel wrote:
I decided to change a few things to try to fix it. I went to Administration->Management and, under IP Filter Settings, set: Westwood, Max Ports 4096, TCP Timeout 600, UDP Timeout 90. Now my active IP connections are down to 50-70 on each router.
Not sure if the changes I made above will fix the problem, but so far my router is running all right.
Sounds good - but doesn't seem to be related to the kernel panic issue of my R8000.
buffpatel wrote:
I don't know how to access the kernel logs. If you tell me, I'll see if mine is having the same errors that yours had. I turned on klogd under Services->Services, but I don't see my kernel log on papertrail so not sure how to access it.
That's how I activated the kernel log as well. I'm not using papertrail but an rsyslog server (https://www.rsyslog.com/) on my LAN.
buffpatel wrote:
Hope we find a solution to our rebooting routers soon!
I hope so, too.
The router has currently been running for 1 day and 6 minutes. Let's see if and when it reboots again.
_________________
Qualcom/Atheros:
NETGEAR R7800 DD-WRT v3.0-r53045 std (06/22/23)
Hi, any updates on your R8000 and the reboots? Mine continues to reboot, but not as frequently as before. Previously I'd get 3-4 reboots per day, now my routers will go 1-2 days on average before a reboot which is an improvement but still not quite what I'm wanting.
My router now consistently has 100-200 IP connections whenever I check and the load average (in top right corner of screen) is around 0.01 - 0.4 (which means the CPU isn't being overworked in general)
Are you using Unbound, SmartDNS, DNSCrypt, or DNSSEC? I don't have any specific proof, but I feel like these things may have contributed to some of my reboots. I actually turned all of them off in the past few days. In fact, on the newest build, 4/29/20 r43028, I think Unbound is broken. Whenever I enabled Unbound, my router would no longer be able to access the internet! Might be something to check on your router.
Out of curiosity, how are you keeping track of your reboots? What I did was I added the command:
logger ROUTER REBOOT
into the Administration->Commands Startup box. What it does is it literally puts the words "ROUTER REBOOT" into my syslogs as one of the first things the router does when it starts and now whenever I review my logs on the papertrail website, I just do a search for "reboot" and I can count how many times it has rebooted.
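A minimal sketch of counting that marker offline, assuming the syslog has been exported to plain text, one message per line. The helper name `count_marker` and the sample lines are hypothetical; only the marker text "ROUTER REBOOT" comes from the command above.

```python
def count_marker(lines, marker="ROUTER REBOOT"):
    """Count log lines that contain the marker, ignoring case."""
    needle = marker.lower()
    return sum(1 for line in lines if needle in line.lower())


# Hypothetical sample lines, for illustration only (not taken from a real log).
sample = [
    "May 01 03:00:01 R8000 root: ROUTER REBOOT",
    "May 01 03:00:05 R8000 kernel: some other message",
    "May 02 17:42:10 R8000 root: ROUTER REBOOT",
]
print(count_marker(sample))
```

Matching the full marker rather than just "reboot" avoids counting unrelated messages that happen to mention a reboot.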
I still haven't seen many error messages in my logs. I looked into rsyslog, but it appears to be a Windows product and requires a computer to be on 24/7 to log things. My problem is that I use a laptop and it sleeps at night, so I'm not sure I can monitor the kernel logs that way. My router has a USB drive mounted and shared with Samba on my network - I wonder if I can have the router just copy the log files onto my USB drive? Might look into that now.
buffpatel wrote:
Hi, any updates on your R8000 and the reboots? Mine continues to reboot, but not as frequently as before. Previously I'd get 3-4 reboots per day, now my routers will go 1-2 days on average before a reboot which is an improvement but still not quite what I'm wanting.
Hi, after approximately 4 days (flashed DD-WRT v3.0-r43028 std (04/29/20) - no reset) the R8000 rebooted again. So it is less frequent than before.
buffpatel wrote:
Are you using Unbound, SmartDNS, DNSCrypt, or DNSSEC?
This is the status of the services you mentioned:
- Recursive DNS Resolving (Unbound) OFF
- SmartDNS Resolver OFF
- Use DNSMasq for DNS ON
- DNSSEC ON
- No DNS Rebind ON
- Query DNS in Strict Order ON
buffpatel wrote:
Out of curiosity, how are you keeping track of your reboots?
This is what I do, to detect reboots:
- I search my log for "kernel: Kernel panic - not syncing: Fatal exception "
- I switch off the LEDs - when the LEDs are back on - a reboot has happened
buffpatel wrote:
My router has a USB drive mounted and shared with Samba on my network - I wonder if I can have the router just copy the log files onto my usb drive? Might look into that now.
Thank you kernel-panic69.
I was actually installing r43055 on the R8000 while you posted the reply.
I was waiting for another reboot and it happened tonight. I power-cycled the R8000 yesterday at 12:05, and it crashed at 23:43.
So it had about 11:40 h uptime.
Nevertheless, this crash seems to be unrelated to the previous one, as the log shows no crash info this time.
And I changed a setting: I've activated unbound reverse DNS:
Code:
<4>1 2020-05-07T21:30:02+02:00 router-2 kernel - kernel: dhd_flow_rings_delete_for_peer: ifindex 0
<1>1 2020-01-01T00:00:17+02:00 router-2 kernel - kernel: fast-classifier (PBR safe v2.1.6b): starting up
<1>1 2020-01-01T00:00:17+02:00 router-2 kernel - kernel: fast-classifier: registered
<26>1 2020-01-01T00:00:17+02:00 router-2 dnsmasq 887 dnsmasq[887]: failed to create listening socket for port 53:00:00 Socket type not supported
<26>1 2020-01-01T00:00:17+02:00 router-2 dnsmasq 887 dnsmasq[887]: FAILED to start up
<28>1 2020-01-01T01:00:19+02:00 router-2 dnsmasq 1150 dnsmasq[1150]: warning: ignoring resolv-file flag because no-resolv is set
<4>1 2020-01-01T00:00:20+02:00 router-2 kernel - kernel: dhd_flow_rings_delete_for_peer: ifindex 0
<4>1 2020-01-01T00:00:20+02:00 router-2 kernel - kernel: dhd_flow_rings_delete_for_peer: ifindex 0
<4>1 2020-01-01T00:00:20+02:00 router-2 kernel - kernel: dhd_flow_rings_delete_for_peer: ifindex 0
<4>1 2020-05-07T21:42:43+02:00 router-2 kernel - kernel: dhd_flow_rings_delete_for_peer: ifindex 0
<4>1 2020-05-07T21:42:43+02:00 router-2 kernel - kernel: dhd_flow_rings_delete_for_peer: ifindex 0
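In the excerpt above, the timestamps fall back from 2020-05-07 to 2020-01-01 and later return to 2020-05-07. One reading, offered as an assumption, is that the router's clock starts from a default date at boot until the time is synced, so a large backward jump marks a restart even when no crash info is logged. A minimal sketch of finding such jumps, using three lines copied from the log above; the function name and the one-day threshold are hypothetical choices:

```python
from datetime import datetime, timedelta


def find_clock_resets(lines, min_jump=timedelta(days=1)):
    """Return indexes of lines stamped at least min_jump earlier than the previous line."""
    resets = []
    previous = None
    for index, line in enumerate(lines):
        fields = line.split(maxsplit=2)
        if len(fields) < 2:
            continue  # not in "<pri>version timestamp ..." form
        try:
            stamp = datetime.fromisoformat(fields[1])
        except ValueError:
            continue  # second field is not an ISO 8601 timestamp
        if previous is not None and previous - stamp >= min_jump:
            resets.append(index)
        previous = stamp
    return resets


excerpt = [
    "<4>1 2020-05-07T21:30:02+02:00 router-2 kernel - kernel: dhd_flow_rings_delete_for_peer: ifindex 0",
    "<1>1 2020-01-01T00:00:17+02:00 router-2 kernel - kernel: fast-classifier: registered",
    "<4>1 2020-05-07T21:42:43+02:00 router-2 kernel - kernel: dhd_flow_rings_delete_for_peer: ifindex 0",
]
print(find_clock_resets(excerpt))
```

The one-day threshold is there so that small backward steps, such as the one-hour difference between two of the 2020-01-01 lines above, are not counted as restarts.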
Hi, how's your R8000 doing? Mine is good. I'm currently running the latest build on my router: r43078 std (05/07/20) and it looks like klogd is now actually sending my kernel logs to the papertrail website so hopefully I'll be able to catch errors before my router reboots next time.
One interesting thing I'm seeing is my kernel log is getting flooded with this skip entry message:
Code:
May 08 12:13:27 R8000 kernel CONSOLE: 014404.138 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:27 R8000 kernel CONSOLE: 014404.140 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:27 R8000 kernel CONSOLE: 014404.155 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:27 R8000 kernel CONSOLE: 014404.349 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:27 R8000 kernel CONSOLE: 014404.407 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:27 R8000 kernel CONSOLE: 014404.435 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:27 R8000 kernel CONSOLE: 014404.439 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:27 R8000 kernel CONSOLE: 014404.553 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:27 R8000 kernel CONSOLE: 014404.756 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:27 R8000 kernel CONSOLE: 014404.882 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:27 R8000 kernel CONSOLE: 014404.968 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:28 R8000 kernel CONSOLE: 014405.164 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:28 R8000 kernel CONSOLE: 014405.267 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:28 R8000 kernel CONSOLE: 014405.277 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:28 R8000 kernel CONSOLE: 014405.300 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:28 R8000 kernel CONSOLE: 014405.318 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:28 R8000 kernel CONSOLE: 014405.319 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:28 R8000 kernel CONSOLE: 014405.338 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:28 R8000 kernel CONSOLE: 014405.552 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:28 R8000 kernel CONSOLE: 014405.653 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:28 R8000 kernel CONSOLE: 014405.662 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:28 R8000 kernel CONSOLE: 014405.741 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:28 R8000 kernel CONSOLE: 014405.849 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:28 R8000 kernel CONSOLE: 014405.850 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:28 R8000 kernel CONSOLE: 014406.048 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:29 R8000 kernel CONSOLE: 014406.209 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:29 R8000 kernel CONSOLE: 014406.210 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:29 R8000 kernel CONSOLE: 014406.272 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:29 R8000 kernel CONSOLE: 014406.605 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:29 R8000 kernel CONSOLE: 014406.747 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:29 R8000 kernel CONSOLE: 014406.820 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:29 R8000 kernel CONSOLE: 014406.846 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:29 R8000 kernel CONSOLE: 014406.883 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
May 08 12:13:29 R8000 kernel CONSOLE: 014406.886 wl2: wlc_bmac_processpmq: skip entry with mc/bc address 05:b4:29:fe:f9:37
Do you see any strange messages like that in your logs?
What's interesting is that I don't have any devices on my network with that particular MAC address. I do have a tablet with the MAC address 04:B4:29:FE:F9:37 - so the first hex value is off by 1. I googled this error and some people have seen this message, but it doesn't look like anyone found a fix for it. I'm wondering if this error might be related to the reboots?
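A note on the off-by-one: in a MAC address, the lowest bit of the first octet marks a group (multicast/broadcast) address. 0x04 has that bit clear and 0x05 has it set, so the logged address is the tablet's address with only that bit flipped, which would fit the driver calling it an "mc/bc address". Whether this has anything to do with the reboots is not established. A small sketch of the check, with hypothetical helper names:

```python
def is_group_address(mac):
    """True if the group (multicast/broadcast) bit is set in the first octet."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x01)


def differs_only_in_group_bit(mac_a, mac_b):
    """True if the two MACs are identical except for the group bit."""
    a = bytes.fromhex(mac_a.replace(":", ""))
    b = bytes.fromhex(mac_b.replace(":", ""))
    return len(a) == len(b) == 6 and (a[0] ^ b[0]) == 0x01 and a[1:] == b[1:]


logged = "05:b4:29:fe:f9:37"   # address from the kernel log above
tablet = "04:B4:29:FE:F9:37"   # tablet address from the post above
print(is_group_address(logged), is_group_address(tablet))
print(differs_only_in_group_bit(logged, tablet))
```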
As for my router's uptime, I've found that the build from 4/29/20 was the most stable of these newer builds in the past month - I had my router once go 6 days without a reboot on that build. This current build so far has lasted 4.5 hours, but we'll see.
Overall, I think the most stable build for the R8000 is still Kong's last build from 7/10/19 - r40270. With that build I routinely saw uptimes of 20-30 days. I'll keep testing these new builds for the next few weeks, but unless I start seeing 7-12 days uptime, I think I'll flash my router back to Kong's build. If you're interested you can find Kong's last build here:
Posted: Sat May 09, 2020 6:41 Post subject: Kernel panic - r43078
Hi there,
Tonight the R8000 crashed again with r43078 installed - but this time it didn't recover (orange LED). I had to power-cycle the router to get it going again.
See the attachment for this particular event.
messages_2020_05_09-R8000_kernel_panic.txt
Description:
First line: the message prior to the event
Last line: the first timestamped message after the power reset (as it didn't recover)
Joined: 06 Jun 2006 Posts: 7463 Location: Dresden, Germany
Posted: Sat May 09, 2020 10:09 Post subject: Re: R8000 - Kernel panic: Oops - BUG: 80000007 [#1] SMP ARM
basofree wrote:
Dear DD-WRT Community,
I've been using DD-WRT for 12 years now, and most of the time it works well (OK, there are bugs, but where aren't there? It's beta). Thanks to the project maintainers and the community!
Since I shifted from KONG builds to BS builds (I'm not trying to offend anybody, just trying to pinpoint the reasons), the Netgear R8000 has not been running as stably as before.
Previous builds:
- a lot...
- dd-wrt.v24-K3_AC_ARM_R8000_2019-07-11_40270M
-> shift to BS builds
- netgear-r8000-webflash_2020-02-14_r42336 (reboots)
- netgear-r8000-webflash_2020-04-06_r42847 (reboots)
-> complete reset - back to stock - back to dd-wrt
- netgear-r8000-webflash_2020-04-20-r42954 (reboots)
I've also experienced the channel selection problems as mentioned here and I applied the suggested workaround:
It's just a wild guess, but I have the feeling that the reboots happen when I leave or come back with my smartphone.
I'm currently using the R8000 in the following ways:
- WAN-Router (DHCP)
- WLAN AP
- scripted OpenVPN-client setup at startup (using ebtable_filter for filtering DHCP requests from/to other sites)
- Wireguard-server
Do you have any ideas or suggestions?
this can be related to your bridged tap interface. but one hint. since you are using a router with the dhd driver, try the experimental build. it uses the mac80211 based brcmfmac and is much better for routers which are based on these soc wifi chipsets.
so if the router has only 43602 or 4365 etc. based wifi chipsets, use the experimental build. (or try it at least) _________________ "So you tried to use the computer and it started smoking? Sounds like a Mac to me.." - Louis Rossmann https://www.youtube.com/watch?v=eL_5YDRWqGE&t=60s
Posted: Sat May 09, 2020 13:00 Post subject: Re: R8000 - Kernel panic: Oops - BUG: 80000007 [#1] SMP ARM
BrainSlayer wrote:
[...]
this can be related to your bridged tap interface. but one hint. since you are using a router with the dhd driver, try the experimental build. it uses the mac80211 based brcmfmac and is much better for routers which are based on these soc wifi chipsets.
so if the router has only 43602 or 4365 etc. based wifi chipsets, use the experimental build. (or try it at least)
Thanks BrainSlayer for the hint. Yep, I'm using bridged tap for OpenVPN connections - I know that it has caveats - but it is working 'fine' on the other routers of mine.
I didn't realize I was using dhd instead of brcmfmac. I've now seen other R8000 owners utilizing it [EDIT: That was a pulled build - maybe that's why] for BCM43602
Where do I get the experimental versions utilizing brcmfmac (still searching - EDIT: Found experimental builds in subfolders of other devices)? I know how to recover the R8000 using TFTP, so I'm confident giving it a try.
Posted: Sat May 09, 2020 15:18 Post subject:
kernel-panic69 wrote:
From his email reply to me about this, since I asked him where the files were for the R8000:
Quote:
its a subfolder in each of the supported devices
i did not include yet the r8000. i'm not sure if all 3 wifi interfaces are using the dhd driver
Not sure if that means that it will be in the next release or not, but there is your answer.
the next release will contain a version for r8000 too. of course untested. but since these devices are easy to debrick, it's safe to give it a try. i also changed some compile settings which may have caused the dhd driver crashes, unsure. since this driver has been unchanged for a long time, i expect it can only be a compiler problem. on the other hand, broadcom has poor driver code quality and doesn't provide code updates. it's hard to find something new for it if there is any trouble. so if brcmfmac works, it would be best, since it is easier for me to maintain and it has much better and more features
Quick update, I'm running the 43099 build from 5/9/20 with the experimental driver on both R8000's and so far so good! Both have been running for 1.5 days without any issues or reboots.
I have noticed that my active IP connections are in the 250-400 range on both routers and that the load average is staying consistently in the 0.25-0.40 range, which is higher than in previous builds, but I guess I'll take it if it means fewer reboots.
In terms of router logs, I've seen some strange messages repeated over and over, many of which I don't understand. Here are a few:
May 12 09:18:06 R8000 kernel brcmfmac: brcmf_cfg80211_dump_survey: cca_get_stats failed (-52)
May 12 09:24:54 R8000 hostapd ath0: STA 04:b4:29:fe:f9:37 WPA: received EAPOL-Key with invalid MIC
May 12 09:24:55 R8000 hostapd ath0: STA 04:b4:29:fe:f9:37 WPA: group key handshake failed (RSN) after 4 tries
May 12 09:24:55 R8000 hostapd ath0: STA 04:1e:64:f0:65:bb IEEE 802.11: disassociated
Despite my logs being flooded with these messages, overall this build seems to be working.
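When logs are flooded like this, a rough tally per program can show where the noise comes from. A minimal sketch, assuming the "Mon DD HH:MM:SS host program message" layout of the four lines quoted above; the function name is hypothetical:

```python
from collections import Counter


def count_by_program(lines):
    """Tally log lines by the program field (the fifth whitespace-separated token)."""
    counts = Counter()
    for line in lines:
        fields = line.split(maxsplit=5)
        if len(fields) >= 5:
            counts[fields[4]] += 1
    return counts


# The four lines quoted in the post above.
quoted = [
    "May 12 09:18:06 R8000 kernel brcmfmac: brcmf_cfg80211_dump_survey: cca_get_stats failed (-52)",
    "May 12 09:24:54 R8000 hostapd ath0: STA 04:b4:29:fe:f9:37 WPA: received EAPOL-Key with invalid MIC",
    "May 12 09:24:55 R8000 hostapd ath0: STA 04:b4:29:fe:f9:37 WPA: group key handshake failed (RSN) after 4 tries",
    "May 12 09:24:55 R8000 hostapd ath0: STA 04:1e:64:f0:65:bb IEEE 802.11: disassociated",
]
for program, total in count_by_program(quoted).most_common():
    print(program, total)
```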