re: the error `Unhandled kernel unaligned access`: recompiling the kernel may help, so try the latest build. In case it has the same issue, I have already ponged BS. A FW upgrade will show.
I have already asked one of our community members with exactly the same router to chime in, try to duplicate your issue, and provide his feedback. He also fiddles with SFE/CTF.
re: the error `Unhandled kernel unaligned access`: recompiling the kernel may help, so I will also pong BS.
I have already asked one of our community members with exactly the same router to chime in, try to duplicate your issue, and provide his feedback.
That's it for now.
If it's PaulGo, I already reached out to him. He updated his router to an r7000p, so he won't be able to do it. If it's someone else, then that's great.
I went back as far as the dd-wrt.v24-46788_NEWD-2_K3.x_giga-R6300 build from May 28 2021, just to see if this is a new issue. It still happens there, so it's not. Unless BS specifically tried to fix it, I doubt it's fixed in a build that is 2 days newer.
This should really be confirmed as an issue, or not, on other routers. If it's a malloc issue I expect it to be model independent (unless it's not in common code).
This should be verified first. Unfortunately I do not have another router capable of running ddwrt right now, but I should have access to one next week. I'll test it there. It would be nice if someone else could test too.
CTF, wait 1 min
SFE, wait 1 min
repeat 8 more times... see if router rebooted
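For anyone who wants to script the toggle steps above, something like this could work. It's only a sketch: `apply_mode` is a hypothetical callback standing in for whatever actually changes the setting (in this thread that was done by hand in the GUI), and nothing here talks to a real router.

```python
import time
from typing import Callable, List


def toggle_cycle(apply_mode: Callable[[str], None],
                 cycles: int = 9,
                 wait_s: float = 60.0,
                 sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Alternate CTF and SFE, waiting after each switch.

    Returns the list of modes applied, in order. The default of 9 cycles
    matches "once, then repeat 8 more times".
    """
    applied: List[str] = []
    for _ in range(cycles):
        for mode in ("CTF", "SFE"):
            apply_mode(mode)   # hypothetical: change the setting and apply it
            applied.append(mode)
            sleep(wait_s)      # give the interfaces time to restart
    return applied
```

Passing in `sleep` makes it easy to dry-run the loop without actually waiting 18 minutes.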
I am currently willing to dedicate time and effort to trying new builds and potential fixes, so it would be nice if BS could take a look at the issue.
PS - I'll update to whatever the latest build is once I reproduce it on the 6/14 build. It's already a day into the test and I don't want to restart from scratch by updating.
Joined: 18 Mar 2014 Posts: 12877 Location: Netherlands
Posted: Fri Jun 17, 2022 9:28 Post subject:
A kernel panic of course always warrants further investigation, so the head honcho has been notified.
Are these actually two problems? In your first post you mention a spontaneous reboot after some time.
I noticed you had the 5 GHz radio off. In the past we had a problem where spontaneous reboots occurred if the 5 GHz radio was off.
This was solved, but perhaps we have a regression?
About CTF: the first time you enable it, the router always reboots.
I have an R6400v1 with serial attached. By default I have no Flow Acceleration enabled, so I set up:
Shortcut forwarding Engine: CTF
Flow Acceleration: CTF+FA
after one minute switched to
Shortcut forwarding Engine: SFE
Flow Acceleration: disable
The router then reboots. No big deal, as it happens only once; after that the router is stable.
Yes, it reboots with an OOM, but as said, no big deal as it happens only once and should not be related to a spontaneous reboot.
Joined: 31 Jul 2021 Posts: 2146 Location: All over YOUR webs
Posted: Fri Jun 17, 2022 12:54 Post subject:
Duxa wrote:
If it's PaulGo, I already reached out to him. He updated his router to an r7000p, so he won't be able to do it. If it's someone else, then that's great.
Yes it was, and he's not able to because that's his choice; you can't force people to volunteer their time. I sort of disagree with PaulGo re: the HW failure part, because of the kernel message printed in your logs and my own investigation into it.
Will attach the serial log
Thanks for trying it out. To answer your questions:
Initially the issue happened as a spontaneous reboot after the router had been up for 4 days. I just randomly lost internet; a few minutes later the router came up and internet returned. When this happened the router had not been logged into for 3+ days and was just doing its thing (so nothing was done to trigger it, other than clients using the internet through it). This was on the 06-07-2022-r49113 build.
I then updated to the 06-14-2022-r49197 build, again leaving the unit to do its thing. About 18 hours later I lost internet again. This time it did not come back up on its own. I gave it about an hour and ended up power cycling, which brought it back up. I had no serial attached, so I don't know what was happening on it, but since hooking up serial I have noticed that sometimes, if I do a factory clean after a reboot, it won't come up until I power cycle. It seems to be stuck on a wl driver error.
Quote:
No such device
get_wl_instance doesnt return the right value 0
No such device
get_wl_instance doesnt return the right value 0
No such device
get_wl_instance doesnt return the right value 0
No such device
get_wl_instance doesnt return the right value 0
No such device
Radio: 0 currently turned off
[httpd] : daemon successfully stopped
[httpd] : httpd server shutdown[httpd] : httpd server started at port 80
[httpd] : http daemon successfully started
[resetbutton] : daemon successfully stopped
[resetbutton] : resetbutton daemon successfully started
[USB] checking...
/opt/etc/init.d/rcS: No such file or directory
/jffs/etc/init.d/rcS: No such file or directory
/mmc/etc/init.d/rcS: No such file or directory
killall: proxywatchdog.sh: no process killed
/mnt/smbshare
umount: can't unmount /mnt/smbshare: No such file or directory
rmmod: cifs: No such file or directory
rmmod: fscache: No such file or directory
killall: schedulerb.sh: no process killed
killall: wdswatchdog.sh: no process killed
get_wl_instance doesnt return the right value 0
No such device
get_wl_instance doesnt return the right value 0
No such device
get_wl_instance doesnt return the right value 0
No such device
get_wl_instance doesnt return the right value 0
No such device
get_wl_instance doesnt return the right value 0
No such device
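If serial output is being captured to a file anyway, a quick check for this stuck state could look like the sketch below. The marker string is copied from the log above; the threshold of 5 hits is an arbitrary assumption, not something the firmware defines.

```python
# Marker copied verbatim from the serial output quoted above.
MARKER = "get_wl_instance doesnt return the right value"


def count_marker(lines, marker=MARKER):
    """Count how many captured log lines contain the marker."""
    return sum(1 for line in lines if marker in line)


def looks_stuck(lines, threshold=5, marker=MARKER):
    """Guess that boot is stuck if the marker shows up at least `threshold` times.

    `threshold` is an assumed value for illustration; tune it to your logs.
    """
    return count_marker(lines, marker) >= threshold
```

Feeding it the last few dozen lines of the capture (rather than the whole file) keeps an old, recovered boot from triggering it.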
Anyway, after these 2 different versions spontaneously rebooted after being up for some time, I decided to see if I could do something to induce the reboot. This is where the strategy of swapping back and forth between CTF and SFE came from. There is a chance it's unrelated; I won't know until my current setup spontaneously reboots (as I am now capturing serial output from it 24/7). But either way this should be resolved.
The behavior is definitely different for you than it is for me. On the r6300v1, changing SFE to CTF or CTF to SFE does not trigger a reboot. It triggers interface restarts (you can see this in my previously attached log), not actual router reboots. If a reboot is supposed to happen, then perhaps that's a clue as to which part of the code is broken for this router. Or it could be that CTF+FA triggers it (since hw is involved for FA); the r6300v1 doesn't have FA, only CTF and SFE. But the issue happens when switching to either one, so it's not specifically a CTF issue. I have a feeling the issue is unrelated to CTF/SFE, and that the memory access they cause just makes it easy to trigger the condition.
There is a good chance that something is writing outside of its malloc. Under normal conditions it may take days for that memory location to be accessed or written to, while SFE/CTF switching triggers so much activity (interface restarts etc.) that it simply hits the issue faster. It's a good thing we have such a fast and easy way to reproduce it, since fixes can be tested quickly.
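To show why an out-of-bounds write could end up as an unaligned access rather than an allocation error, here is a toy model. Everything in it is made up for illustration (the 12-byte "heap", the 0x80001000 pointer value, the 4-byte alignment requirement); it is not taken from the kernel or from the DD-WRT sources.

```python
import struct


def is_aligned(addr: int, width: int = 4) -> bool:
    """True if addr is a multiple of the access width (4 bytes assumed here)."""
    return addr % width == 0


def simulate_overrun(payload: bytes) -> int:
    """Copy payload into an 8-byte buffer without a bounds check.

    The toy heap is 12 bytes: an 8-byte buffer followed by a 4-byte
    little-endian pointer owned by someone else. Returns that pointer
    as it looks after the copy. Payloads longer than 12 bytes are
    outside what this toy models.
    """
    memory = bytearray(12)
    struct.pack_into("<I", memory, 8, 0x80001000)  # neighbour: an aligned pointer
    memory[0:len(payload)] = payload               # the bug: may run past byte 8
    return struct.unpack_from("<I", memory, 8)[0]
```

With an 8-byte payload the neighbouring pointer survives. With 9 bytes its low byte gets clobbered and the value is no longer a multiple of 4, so the fault only shows up later, when whoever owns that pointer finally dereferences it. That delay would fit the "days under normal use, minutes when toggling" pattern.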
So test only newer builds, not older ones; whether recompiling the kernel helps is up to BS. He's been ponged. It could be HW failure too; you were adamant it wasn't in your OP anyway. We'll see.
HW failure is unlikely, but definitely possible. The router is 10 years old at this point, and has been on pretty much 24/7 for those entire 10 years. BUT I am skeptical that it is hw failure, since there were no issues like this when running Tomato FW on it up until a week ago, when I switched to ddwrt. But yes, anything is possible.
If it is HW, then based on the unaligned access errors it would have to be RAM that is failing.
I ran ddwrt on this router from 2012 to 2017ish, at which point I switched to Tomato to get faster WAN speed (ddwrt did not have SFE/CTF yet). Now I'm switching back to ddwrt for the same reason: its SFE/CTF implementation is better than Tomato's.
Joined: 31 Jul 2021 Posts: 2146 Location: All over YOUR webs
Posted: Fri Jun 17, 2022 18:51 Post subject:
Well, idk why FT/OpenWRT would have a different implementation of CTF, because CTF and CTF & FA are both closed source and thus only available as binary blobs. If they use the same binary blobs, there is no difference on that side.
I don't know how OpenWRT or FT handle their interfaces, namely bridges.
DD-WRT by default bridges LAN/WLAN/etc. under the br0 (default) bridge, and CTF or CTF & FA are only recommended for ISP speeds above 100Mbps, as otherwise they have little if any influence on IPv4 NAT'ed WAN side traffic.
Now, when using CTF the bridge performs better, and any gains carry over to WiFi only as a side effect of how DD-WRT's bridge aggregates the interfaces mentioned above. Otherwise CTF/CTF & FA are strictly implemented on WAN<->LAN only, so that may be it, no idea.
Mind the caveats...
caveats wrote:
Cons: NAT Acceleration supports only adaptive QoS (Level 1 CTF), no QoS (Level 2 CTF+FA), may not support port forwarding (hosting game servers, etc.), parental controls, PPPoE, STP. The increased retransmissions caused by NAT Acceleration may also cause stuttering on some streaming devices (Apple TV, Chromecast, VoIP).
Joined: 08 May 2018 Posts: 14210 Location: Texas, USA
Posted: Fri Jun 17, 2022 19:27 Post subject:
If you're quoting the hardware wiki, none of that information that tmittelstaedt (tedm) added was verified as factual by BrainSlayer AFAIK. He had about zero clue when CTF was even introduced upstream by Broadcom until I interjected. He also had zero clue about SFE / fast-classifier. Guess I need to revisit that and seek proper edits within.
That's the point: the issue is not SFE or CTF. The issue is something else writing outside of its malloc. Switching between SFE and CTF is just a quick way to run into that memory space and trigger the kernel panic.
Joined: 16 Jun 2022 Posts: 16 Location: Dallas, TX
Posted: Fri Jun 17, 2022 22:55 Post subject:
Might help to detail exactly how you imaged dd-wrt onto your router. Also from my research prior to flashing my router, the NVRAM needs to be reset before and after initial upgrade. Some routers also have an initial flash binary vs. one used when you're already on dd-wrt.
I call out the NVRAM resets because you mentioned only resetting to factory defaults after installing dd-wrt, and it also sounded like it was done via dd-wrt itself. Most routers have their own hard reset method intended to be used in the before/after procedures.
I was in a similar situation to you in that prior to installing dd-wrt, I was also running a 3rd party firmware. But mine supports a firmware recovery mode, which is the initial path to install dd-wrt, and that recovery mode is native/provided by the vendor and not related to any 3rd party firmware.
The crash dumps are via serial console; it would take something rather extreme to grenade the bootloader here. Since the serial adapter is already hooked up, the next-to-last-resort recovery method is already handy.
_________________
"The woods are lovely, dark and deep,
But I have promises to keep,
And miles to go before I sleep,
And miles to go before I sleep." - Robert Frost
"I am one of the noticeable ones - notice me" - Dale Frances McKenzie Bozzio
Joined: 01 Dec 2021 Posts: 289 Location: Maryland, United States
Posted: Sat Jun 18, 2022 2:53 Post subject:
I have changed from SFE to CTF and back again many times. The Joker advised me to be patient (I needed to wait several minutes on my single core R6300v1) when I did not lose GUI or have any crashes. Several days ago, I switched and now use a newly acquired R7000P with Verizon FiOS. So my conclusion is that it is hardware related (perhaps memory) and the newer firmware is accessing something differently (or using more memory) that is causing your router to have problems.
So if you had GUI loss on a CTF <-> SFE switch, then the following scenario is possible:
kernel crash -> wl driver error on reboot (infinite loop).
Resolved via power cycle.
That would align with my findings. An unaligned access is not the same as running out of memory; for that you would see a different error (an allocation error). So I tend to believe this is a bug.
@PaulGo, are you experiencing GUI loss on the r7000p?
Last edited by Duxa on Sat Jun 18, 2022 3:53; edited 1 time in total
Can you reproduce this after a hard reset (nvram erase && reboot) and only manipulating the SFE / CTF configuration and nothing else after logging in?