Can you reproduce this after a hard reset (nvram erase && reboot) and only manipulating the SFE / CTF configuration and nothing else after logging in?
yup
Here is what I think is happening. One of the native processes (see list below) is writing beyond its allocated memory, perhaps an off-by-one overrun of a malloc'd buffer.
When you switch SFE/CTF, it restarts interfaces, restarts processes, etc., and each one has to malloc new memory. And it does, writing over memory that another process is trying to use. When that process tries to access that spot, it gets a misaligned memory error and the kernel crashes.
So switching SFE/CTF is just a quick way to restart a bunch of processes and trigger a bunch of mallocs. One of them collides with a bad memory block.
Now logically, because it seems to be a different process every time (swapper, awk, date, etc.) that causes the misaligned access, I would think it is one of the processes restarted during the SFE/CTF switch that allocates new memory encroaching on space already allocated to another process.
This raises the question: what happened after 4 days that caused the random crash? Interfaces wouldn't restart on their own. Perhaps the answer lies in a process that periodically restarts (or runs) and has to re-malloc. I wonder what the Venn diagram looks like of processes that periodically start up to do a task and processes that restart on a CTF/SFE switch.
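One way to chart that Venn diagram, as a minimal sketch: take a ps snapshot before and after each event, treat any process name whose PIDs changed as restarted, and intersect the two results. The helper names and the sample snapshots below are made up for illustration (they are not output from this router), and the parser assumes a ps layout of PID USER VSZ STAT COMMAND, which may differ on your build.

```python
import os

def parse_ps(snapshot):
    """Map process name -> set of PIDs from one ps snapshot.

    Assumes columns PID USER VSZ STAT COMMAND (an assumption; adjust
    the split if your ps prints a different layout).
    """
    procs = {}
    for line in snapshot.strip().splitlines():
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # skip the header and malformed lines
        name = os.path.basename(fields[4].split()[0])
        procs.setdefault(name, set()).add(int(fields[0]))
    return procs

def restarted(before, after):
    """Names present in both snapshots whose PIDs changed."""
    b, a = parse_ps(before), parse_ps(after)
    return {name for name in b.keys() & a.keys() if b[name] != a[name]}

# Hypothetical snapshots with placeholder process names.
BEFORE = """
  PID USER       VSZ STAT COMMAND
  101 root      1200 S    /usr/sbin/svc_a -c /tmp/a.conf
  102 root      1300 S    svc_b
  103 root      1400 S    svc_c
"""
AFTER_SWITCH = """
  PID USER       VSZ STAT COMMAND
  201 root      1200 S    /usr/sbin/svc_a -c /tmp/a.conf
  202 root      1300 S    svc_b
  103 root      1400 S    svc_c
"""
AFTER_PERIODIC = """
  PID USER       VSZ STAT COMMAND
  101 root      1200 S    /usr/sbin/svc_a -c /tmp/a.conf
  302 root      1300 S    svc_b
  103 root      1400 S    svc_c
"""

on_switch = restarted(BEFORE, AFTER_SWITCH)    # restarted by the SFE/CTF switch
periodic = restarted(BEFORE, AFTER_PERIODIC)   # restarted on their own over time
suspects = on_switch & periodic                # overlap of the two circles
print(sorted(suspects))
```

Short-lived commands such as awk or date will not show up in both snapshots, so this only narrows down long-running services.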
Joined: 01 Dec 2021 Posts: 289 Location: Maryland, United States
Posted: Sat Jun 18, 2022 4:40 Post subject:
Duxa wrote:
@PaulGo, are you experiencing GUI loss on r7000p?
None whatsoever. The processor in the R7000P is much faster and can handle whatever changes I make without loss of the GUI. If I remember correctly, certain changes may cause the router to reboot, but even the reboot is very quick. For now, I went back to the Netgear firmware to test an upcoming firmware update which will make the Netgear firmware compatible with Comcast's vCMTS using IPv6. The R7000P with DD-WRT firmware, using the additions I used with the R6300, is fully compatible with Comcast's IPv6 vCMTS.
Joined: 31 Jul 2021 Posts: 2146 Location: All over YOUR webs
Posted: Sat Jun 18, 2022 9:11 Post subject:
Many processes restart on NTP update as well: the firewall is one, and then every other service that needs time sync, via cron or directly. Look at the NTP client source code, for instance.
None of that top / ps output looks like a default configuration after an nvram erase && reboot and logging in after only changing the password. _________________ "The woods are lovely, dark and deep,
But I have promises to keep,
And miles to go before I sleep,
And miles to go before I sleep." - Robert Frost
"I am one of the noticeable ones - notice me" - Dale Frances McKenzie Bozzio
None of that top / ps output looks like a default configuration after an nvram erase && reboot and logging in after only changing the password.
Of course not. That's my standard config mentioned in the first post. After I did a factory clean/reset and reproduced the issue, confirming it wasn't my config, I went back to my config.
I can't be running the router with no wifi and without the settings I need on my network for 4+ days until it fails again.
There is nothing crazy in this config: zero firewall rules; basically set up wifi, set DNS IP, set up dynamic DNS, enable SSH access, set up VPN. That's it.
Administration -> Factory Defaults is not exactly the same thing as doing an nvram erase && reboot or an mtd -r erase nvram && reboot (the latter being a high-risk maneuver that can result in a bricked device). That function only restores default configurations via resetbutton functionality. The other two wipe the partition completely, as best I understand it.
EDIT: the rest of the discussion was split and moved here.
Well, I got some bad news, and then some more bad news.
I updated to build 06-19-2022-r49268, and the issue occurred after around 36 hours of uptime. That's the first bad news.
Second bad news: sometime during those 36 hours my serial output stopped, and now I can't get anything out of the serial port on any of the builds. I think the port is dead. If anyone has any ideas, please let me know. When I put a multimeter on the pins I still get 3.3 V against ground.
I think I need to test this with the setting on SFE (the default, which most people probably never switch from).
I set it to SFE and will see if the issue happens again. Will update sometime within the next 4 days. If there is no issue for 4 days, that would mean CTF is unstable on the R6300v1 and should not be used.
Will report back.
Would like to hear if BS or anyone else has looked at the code for this issue.
I have used CTF for many days on my R6300v1 without problems.
Happen to know which build?
Also can you give the following info:
- Are both the 2.4 GHz and 5 GHz radios set up and ON?
- Is the router providing both DNS and DHCP?
- Do you use dynamic DNS?
- Is UPnP off?
- Did you manually forward any ports?
- Is SSH enabled?
- Is logging enabled? If so, both syslog and klog? Or just one of them?
That info will give me more things to experiment with.
Joined: 01 Dec 2021 Posts: 289 Location: Maryland, United States
Posted: Sun Jun 19, 2022 20:43 Post subject:
Happen to know which build? It was a recent build.
Also can you give the following info:
- Are both 2.4Ghz and 5.0Ghz radios setup and ON? Only 5 GHz
- Router providing both DNS and DHCP? Yes
- Do you use dynamicdns? No
- Is upnp off? On
- Did you manually forward any ports? No
- Is SSH enabled? No
- Is logging enabled? No
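To turn these answers into a list of things to experiment with, the two setups can be compared field by field. This is only a minimal sketch: the dictionary keys are hypothetical labels, the values for the setup where CTF works come from the answers above, and the values for the crashing setup are limited to what was stated earlier in the thread (dynamic DNS and SSH enabled). Anything not stated is left as None rather than guessed.

```python
# Values taken from the thread; None means "not stated", not "off".
WORKING = {  # setup where CTF has run for many days
    "radios": "5 GHz only",
    "dns_and_dhcp": True,
    "dynamic_dns": False,
    "upnp": True,
    "port_forwards": False,
    "ssh": False,
    "logging": False,
}
FAILING = {  # setup that crashes after a few days
    "radios": None,
    "dns_and_dhcp": None,
    "dynamic_dns": True,
    "upnp": None,
    "port_forwards": None,
    "ssh": True,
    "logging": None,
}

def compare(working, failing):
    """Split settings into known differences and unanswered ones."""
    differs, unknown = [], []
    for key in working:
        if failing.get(key) is None:
            unknown.append(key)       # still needs an answer
        elif failing[key] != working[key]:
            differs.append(key)       # candidate to toggle and retest
    return differs, unknown

differs, unknown = compare(WORKING, FAILING)
print("differs:", differs)
print("still unknown:", unknown)
```

Each entry in the first list is a setting to toggle one at a time on the crashing router; the second list is what still needs to be filled in before the comparison is complete.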
CTF is a blob that is provided by Broadcom. I don't believe its implementation has changed.
CTF is a blob that is provided by Broadcom. I don't believe its implementation has changed.
If this is true, then it is the blob for the Linux 2.6.36.4 kernel; for some reason, I *kinda* doubt that it is.
Duxa wrote:
Yeah, but it most likely relies on libraries in the kernel, which, if updated, may introduce incompatibilities.
Like the whole ipv6 (legacy) error is likely due to the ipv6 library not being updated while the kernel was. So it's making deprecated calls.
See above. I'm also thinking that ipv6 is kernel version dependent. But enough speculation, conjecture, and possible misinformation. Proof carries more weight.