r6300v1 Kernel Panic r49197 giga (06/14/22)

DD-WRT Forum Index -> Broadcom SoC based Hardware
Duxa
DD-WRT User


Joined: 16 Aug 2013
Posts: 191

Posted: Sat Jun 18, 2022 3:52
dale_gribble39 wrote:
Can you reproduce this after a hard reset (nvram erase && reboot) and only manipulating the SFE / CTF configuration and nothing else after logging in?


yup

Here is what I think is happening. One of the native processes (see list below) is writing beyond its allocated memory, perhaps an off-by-one bug in a malloc'd buffer.

When you switch SFE/CTF, it restarts interfaces, restarts processes, etc., and each one has to malloc new memory. One of them then writes over memory that another process is trying to use. When that process accesses that spot, it gets a misaligned memory access error and the kernel crashes.

So switching SFE/CTF is just a quick way to restart a bunch of processes and trigger a bunch of mallocs. One of them collides with a bad memory block.

Now, logically, because it seems to be a different process every time (swapper, awk, date, etc.) that triggers the misaligned access, I would think that one of the processes restarted during the SFE/CTF switch is allocating new memory that encroaches on space already allocated to another process.
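
To make the hypothesis above concrete, here is a toy model in Python of an off-by-one write spilling into a neighbouring allocation. Everything in it (the `Arena` class, the bump allocator, the sizes) is made up for illustration. It is not DD-WRT or kernel code, and since user-space processes normally live in separate address spaces, it only mirrors the case where both allocations come from one shared pool.

```python
class Arena:
    """Toy bump allocator over one shared buffer (illustration only)."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.top = 0

    def alloc(self, n):
        """Hand out the next n bytes and return their start offset."""
        if self.top + n > len(self.mem):
            raise MemoryError("arena exhausted")
        start = self.top
        self.top += n
        return start

    def write(self, start, data):
        # Deliberately no check against the allocation size,
        # mimicking a raw pointer write in C.
        self.mem[start:start + len(data)] = data


arena = Arena(16)
first = arena.alloc(4)    # block owned by the buggy writer
second = arena.alloc(4)   # neighbouring block owned by someone else

arena.write(second, b"BBBB")
arena.write(first, b"AAAAA")  # 5 bytes into a 4-byte block: off by one

# The neighbour's first byte has been overwritten.
print(bytes(arena.mem[second:second + 4]))
```

The owner of the second block never did anything wrong, which would fit the observation that a different process shows up in the crash each time.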

This raises the question: what happened after 4 days that caused the random crash? Interfaces wouldn't restart on their own. Perhaps the answer lies in a process that periodically restarts (or runs) and has to re-malloc. I wonder what the Venn diagram looks like of processes that periodically start up to do a task and processes that restart on a CTF/SFE switch.

Code:
  PID USER       VSZ STAT COMMAND
    1 root      1204 S    /sbin/init
    2 root         0 SW   [kthreadd]
    3 root         0 SW   [ksoftirqd/0]
    5 root         0 SW<  [kworker/0:0H]
    6 root         0 SW   [kworker/u2:0]
    7 root         0 SW<  [netns]
    8 root         0 SW   [kworker/u2:1]
   95 root         0 SW<  [writeback]
   97 root         0 SW<  [crypto]
   98 root         0 SW<  [bioset]
  100 root         0 SW<  [kblockd]
  122 root         0 SW<  [kworker/u3:0]
  150 root         0 SW   [kswapd0]
  151 root         0 SW   [kworker/0:1]
  167 root         0 SW   [fsnotify_mark]
  421 root         0 SW<  [bioset]
  737 root         0 SW<  [bioset]
  738 root         0 SW<  [bioset]
  739 root         0 SW<  [bioset]
  740 root         0 SW<  [bioset]
  741 root         0 SW<  [bioset]
  742 root         0 SW<  [bioset]
  743 root         0 SW<  [bioset]
  744 root         0 SW<  [bioset]
  745 root         0 SW<  [bioset]
  746 root         0 SW<  [bioset]
  747 root         0 SW<  [bioset]
  748 root         0 SW<  [bioset]
  749 root         0 SW<  [bioset]
  750 root         0 SW<  [bioset]
  751 root         0 SW<  [bioset]
  789 root         0 SW<  [bioset]
  794 root         0 SW<  [bioset]
  799 root         0 SW<  [bioset]
  804 root         0 SW<  [bioset]
  809 root         0 SW<  [bioset]
  814 root         0 SW<  [bioset]
  819 root         0 SW<  [bioset]
  836 root         0 SW<  [deferwq]
  838 root         0 SW<  [kworker/0:1H]
 1127 root       968 S    /sbin/hotplug2 --set-rules-file /etc/hotplug2.rules --persistent
 1175 root      1680 S    watchdog
 1402 root      1448 S    telnetd
 1445 root      1448 S    syslogd -Z -L
 1447 root      1448 S    klogd
 1522 root      3256 S    /tmp/openvpnserver --config /tmp/openvpn/openvpn.conf --daemon
 1714 root      1896 S    ttraff
 1921 root      1628 S    wland
 1959 root      1724 S    dnsmasq -u root -g root -C /tmp/dnsmasq.conf
 2203 root      1456 S    inadyn -u redacted -p redacted --input_file /tmp/ddns/inadyn.conf
 2204 root      1448 S    udhcpc -i vlan2 -p /var/run/udhcpc.pid -s /tmp/udhcpc -O routes -O msstaticroutes -O staticroutes
 2702 root      1680 S    process_monitor
 2707 root      1688 S    nas -P /tmp/nas.wl0lan.pid -H 34954 -l br0 -i eth1 -A -m 128 -k redacted -s redacted -w 4 -g 3600
 2718 root      4456 S    httpd -n -p 80
 2720 root      1436 S    resetbutton
 2741 root         0 SW   [kworker/0:2]
 2774 root         0 SW   [scsi_eh_0]
 2775 root         0 SW<  [scsi_tmf_0]
 2776 root         0 SW   [usb-storage]
 2802 root         0 SW<  [bioset]
 2961 root       940 S    cron
 3030 root      1464 S    -sh
 4881 root      1448 R    ps
Duxa
DD-WRT User


Joined: 16 Aug 2013
Posts: 191

Posted: Sat Jun 18, 2022 4:11
And here is the list of processes that restart on an SFE/CTF switch:

Code:
 5009 root         0 SW   [kworker/0:0]
 5010 root         0 SW   [kworker/0:3]
 5023 root         0 SW   [kworker/u2:2]
 5604 root      1680 S    ttraff
 5764 root      1688 S    nas -P /tmp/nas.wl0lan.pid -H 34954 -l br0 -i eth1 -A -m 128 -k redacted -s redacted -w 4 -g 3600
 6090 root      1628 S    wland
 6137 root      1724 S    dnsmasq -u root -g root -C /tmp/dnsmasq.conf
 6402 root       940 S    cron
 6473 root      4296 S    httpd -n -p 80
 6674 root      1680 S    process_monitor
 6678 root      1456 S    inadyn -u redacted -p redacted  --input_file /tmp/ddns/inadyn.conf
 6679 root      1448 S    udhcpc -i vlan2 -p /var/run/udhcpc.pid -s /tmp/udhcpc -O routes -O msstaticroutes -O staticroutes
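
The comparison between the two `ps` listings can also be done mechanically. The following is a minimal sketch, assuming the listings are saved as text on a workstation; `parse_ps` and `restarted` are hypothetical helper names, and the sample rows are a small excerpt of the output above. It treats a command as restarted when it appears in both snapshots but none of its old PIDs survive.

```python
def parse_ps(text):
    """Map each whitespace-normalized COMMAND string to the set of PIDs running it."""
    procs = {}
    for line in text.splitlines():
        parts = line.split(None, 4)  # PID USER VSZ STAT COMMAND...
        if len(parts) < 5 or not parts[0].isdigit():
            continue  # header, blank, or truncated line
        pid = int(parts[0])
        command = " ".join(parts[4].split())
        procs.setdefault(command, set()).add(pid)
    return procs


def restarted(before, after):
    """Commands present in both snapshots whose old PIDs are all gone."""
    common = before.keys() & after.keys()
    return sorted(c for c in common if before[c].isdisjoint(after[c]))


before = parse_ps("""
  PID USER       VSZ STAT COMMAND
 1175 root      1680 S    watchdog
 1714 root      1896 S    ttraff
 2961 root       940 S    cron
""")
after = parse_ps("""
  PID USER       VSZ STAT COMMAND
 1175 root      1680 S    watchdog
 5604 root      1680 S    ttraff
 6402 root       940 S    cron
""")

print(restarted(before, after))  # → ['cron', 'ttraff']
```

Whitespace in the command is normalized because the same command can be printed with different spacing (the two inadyn lines above differ by one space). Running the same comparison on a snapshot taken before and after a spontaneous crash window would give the other half of the Venn diagram.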
PaulGo
DD-WRT User


Joined: 01 Dec 2021
Posts: 289
Location: Maryland, United States

Posted: Sat Jun 18, 2022 4:40
Duxa wrote:
@PaulGo, are you experiencing GUI loss on r7000p?

None whatsoever. The processor in the R7000P is much faster and can handle whatever changes I make without loss of the GUI. If I remember correctly, certain changes may cause the router to reboot, but even the reboot is very quick. For now, I went back to the Netgear firmware to test an upcoming firmware update that will make the Netgear firmware compatible with Comcast's vCMTS using IPv6. The R7000P with DD-WRT firmware, using the additions I used with the R6300, is fully compatible with Comcast's IPv6 vCMTS.
the-joker
DD-WRT Developer/Maintainer


Joined: 31 Jul 2021
Posts: 2146
Location: All over YOUR webs

Posted: Sat Jun 18, 2022 9:11
Many processes restart on NTP update as well; the firewall is one, and then every other service that needs time sync, via cron or directly. Look at the NTP client source code, for instance.

Also, when you change settings and click Apply, it triggers a restart of services, firewall/networking among them, and GUI access may be down for a few seconds because the network link isn't ready immediately on the router side. The clients on the other end may also take a moment to notice that DHCP has gone down and come back before they start transmitting again.

_________________
Saving your retinas from the burn!🔥
DD-WRT Inspired themes for routers
DD-WRT Inspired themes for the phpBB Forum
DD-WRT Inspired themes for the SVN Trac & FTP site
Join in for a chat @ #style_it_themes_public:matrix.org or #style_it_themes:discord

DD-WRT UI Themes Bug Reporting and Discussion thread

Router: ANus RT-AC68U E1 (recognized as C1)


Last edited by the-joker on Sat Jun 18, 2022 17:21; edited 1 time in total
dale_gribble39
DD-WRT Guru


Joined: 11 Jun 2022
Posts: 1899

Posted: Sat Jun 18, 2022 15:33
None of that top / ps output looks like a default configuration after an nvram erase && reboot and logging in after only changing the password.
_________________
"The woods are lovely, dark and deep,
But I have promises to keep,
And miles to go before I sleep,
And miles to go before I sleep." - Robert Frost

"I am one of the noticeable ones - notice me" - Dale Frances McKenzie Bozzio

<fact>code knows no gender</fact>

This is me, knowing I've ruffled your feathers, and not giving a ****
Some people are still hard-headed.

--------------------------------------
Mac Pro (Mid 2012) - Two 2.4GHz 6-Core Intel Xeon E5645 processors 64GB 1333MHz DDR3 ECC SDRAM OpenSUSE Leap 15.5
Duxa
DD-WRT User


Joined: 16 Aug 2013
Posts: 191

Posted: Sat Jun 18, 2022 18:28
dale_gribble39 wrote:
None of that top / ps output looks like a default configuration after an nvram erase && reboot and logging in after only changing the password.


Of course not. That's my standard config mentioned in the first post. After I did a factory clean/reset and reproduced the issue, confirming it wasn't my config, I went back to my config.

I can't run the router without Wi-Fi and the settings I need on my network for 4+ days until it fails again.

There is nothing crazy in this config. Zero firewall rules; basically set up Wi-Fi, set the DNS IP, set up dynamic DNS, enable SSH access, set up VPN. That's it.
the-joker
DD-WRT Developer/Maintainer


Joined: 31 Jul 2021
Posts: 2146
Location: All over YOUR webs

Posted: Sat Jun 18, 2022 19:00
Re: DNS, I've noticed that it works better with at least two servers, one as a fallback.

Recommended are 1.0.0.1 as primary and 9.9.9.9 as secondary.

dale_gribble39
DD-WRT Guru


Joined: 11 Jun 2022
Posts: 1899

Posted: Sat Jun 18, 2022 19:01
Administration -> Factory Defaults is not exactly the same thing as doing an nvram erase && reboot or an mtd -r erase nvram && reboot (the latter being a high-risk maneuver that can result in a bricked device). That function only restores default configurations via the resetbutton functionality. The other two wipe the partition completely, as best I understand it.

EDIT: the rest of the discussion was split and moved here.

Duxa
DD-WRT User


Joined: 16 Aug 2013
Posts: 191

Posted: Sun Jun 19, 2022 18:49
Well, I got some bad news, and then some more bad news.

I updated to build 06-19-2022-r49268, and the issue occurred after around 36 hours of uptime. That's the first bad news.

The second bad news: sometime during those 36 hours my serial output stopped, and now I can't get anything out of the serial port on any of the builds. I think the port is dead. If anyone has any ideas, please let me know. When I put a multimeter on the pins, I still get 3.3 V relative to ground.

I think I need to test this with the setting on SFE (the default, which most people probably never switch from).

I set it to SFE and will see if the issue happens again. I will update sometime within the next 4 days. If there is no issue for 4 days, that would mean CTF is unstable on the r6300v1 and should not be used.

Will report back.

I would like to hear whether BS or anyone else has looked at the code for this issue.
PaulGo
DD-WRT User


Joined: 01 Dec 2021
Posts: 289
Location: Maryland, United States

Posted: Sun Jun 19, 2022 19:31
I have used CTF for many days on my R6300v1 without problems.
Duxa
DD-WRT User


Joined: 16 Aug 2013
Posts: 191

Posted: Sun Jun 19, 2022 19:37
PaulGo wrote:
I have used CTF for many days on my R6300v1 without problems.


Happen to know which build?

Also can you give the following info:

- Are both the 2.4 GHz and 5 GHz radios set up and ON?
- Is the router providing both DNS and DHCP?
- Do you use dynamic DNS?
- Is UPnP off?
- Did you manually forward any ports?
- Is SSH enabled?
- Is logging enabled? If so, both syslog and klog? Or just one of them?


That info will give me more things to experiment with.
kernel-panic69
DD-WRT Guru


Joined: 08 May 2018
Posts: 14125
Location: Texas, USA

Posted: Sun Jun 19, 2022 20:28
Try removing the ground lead, then powering up, then reattaching the ground lead. Then you might want to try rebooting to see if the serial comes back up.
_________________
"Life is but a fleeting moment, a vapor that vanishes quickly; All is vanity"
Contribute To DD-WRT
Pogo - A minimal level of ability is expected and needed...
DD-WRT Releases 2023 (PolitePol)
DD-WRT Releases 2023 (RSS Everything)

----------------------
Linux User #377467 counter.li.org / linuxcounter.net
PaulGo
DD-WRT User


Joined: 01 Dec 2021
Posts: 289
Location: Maryland, United States

Posted: Sun Jun 19, 2022 20:43
Happen to know which build? It was a recent build.

Also can you give the following info:

- Are both the 2.4 GHz and 5 GHz radios set up and ON? Only 5 GHz
- Is the router providing both DNS and DHCP? Yes
- Do you use dynamic DNS? No
- Is UPnP off? On
- Did you manually forward any ports? No
- Is SSH enabled? No
- Is logging enabled? No
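
To turn these answers into a list of things to experiment with, the two setups can be compared side by side. This is a rough sketch: the key names are hypothetical labels, PaulGo's values are the answers above, and Duxa's values are filled in only where the thread states them (with logging inferred from syslogd and klogd appearing in the earlier ps output). Anything not stated is left as None.

```python
# PaulGo's answers as given in this thread.
paulgo = {
    "radios": "5 GHz only",
    "router_dns_dhcp": True,
    "dynamic_dns": False,
    "upnp": True,
    "port_forwards": False,
    "ssh": False,
    "logging": False,
}

# Duxa's side: only what the thread states; None means "not stated".
duxa = {
    "radios": None,
    "router_dns_dhcp": None,
    "dynamic_dns": True,    # "set up dynamic dns"
    "upnp": None,
    "port_forwards": None,
    "ssh": True,            # "enable ssh access"
    "logging": True,        # assumption: syslogd and klogd are in the ps output
}


def differing(a, b):
    """Settings known on both sides whose values differ."""
    return sorted(
        k for k in a
        if a[k] is not None and b.get(k) is not None and a[k] != b[k]
    )


def unknown(cfg):
    """Settings that still need to be filled in."""
    return sorted(k for k, v in cfg.items() if v is None)


print(differing(duxa, paulgo))  # → ['dynamic_dns', 'logging', 'ssh']
print(unknown(duxa))            # → ['port_forwards', 'radios', 'router_dns_dhcp', 'upnp']
```

The differing settings are the natural candidates to toggle one at a time on the crashing router.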

CTF is a blob that is provided by Broadcom. I don't believe its implementation has changed.
Duxa
DD-WRT User


Joined: 16 Aug 2013
Posts: 191

Posted: Sun Jun 19, 2022 20:54
PaulGo wrote:
CTF is a blob that is provided by Broadcom. I don't believe its implementation has changed.


Yeah, but it most likely relies on libraries in the kernel, which, if updated, may introduce incompatibilities.

Like the whole ipv6 (legacy) error, which is likely due to the ipv6 library not being updated while the kernel was. So it's making deprecated calls.
dale_gribble39
DD-WRT Guru


Joined: 11 Jun 2022
Posts: 1899

Posted: Sun Jun 19, 2022 21:22
PaulGo wrote:
CTF is a blob that is provided by Broadcom. I don't believe its implementation has changed.

If this is true, then it is the blob for the Linux 2.6.36.4 kernel; for some reason, I *kinda* doubt that it is.
Duxa wrote:
Yeah, but it most likely relies on libraries in the kernel, which, if updated, may introduce incompatibilities.

Like the whole ipv6 (legacy) error, which is likely due to the ipv6 library not being updated while the kernel was. So it's making deprecated calls.

See above. I'm also thinking that ipv6 is kernel version dependent. But enough speculation, conjecture, and possible misinformation. Proof carries more weight.

Page 3 of 4. All times are GMT.