nvram show from cfe truncates output and locks up router

DD-WRT Forum Index -> Broadcom SoC based Hardware
thenextdon13
DD-WRT User


Joined: 04 Nov 2006
Posts: 89
Location: The Dalles, Oregon USA

Posted: Sun Sep 20, 2009 8:06    Post subject: nvram show from cfe truncates output and locks up router
While trying to troubleshoot wl500w bricking issues, I ran into an interesting situation (DD-WRT build 12533 mega).

In brief, nvram show from cfe not only causes the router to stop responding partway through the output and require a reboot, but its output is also massively truncated compared with nvram show from an ssh console.

I am particularly interested in seeing someone do the following with both another Asus wl500w and some other make/model of router.

Can someone with serial console access to their router do the following for me?
- Obtain serial access
- Break startup by holding Ctrl+C while plugging in the router until ^C characters start showing on screen
- Run nvram show
- Take note of the last couple of variables displayed and/or (preferably) copy the entire output to a file
- Note whether the router stopped responding to keyboard input
- Hard reboot the router (if locked up, otherwise use the 'reboot' command)
- Log in via ssh or telnet
- Run nvram show from the linux prompt
- Take note of the last couple of variables shown and/or (preferably) copy the output to a file
Then
-- Compare the two files / last variables (cfe vs linux console nvram show)
-- Do they match?


I don't *think* that nvram show in cfe should crash the router or cfe, but I am not sure.

I can't help but wonder if there is some CFE or other system bug that is corrupting nvram and thus causing the bricking.

Ideas/Thoughts/Suggestions....?
thenextdon13
Posted: Wed Sep 23, 2009 4:04    Post subject: Need cfe nvram show output from non wl500w router
Re-titling thread to focus more on what I am in need of...

The Asus wl500w locks up hard after an nvram show from cfe.

This wouldn't appear to be normal... but I am looking for confirmation..

thanks!
thenextdon13
Posted: Wed Sep 23, 2009 5:38
Looking through the source for the router, I think there could be a hard limit on the nvram show length, depending on a set of constants fixed at compile time by Asus (if Asus tweaked the values in the Broadcom code at all).

First, let's look at the code that you are actually calling from the command line when you run nvram show

From GPL_WL_500W_2006/WL500W/src/cfe/cfe/arch/mips/board/bcm947xx/src/
Code:

static int
ui_cmd_nvram(ui_cmdline_t *cmd, int argc, char *argv[])
{
        char *command, *name, *value, *buf;
...
...
...
 else if (!strcmp(command, "show") || !strcmp(command, "getall")) {
                if (!(buf = KMALLOC(NVRAM_SPACE, 0)))
                        return CFE_ERR_NOMEM;
                nvram_getall(buf, NVRAM_SPACE);
                for (name = buf; *name; name += strlen(name) + 1)
                        printf("%s\n", name);
                size = sizeof(struct nvram_header) + (name - buf);
                printf("size: %d bytes (%d left)\n", size, NVRAM_SPACE - size);
                KFREE(buf);
        }


So, for the nvram 'show' command, the code tries to allocate a buffer of NVRAM_SPACE bytes (returning CFE_ERR_NOMEM if the allocation fails), then calls nvram_getall with buf and NVRAM_SPACE.
Once it has retrieved nvram using that function, the for loop steps through the buffer one NUL-terminated name=value string at a time and prints each one.
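
To make that loop concrete, here is a minimal stand-alone sketch (not CFE code) that walks a buffer in the same name=value\0 ... \0\0 layout. The function names count_entries and used_bytes are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of CFE): count the entries in a buffer laid
 * out as a run of NUL-terminated "name=value" strings followed by an empty
 * string, using the same walk as the 'show' loop. */
size_t count_entries(const char *buf)
{
    size_t n = 0;
    const char *name;

    assert(buf != NULL);
    /* Each entry ends with a NUL; an empty string marks the end of the list. */
    for (name = buf; *name; name += strlen(name) + 1)
        n++;
    return n;
}

/* Hypothetical helper: bytes consumed by the entries, i.e. the (name - buf)
 * term that the 'show' code adds to the header size. */
size_t used_bytes(const char *buf)
{
    const char *name;

    assert(buf != NULL);
    for (name = buf; *name; name += strlen(name) + 1)
        ;
    return (size_t)(name - buf);
}
```

The walk depends entirely on the buffer ending with an empty string; if that terminator were missing, strlen would keep reading past the end of the data.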

This leads to 2 questions...
1. What is the definition of nvram_getall and
2. Where is NVRAM_SPACE defined?

Ok, so I found the declaration of nvram_getall here:
GPL_WL_500W_2006/WL500W/src/include/bcmnvram.h
Code:

/*
 * Get all NVRAM variables (format name=value\0 ... \0\0).
 * @param       buf     buffer to store variables
 * @param       count   size of buffer in bytes
 * @return      0 on success and errno on failure
 */
extern int nvram_getall(char *nvram_buf, int count);

/*
 * returns the crc value of the nvram
 * @param       nvh     nvram header pointer
 */
uint8 nvram_calc_crc(struct nvram_header * nvh);

#endif /* _LANGUAGE_ASSEMBLY */

/* The NVRAM version number stored as an NVRAM variable */
#define NVRAM_SOFTWARE_VERSION  "1"

#define NVRAM_MAGIC             0x48534C46      /* 'FLSH' */
#define NVRAM_CLEAR_MAGIC       0x0
#define NVRAM_INVALID_MAGIC     0xFFFFFFFF
#define NVRAM_VERSION           1
#define NVRAM_HEADER_SIZE       20
#define NVRAM_SPACE             0x8000

#define NVRAM_MAX_VALUE_LEN 255
#define NVRAM_MAX_PARAM_LEN 64

#define NVRAM_CRC_START_POSITION        9 /* magic, len, crc8 to be skipped */
#define NVRAM_CRC_VER_MASK      0xffffff00 /* for crc_ver_init */

#ifdef  __cplusplus
}
#endif


#endif /* _bcmnvram_h_ */


Notice the hard-coded constant here, #define NVRAM_SPACE 0x8000. The header only declares nvram_getall (a prototype), which is why nothing here appears to act on the count variable or the *nvram_buf pointer.

How much space does that actually consist of? Well, 8000 hex is 32768... but does that tell us anything..? Is it Bits? Bytes?

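Ah, in another place we have the definition of nvram_getall, which actually acts on the arguments (the header above only declares it). Note that this file lives under src/linux, so it is the Linux kernel's wrapper; CFE presumably has its own wrapper around the same shared code: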
GPL_WL_500W_2006/WL500W/src/linux/linux/arch/mips/brcm-boards/bcm947xx/nvram_linux.c

Code:

nvram_getall(char *buf, int count)
{
        unsigned long flags;
        int ret;

        spin_lock_irqsave(&nvram_lock, flags);
        ret = _nvram_getall(buf, count);
        spin_unlock_irqrestore(&nvram_lock, flags);

        return ret;
}



which calls _nvram_getall in GPL_WL_500W_2006/WL500W/src/shared/nvram.c....
Code:

_nvram_getall(char *buf, int count)
{
        uint i;
        struct nvram_tuple *t;
        int len = 0;

        bzero(buf, count);

        /* Write name=value\0 ... \0\0 */
        for (i = 0; i < ARRAYSIZE(nvram_hash); i++) {
                for (t = nvram_hash[i]; t; t = t->next) {
                        if ((count - len) > (strlen(t->name) + 1 + strlen(t->value) + 1))
                                len += sprintf(buf + len, "%s=%s", t->name, t->value) + 1;
                        else
                                break;
                }
        }

        return 0;
}



Ok, so if you follow through these, "NVRAM_SPACE" appears to be initially hard-set by the #define statement (bcmnvram.h) then passed into nvram_getall (nvram_linux.c) as variable 'count', and passed again to _nvram_getall (nvram.c), where nvram is actually iterated through.

I do not quite understand every detail of how _nvram_getall uses NVRAM_SPACE, but it treats count as the size of the buffer: a name=value pair is copied only if it still fits in the remaining space, and otherwise the loop over the current hash bucket stops.
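
Here is a small sketch of that bounds check in isolation. It is a simplification, not the Broadcom code: struct pair and pack_pairs are made-up names, and a flat array stands in for the hash table, so this version stops at the first pair that does not fit, whereas the original's break only leaves the current bucket.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for struct nvram_tuple: a flat array instead of the
 * hash table used by the real code. */
struct pair {
    const char *name;
    const char *value;
};

/* Sketch of the bounds check in _nvram_getall: copy NUL-terminated
 * "name=value" entries into buf until the next entry would not fit in the
 * remaining space. Returns the number of entries copied. */
size_t pack_pairs(char *buf, size_t count, const struct pair *p, size_t n)
{
    size_t i, len = 0;

    assert(buf != NULL && count > 0);
    memset(buf, 0, count);          /* leaves the list terminated by NULs */

    for (i = 0; i < n; i++) {
        size_t need = strlen(p[i].name) + 1 + strlen(p[i].value) + 1;

        if (count - len <= need)    /* same test as the original, inverted */
            break;
        len += (size_t)sprintf(buf + len, "%s=%s", p[i].name, p[i].value) + 1;
    }
    return i;
}
```

If this reading is right, a buffer that is too small gives a short listing with no error code, rather than an overflow, so this check on its own would not explain a hang.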

But what is the NVRAM_SPACE variable actually set to? Is it set statically to 0x8000 by the above include file?

A bit more looking for how the variable was used, and found this little gem...
Code:

cfe/cfe/arch/mips/board/bcm947xx/src/ui_bcm947xx.c:        printf("size: %d bytes (%d left)\n", size, NVRAM_SPACE - size);


What this tells us is that the output from nvram show designating free space is a calculation made using that same NVRAM_SPACE variable.

From the dd-wrt command prompt, nvram show works fine on the Asus wl500w and prints a summary line at the end. Its 'size' and 'left' figures add up to 32768, which is the value defined above, answering the earlier question: 0x8000 is a number of bytes.
Code:

size: 29709 bytes (3059 left)
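
The arithmetic is easy to check mechanically. A tiny sketch, where totals_match is a made-up helper name and NVRAM_SPACE is copied from the bcmnvram.h excerpt quoted earlier:

```c
#include <assert.h>

#define NVRAM_SPACE 0x8000  /* value from the bcmnvram.h excerpt */

/* Hypothetical helper: returns 1 when the "size" and "left" figures printed
 * by nvram show add up to NVRAM_SPACE, 0 otherwise. */
int totals_match(int size, int left)
{
    assert(size >= 0 && left >= 0);
    return size + left == NVRAM_SPACE;
}
```

For the line above, totals_match(29709, 3059) returns 1, since 29709 + 3059 = 32768 = 0x8000.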


Does dd-wrt rely on CFE to call nvram functions? I wouldn't think it does as CFE is a bootloader...

If dd-wrt doesn't interface with CFE for nvram functions, then the value set for NVRAM_SPACE during the compilation of the code could be different between CFE and dd-wrt, right?

In the case of truncated output, that could be what is biting us here... the NVRAM_SPACE variable was set in CFE to be much smaller than dd-wrt uses.

Could this cause other issues with dd-wrt flashing to nvram? We need to look more closely into where else NVRAM_SPACE is used...

Of course, this is all just some educated guessing and conjecture..

Thoughts? Suggestions? Comments?
thenextdon13
Posted: Wed Sep 23, 2009 17:10
Thinking about this..

Tonight if I have time I will reflash the router to factory firmware and see how nvram show works at CFE.

If it iterates entirely through nvram, it will print the totals used and available, letting us know what NVRAM_SPACE CFE was compiled with.
thenextdon13
Posted: Thu Sep 24, 2009 3:19
Okay,

Did a series of troubleshooting tonight including flashing back to stock firmware, checking cfe nvram show, upgrading firmware and doing the same, changing settings and doing the same.

The result is that something in the closing of the ssh and vpn certs / keys is crashing the nvram show in cfe. In one variable's case it actually caused a panic/reboot.

This indicates we need to figure out what characters are causing the problem... note that it did not happen on the openvpn_dh variable.

I think this type of problem __May__ still be causing the bricking, but it's hard to tell if this is *normal* on most routers...

No one else can help? :( All I need is for someone to set up ssh access so that nvram includes sshd_dss_host_key, then serial in to cfe and do an nvram show....

More detailed information below.

Reflashed the router using the wl500g-clear-nvram.trx, wl500g-recover.trx then the factory WL500W_2.0.0.6_EN_CN_TW_DE_KR.trx.

Confirmed that nvram show from cfe would complete successfully, and checked the NVRAM_SPACE variable by adding the used and left.

Code:

...
url_date_x=1111111
ddns_server_x=
wl_antdiv=-1
usb_bannum_x=0
size: 8522 bytes (24246 left)
*** command status = 0
CFE> 8522 + 24246 = 32768 -- Not the problem!? Okay...


So the problem is not a compile-time change to the constant, since 32768 is the same value that dd-wrt uses. Hmm. Well, let's get into dd-wrt and see what happens with cfe nvram show...

Flashed with dd-wrt.v24_mini_asus.trx to see if nvram show in cfe would work...
Code:

...
rc_custom=
filter_dport_grp2=
wl_antdiv=-1
usb_bannum_x=0
size: 22570 bytes (10198 left)
*** command status = 0
CFE> 22570 + 10198 = 32768

nvram show from cfe still completes successfully and adds to 32768


Flashed dd-wrt.v24_mega_generic.bin (build 12533) to see if nvram show in cfe would still work
Code:

filter_dport_grp2=
rc_custom=
wl0_wds9_ipaddr=
pptpd_client_mru=1450
size: 17042 bytes (15726 left)
*** command status = 0
CFE> 17042 + 15726 = 32768

nvram show from cfe still completes successfully (and adds to 32768)

Used the cfe 'save' command to save a copy of all of the area defined as text by cfe...
Code:

CFE> save 192.168.11.10:nvram_default_mega 0x80800000 203184
TFTP Client.
2109828 bytes written to 192.168.11.10:nvram_default_mega



Ok, so now cfe nvram show is working just fine.. time to start configuring...
First, let's configure my standard stuff on the services page...
Does it still work?

Ooo it does not-- freezes after sshd_dss_host_key
Code:

wl_macmode1=disabled
sshd_dss_host_key=-----BEGIN DSA PRIVATE KEY-----
MIIBuwIBAAKBgQC3GYmV84Wa/56fYS+wx4wTTAp54vGdoI2xb1nCUQpZuOeT0c1d
/n0KhkPXvTv9suEcnIZeBq9vvXyuGERII/446xCDx2cW4G0kmmYQxmgeBc4rX486
YVjs+xYkGLYszH02jIShq4/1rTvoXlE5BzVlZ84Wviq71WQVWz8Dn14vLwIVALxd
wMzoavKUESDNJHsTONePcbnDAoGADn4kJoyI3PKErQyBjAjh39qvzGob+fcGoaR/
8XCFXQgrUhcUeMrx+quHM3TI9fzV2m7mclCptspMtYV/L+IHSOfeqnr6WwyDFyVR
k1okkhFPwrP9oZp4a+YlPkAp+kUAtvjhFKBgnM3/JylhpUMCLOx6vxOkhH2FdksZ
fnZbT80CgYEAhA0/iSxXAKSHPZwl4o8WPijXqV6PigmgSgHxM+A9kAf+8SO5z8Mi
PGjDem/sh3r2ltL/bXjo1S8NkaQvIXM4fXSBYM0/wmQffRu5qRFQNKOxWVcET9dU
CEembRnovO3LTxgoIwigsJ2xSj+Q1apOFGxOXoZqMXOc0eO4fBIs0FoCFHE8Z9SP
lQ2Qu06/HS8hphjsufoS
-----END DSA PRIVATE KEY-----

(i manually rebooted here)

CFE version 1.0.37 for BCM947XX (32bit,SP,LE)






Let's try resetting defaults and then not enabling ssh...
Code:

...
oet1_fragment=0
oet5_rem=192.168.90.1
size: 22368 bytes (10400 left)
*** command status = 0
CFE> reboot


Works...., ok, let's try some other key inputs-- i.e. openvpn


Code:

openvpn_client=-----BEGIN CERTIFICATE-----
MIIEIjCCA4ugAwIBAgIBAjANBgkqhkiG9w0BAQUFADCBnzELMAkGA1UEBhMCVVMx
...
...
...
6fP9uNR4
-----END CERTIFICATE-----


Oops, that broke it... dies on End Certificate line of openvpn_client variable:

OK, let's remove that from the config and reboot...

Code:

...
eth1_netmask=0.0.0.0                                                                                                                                         
openvpn_key=-----BEGIN RSA PRIVATE KEY-----                                                                                                                 
MIICXAIBAAKBgQDfemd3/k7qqohuA2JKFzGm3taPgRBMir7y39NG7u6r
...
...
UQZ5KOAVa1c94vKBk+d2hbVILX8bJWWoOC2VKqp5JzU=                                                                                                                 
-----END RSA PRIVATE KEY-----       

Now it hangs at the end of the openvpn_key variable.. remove that and let's see where we get..

Code:

openvpn_ca=-----BEGIN CERTIFICATE-----                                                                                                                       
MIIDyDCCAzGgAwIBAgIJALY+Q7TTKF+qMA0GCSqGSIb3DQEBBQUAMIGfMQswCQYD                                                                                             
...
...
Ca8XF8QrTBRKwvndSdBmXuM2puZOzw6xJzwDo1RToOgBjYtEeEKvIOSONt+zpvGB                                                                                             
DJ1Q2X27EY27KCbI                                                                                                                                             
-----END CERTIFICATE-----                                                                                                                                   
**Exception 8: EPC=7A636C78, Cause=00008008 (TLBMissRd)                                                                                                     
                RA=7A636C78, VAddr=7A636C78                                                                                                                 
                                                                                                                                                             
        0  ($00) = 00000000     AT ($01) = 80830000                                                                                                         
        v0 ($02) = 0000057C     v1 ($03) = 00000000                                                                                                         
        a0 ($04) = 00000000     a1 ($05) = 8089B190                                                                                                         
        a2 ($06) = 80835888     a3 ($07) = 00000000                                                                                                         
        t0 ($08) = 8089B7DC     t1 ($09) = 80835858                                                                                                         
        t2 ($10) = 00000000     t3 ($11) = 00000005                                                                                                         
        t4 ($12) = B8000000     t5 ($13) = 000000C0                                                                                                         
        t6 ($14) = 00000000     t7 ($15) = 00000000                                                                                                         
        s0 ($16) = 47626852     s1 ($17) = 8085A670                                                                                                         
        s2 ($18) = FFFFFFFD     s3 ($19) = 8089B6C8                                                                                                         
        s4 ($20) = 00000080     s5 ($21) = 00000000                                                                                                         
        s6 ($22) = 00000000     s7 ($23) = 00000001                                                                                                         
        t8 ($24) = 10000000     t9 ($25) = 00000000                                                                                                         
        k0 ($26) = 79564764     k1 ($27) = 676B6E63                                                                                                         
        gp ($28) = 808399B0     sp ($29) = 8089B468                                                                                                         
        fp ($30) = 00000000     ra ($31) = 7A636C78                                                                                                         
                                                                                                                                                             
                                                                                                                                                             
                                                                                                                                                             
CFE version 1.0.37 for BCM947XX (32bit,SP,LE)                                                                                                               
Build Date: |  7� 26 16:41:16 CST 2007 (root@localhost.localdomain)                                                                                         
Copyright (C) 2000,2001,2002,2003 Broadcom Corporation.                                                                                                     
                                               
...


Ah, interesting-- printing openvpn_ca triggers a CPU exception in CFE (TLBMissRd; no kernel is running at this point) and the router immediately reboots!


So, what happens if we use cfe to do an nvram get of openvpn_ca...?
Code:

CFE> nvram get openvpn_ca                                                                                                                                   
-----BEGIN CERTIFICATE-----                                                                                                                                 
MIIDyDCCAzGgAwIBAgIJALY+Q7TTKF+qMA0GCSqGSIb3DQEBBQUAMIGfMQswCQYD                                                                                             
...
...
...
DJ1Q2X27EY27KCbI                                                                                                                                             
-----END CERTIFICATE-----                                                                                                                                   


Router locks up!

Okay, how about an nvram unset?
Code:

CFE> nvram unset openvpn_ca
*** command status = 0
CFE> nvram get openvpn_ca
*** command status = 0
CFE>


Well, ok-- let's see now if nvram show will run...
Code:

...
wl0_wds9_ipaddr=
rc_custom=
filter_dport_grp2=
size: 23279 bytes (9489 left)
*** command status = 0
CFE>

YUP!
Ok, let's commit the change ... and dump memory...

Code:

CFE> nvram commit
*** command status = 0
CFE> save 192.168.11.10:nvram_vpn_enabled_minus_client_public_cert_and_private_key_and_server_public_cert 0x80800000 203184
TFTP Client.
2109828 bytes written to 192.168.11.10:nvram_vpn_enabled_minus_client_public_cert_and_private_key_and_server_public_cert
*** command status = 0
CFE> reboot


So, what character or series of characters is it that is causing cfe to die, and is it causing the bricking?

Comments/Thoughts/Ideas?
thenextdon13
Posted: Thu Sep 24, 2009 3:42
P.S. it appears to not be the key ending alone, but some combination of key/cert chars and the ending pattern...

If the entire certs are replaced with just the header/footer and 'aaaa', nvram show in CFE works fine...

Code:

...
...
nstx_ipenable=0
openvpn_ca=-----BEGIN CERTIFICATE-----                                                                                                                       
aaaa                                                                                                                                                         
-----END CERTIFICATE-----   
wl_wme_txp_be=7 3 4 2 0                                                                                                                                     
...
...


Weird bug... and questionable as to whether it is causing the bricking or not, but I HAVE had the most problems with bricking on my routers while I was configuring VPN on the services page.....
Drats
DD-WRT User


Joined: 01 Feb 2007
Posts: 138
Location: Wherever the boat takes me.

Posted: Thu Sep 24, 2009 14:47
Is it possible that the private key or private cert contains one of the following characters:
", $, `, and \
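
A quick way to test a value for those four characters, as a minimal sketch (first_special is a made-up name, not something from dd-wrt or CFE):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the first double quote, dollar
 * sign, backtick or backslash in value, or NULL if none is present. */
const char *first_special(const char *value)
{
    assert(value != NULL);
    return strpbrk(value, "\"$`\\");
}
```

PEM bodies are base64 (A-Z, a-z, 0-9, +, / and =) with dashes, spaces and newlines in the BEGIN/END lines, so a check like this would be expected to return NULL for keys and certs in that format.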

_________________
Ray

Asus RT-N66U B1, AP Router, Merlin Version 3.0.0.4.376.49_4, bl_version=1.0.1.3
Asus WL-500W, DD-WRT v24-sp2 (05/27/14) mini-usb-ftp - build 24160
Buffalo WHR-G54S, Repeater:
_____ broadcom, DD-WRT v24-sp2 (03/29/14) mini - build 23838, AutoAP (2013-10-01)
Buffalo WHR-HP-G54, Repeater:
_____ broadcom, DD-WRT v24-sp2 (03/29/14) mini - build 23838, AutoAP (2013-10-01)
LinkSys WRT54G-V2, Repeater:
_____ broadcom, DD-WRT v24-sp2 (01/17/15) mini - build 25948 , AutoAP (2013-10-01)
thenextdon13
Posted: Fri Sep 25, 2009 2:22
Ray-
On the first key I see killing nvram (at this point sshd_dss_host_key) I don't see any ", $, ` or \

However there are plenty of / and +

What are you chasing after?

thanks!
Drats

Posted: Fri Sep 25, 2009 3:47
Those are characters that many programmers have trouble with, especially on linux. I do know that dd-wrt chokes on them if they are used as a value in nvram.

There are others, but those are the biggest problems. They are legal characters, just not programmer friendly, and they usually require special handling when used in values.

All times are GMT