Posted: Wed Oct 26, 2022 17:12 Post subject: Potential bug in the radio driver in some TP-Link devices?
I posted the following to the build info yesterday for a TP-Link ARCHER-A7 v5:
Previously running a build from October 2021
"With this build the router flashes fine and sets up, and I'm able to do a factory reset with the new build with no issues. However, over a 24-hour period the wireless radios become increasingly unstable: after 6 hours or so the 5 GHz radio stops advertising its SSID and stops working completely, and by 12 hours the 2.4 GHz radio is still advertising but won't allow any more connections.
Airtime Fairness seems to make no difference either way so I don't see any advantage to disabling it.
Disabling Disassoc Low Ack seems to have fixed the problem. However, I then decided to load OpenWRT version 22.03.2 on the device to see if it had the same problem, and it DID NOT. Disassociate on Low Acknowledgement is also on by default in that firmware.
I dug into this a bit further and ran across a post here with the identical symptoms:
It appears that users in the first thread discovered:
"replaced the ath10k-ct driver with the generic ath10k one. And after a week having zero issues i can only come to the conclusion: while ath10k is very stable, ath10k-ct has serious issues."
"It was caused by a regression in upstream kernel module (mac80211), which regressions only affected some wifi drivers. nbd specified mwlwifi and ath9k in the commit message... The patch fixes this upstream commit:"
In summary, OpenWRT had an older driver in version 19 that had no issues, and a newer kernel seems to have broken it. They then fixed the ath9k driver for that kernel, which appears to have broken certain TP-Link devices. Later, they found that a binary blob ath9k driver from some manufacturer fixes this? I did not deeply follow what the OpenWRT project did, but clearly they stumbled into whatever the problem was.
At least one other user noted that turning the Disassociate option off "fixed" the problem in OpenWRT.
The old dd-wrt builds in 2021 had no problem on this hardware, so I assume they used an older driver and an older kernel that worked together. Since then the kernel was updated, which broke the dd-wrt driver on the TP-Link hardware, just as happened with OpenWRT, yet somehow OpenWRT fixed it with some jiggery-pokery in their firmware for the TP-Link Archer. Meanwhile, shutting off this option also seems to fix it in dd-wrt.
Please consider showing your router logs and any other details that would tell us which driver version and firmware version the wifi hardware is using (I doubt it's the candelatech driver/firmware).
_________________
"The woods are lovely, dark and deep,
But I have promises to keep,
And miles to go before I sleep,
And miles to go before I sleep." - Robert Frost
"I am one of the noticeable ones - notice me" - Dale Frances McKenzie Bozzio
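One way to answer the request above for driver and firmware details is to scan the kernel log for the line the ath10k driver prints when it loads its firmware. The sketch below is only an illustration: it assumes the log text was saved from `dmesg` on the router, that the relevant lines contain `ath10k` followed by `firmware ver`, and that candelatech firmware carries `-ct` in its version string. The function name and the sample lines are made up for the example and are not taken from this router.

```python
import re

# Assumed pattern: the driver logs "... ath10k... firmware ver <version> ...".
FIRMWARE_RE = re.compile(r"ath10k\S*.*?firmware ver (\S+)")


def find_ath10k_firmware(log_text):
    """Return (version, looks_like_ct) for each ath10k firmware line found."""
    results = []
    for line in log_text.splitlines():
        match = FIRMWARE_RE.search(line)
        if match:
            version = match.group(1)
            # Assumption: candelatech firmware versions contain "-ct".
            results.append((version, "-ct" in version.lower()))
    return results


if __name__ == "__main__":
    # Hypothetical log excerpt, for illustration only.
    sample = (
        "[   12.3] ath10k_pci 0000:00:00.0: firmware ver 10.1-ct-example api 2\n"
        "[   12.9] ath10k_pci 0000:00:00.0: firmware ver 10.2.4-example api 5\n"
    )
    for version, is_ct in find_ath10k_firmware(sample):
        print(version, "candelatech" if is_ct else "vanilla/other")
```

Posting the matching log lines themselves, rather than only this summary, would still be the more useful thing to share in the thread.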
The nbd mac80211 regression fix patch has been applied to DD-WRT since December 2021.
The 5 GHz QCA9880 radio has an Advanced checkbox that exposes the FW Type option; VANILLA should be selected.
In general, try reverting to safer settings closer to the defaults, apart from the firmware type.
Firmware Type VANILLA, Protection Mode RTS/CTS None, RTS Threshold Disable, Beacon 100 and DTIM 2,
Wireless Network Mode Mixed, TurboQAM (QAM256) Disable, Sensitivity Range (ACK Timing) 900 or 1350.
Please describe any non-default wireless settings in full detail; if all of that has been exhausted, logs may be helpful.
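The suggested values above can be turned into a quick checklist for describing non-default settings. This is a minimal sketch, not anything DD-WRT provides: the keys are the GUI labels as written in the post (not real nvram variable names, which the thread does not give), `report_differences` is a hypothetical helper, and Protection Mode is left out because the post's wording for it is ambiguous.

```python
# Values transcribed from the post above; keys are GUI labels from the post,
# not nvram variable names.
SUGGESTED = {
    "Firmware Type": ("VANILLA",),
    "RTS Threshold": ("Disable",),
    "Beacon": ("100",),
    "DTIM": ("2",),
    "Wireless Network Mode": ("Mixed",),
    "TurboQAM (QAM256)": ("Disable",),
    "Sensitivity Range (ACK Timing)": ("900", "1350"),
}


def report_differences(current):
    """List settings whose current value is not among the suggested ones."""
    lines = []
    for label, accepted in SUGGESTED.items():
        value = current.get(label, "<not recorded>")
        if value not in accepted:
            wanted = " or ".join(accepted)
            lines.append(f"{label}: {value} (suggested: {wanted})")
    return lines


if __name__ == "__main__":
    # Hypothetical settings read off the GUI by hand, for illustration only.
    current = {
        "Firmware Type": "DDWRT",
        "RTS Threshold": "Disable",
        "Beacon": "100",
        "DTIM": "2",
        "Wireless Network Mode": "Mixed",
        "TurboQAM (QAM256)": "Disable",
        "Sensitivity Range (ACK Timing)": "1350",
    }
    for line in report_differences(current):
        print(line)
```

Pasting the resulting lines into a reply gives others a compact list of what differs from the suggested values.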
I don't generally test builds on anything other than what dd-wrt sets as its default settings.
I don't believe the default on this router model is Vanilla. At the moment the router is running OpenWRT so I can't check the defaults. I have one client, a non-profit, that uses OpenWRT routers exclusively as access points (and they use a lot of them; the building they are in is -extremely- wifi unfriendly), and the reality is that it's kind of difficult to find a "pure" router (on the used market) that runs OpenWRT without issues. The OpenWRT team is absolutely horrible about backwards compatibility, and I have several devices in my "to be debricked" pile that have newer OpenWRT versions flashed to them, but the code is so bloated and the routers run so slowly that you can't even log in to them. So this particular device may end up there and NEVER run dd-wrt again. Complaints about breaking older devices to the OpenWRT team generally fall on deaf ears; their usual go-to is "set up a build environment and build your own customized version of OpenWRT with the bloat removed." Yeah right, I just love to chase bugs that not only you guys introduce but I do in my own builds...NOT.
The reason I flagged this is that the bug in this case isn't that the dd-wrt driver on this hardware is borked (it may or may not be), but rather that the DEFAULTS in the dd-wrt code set the device up in a way that causes this bug to show up.
Remember that there are newbies to dd-wrt who are not as anal-retentive as I am and won't spend several hours chasing down esoteric option settings for their border-case hardware. My hope is that if one of them has one of these devices, flashes it with new code, and sees this issue with the radios, they will search and come across this post, make the change, and be happy with dd-wrt instead of getting frustrated and turning their back on it.
If Brainslayer wants to change the default settings for this particular hardware, that's his affair. My testing shows that he should, but in the past I've thrown many hours into arguing for default settings changes that would avoid bugs, and it's been a major fight to get them into the code, so nowadays I'm not going to carry the torch for that anymore. Instead I'll just raise the issue, and if someone else with one of these units wants to spend the hours needed testing and posting logs to make you guys happy, more power to them.
This is not OpenWRT. If defaults were changed to optimal settings, there would be no opportunity to learn anything, and everyone already knows the defaults are not optimal. Still zero evidence of your claims, and I am sure any other firmware / appliances you manage don't require any configuration out of the box <eyeroll>
"If defaults were changed to optimal settings, there would be no opportunity to learn anything, and everyone already knows the defaults are not optimal."
And this is a stated goal of this project? Funny I don't see it anywhere on the website LOL.
Newbies don't know the defaults on this hardware are not optimal and if I hadn't posted this they still wouldn't even if they had cared to spend the time experimenting. Do you really want this project to end up like OpenWRT - essentially a walled garden of a place where unless you have Blessed Hardware they seem to delight in saying how Broadcom gear is a pile of garbage and quit bothering us and run Tomato or something?
I don't think most people loading alternative firmware on routers expect it to work out of the box, but when they have a problem they are going to go searching the forums. Now they have something that will come up as a hit for this hardware. Before, they wouldn't.
As for proof, I'll point out that this is, after all, marginal hardware anyway, hardly worth the $9.99 or whatever I paid for it when I ran across it in a surplus bin somewhere, and it runs OpenWRT only barely adequately too. It's certainly not a platform that you would use to show off the advantages of alternative firmware. It is what it is, an elderly device that is perfectly adequate to use as a Wireless Access Point but hardly what I'd recommend for serious routing work. Its biggest selling point is that it has a usable TFTP server in the boot loader, so it's easy to unbrick.
If you are really serious about using dd-wrt in production, as has been said before, you buy yourself Nighthawk 7000s or something, which are readily and cheaply available on the used market, have adequate cooling, plenty of RAM, and hardware acceleration, and are very easy to load dd-wrt on.
And as for others' messed-up defaults, you cannot possibly be recommending that the goal of this project is to be as screwed up out of the box as, say, Microsoft Windows is LOL
As I said, I got a lot more contentment out of just giving up trying to get Brainslayer and others creating the builds to change default settings to make it better, and merely reporting on border cases as I find them. Dev time is limited and my hope is that the devs would not spend inordinate amounts of time verifying bugs on old sketchy hardware in any case. I don't intend to spend a week on this device with different settings proving out anything. Take this post or leave it, as it stands. With luck someone with more time than me and fewer devices will see this and it will help them.
The defaults are blanket across the board by platform. To change these for specific hardware on similar devices would require conditional compile options in the code and code size / reduction is a key factor in ongoing development, if you weren't already aware. Quite honestly, this thread seems to be becoming a Chicken Little attempt at bikeshedding and as such should be avoided and addressed by someone far more internal to the food chain or company than I. I should refrain from commenting at all, or further comment towards you, as I am almost certain that a titanium wiffleball bat would smack the forum rules quite profusely if I do... if it hasn't already.