From nobody@FreeBSD.org Thu Nov 9 22:07:53 2006
Return-Path:
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
        by hub.freebsd.org (Postfix) with ESMTP id F08EA16A47E
        for ; Thu, 9 Nov 2006 22:07:52 +0000 (UTC)
        (envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
        by mx1.FreeBSD.org (Postfix) with ESMTP id 82D7643D8B
        for ; Thu, 9 Nov 2006 22:07:39 +0000 (GMT)
        (envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
        by www.freebsd.org (8.13.1/8.13.1) with ESMTP id kA9M7cJ8090147
        for ; Thu, 9 Nov 2006 22:07:38 GMT
        (envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
        by www.freebsd.org (8.13.1/8.13.1/Submit) id kA9M7cpJ090146;
        Thu, 9 Nov 2006 22:07:38 GMT
        (envelope-from nobody)
Message-Id: <200611092207.kA9M7cpJ090146@www.freebsd.org>
Date: Thu, 9 Nov 2006 22:07:38 GMT
From: meyer
To: freebsd-gnats-submit@FreeBSD.org
Subject: ath device stops TX
X-Send-Pr-Version: www-3.0

>Number: 105348
>Category: kern
>Synopsis: [ath] ath device stops TX
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-wireless
>State: feedback
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Thu Nov 09 22:10:21 GMT 2006
>Closed-Date:
>Last-Modified: Mon Apr 11 11:47:12 UTC 2011
>Originator: meyer
>Release: releng_6
>Organization: prowip
>Environment:
FreeBSD ap-h.matik.com.br 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #0: Thu Nov 9 07:37:32 BRST 2006
>Description:
This is about an ath device running in hostap mode. It happens with both
ath_rate_onoe and ath_rate_sample; the rate module makes no difference.

The following is the typical sequence, as it happens daily:

Nov 8 12:53:53 ap-h kernel: ath2: discard oversize frame (ether type 5e4 flags 3 len 1522 > max 1514)
Nov 8 12:54:23 ap-h last message repeated 2 times
Nov 8 12:56:25 ap-h last message repeated 4 times
Nov 8 12:58:41 ap-h last message repeated 18 times
Nov 8 13:00:00 ap-h root: WIP: 135 estações conectadas.
Nov 8 13:06:54 ap-h kernel: ath2: device timeout

(The 13:00:00 entry is a local status message: "135 stations connected".)

athstats shows something like the following. The most interesting counter
is "tx stopped", because TX really did stop while the card was still
receiving: upload and RX traffic kept going through.

1974355 data frames received
2199237 data frames transmit
32016 tx frames with an alternate rate
799516 long on-chip tx retries
31093 tx failed 'cuz too many retries
11M current transmit rate
86472 tx management frames
31 tx frames discarded prior to association
57093 tx stopped 'cuz no xmit buffer
64442 tx frames with no ack marked
1167515 tx frames with short preamble
11001938 rx failed 'cuz of bad CRC
14966831 rx failed 'cuz of PHY err
    14890917 CCK timing
    75914 CCK restart
489724 beacons transmitted
1673 periodic calibrations
13 rssi of last ack
23 avg recv rssi
-98 rx noise floor
64369 cabq frames transmitted
117 cabq xmit overflowed beacon interval
28922 switched default/rx antenna
Antenna profile:
[1] tx 1135988 rx 1099419
[2] tx 1117796 rx 1177433

It makes no difference what I set ath_txbuf to (500, 1000, 5000, 10000 or
30000); the same thing always happens, two or three times a day.

Interestingly, when I boot exactly the same hardware with the same setup
under 5.4-R it is rock stable; the ath device does not hang.

I tried different ath cards and that does not make any difference either.

The athstats "tx stopped" event always occurs but does not always cause a
real hang. It seems that the driver does not recover from sporadic
"tx stopped" events: the counter keeps climbing, and once it reaches a
certain level the device hangs.
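To check whether the counter really climbs before a hang, the "tx stopped" value can be extracted from saved athstats output and compared between two samples. The following is only a minimal Python sketch, not part of the driver or of athstats. The names parse_athstats and tx_stopped_delta are hypothetical, and the code assumes counter lines have the "number, then description" layout seen in the output quoted above.

```python
import re

# Assumed line layout, taken from the athstats output quoted above:
# an integer counter, whitespace, then a free-text description.
# Lines such as "11M current transmit rate" or "[1] tx ..." do not match.
COUNTER_LINE = re.compile(r"^\s*(-?\d+)\s+(\S.*?)\s*$")

# Description text of the counter this report is concerned with.
TX_STOPPED = "tx stopped 'cuz no xmit buffer"


def parse_athstats(text):
    """Return {description: value} for every integer counter line."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters


def tx_stopped_delta(before, after):
    """Return how much the 'tx stopped' counter grew between two samples."""
    old = parse_athstats(before).get(TX_STOPPED, 0)
    new = parse_athstats(after).get(TX_STOPPED, 0)
    return new - old
```

Saving the athstats output at regular intervals and logging the delta between consecutive samples would show whether the counter rises steadily ahead of a hang. Any threshold for acting on the delta would have to be found by observation; this report does not establish one.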

It is also interesting that I can get the device back if I am there in
time: an ifconfig down and up recovers it. If I am too late, I need a
complete reboot.

No other system statistics shed any light on this, but I have them; ask me
if you want to know anything else.

I also suspected an if_bridge problem, but when I swap the ath device for a
wi card the problem goes away. wi is stable too, on the same machine and in
the same external environment.

Apparently the problem is related to how many stations are associated: 30
seems to be the magic number and 40 seems to guarantee a hang. In other
words, the problem does not happen with fewer than 30 stations, happens
occasionally at about 30, and happens once or more a day with more than 30
stations associated.
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: vwe
Responsible-Changed-When: Wed Jan 14 23:31:27 UTC 2009
Responsible-Changed-Why:
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=105348

State-Changed-From-To: open->feedback
State-Changed-By: arundel
State-Changed-When: Mon Sep 6 23:24:38 UTC 2010
State-Changed-Why:
Does this issue still occur on a supported branch? Thanks.

http://www.freebsd.org/cgi/query-pr.cgi?pr=105348

Responsible-Changed-From-To: freebsd-net->freebsd-wireless
Responsible-Changed-By: adrian
Responsible-Changed-When: Mon Apr 11 11:45:13 UTC 2011
Responsible-Changed-Why:
shift to freebsd-wireless

http://www.freebsd.org/cgi/query-pr.cgi?pr=105348
>Unformatted: