From nobody@FreeBSD.org Sat Jan 1 01:02:23 2011
Return-Path:
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
        by hub.freebsd.org (Postfix) with ESMTP id B1EFC106564A
        for ; Sat, 1 Jan 2011 01:02:23 +0000 (UTC)
        (envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (unknown [IPv6:2001:4f8:fff6::22])
        by mx1.freebsd.org (Postfix) with ESMTP id A12548FC12
        for ; Sat, 1 Jan 2011 01:02:23 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
        by red.freebsd.org (8.14.4/8.14.4) with ESMTP id p0112NVM063555
        for ; Sat, 1 Jan 2011 01:02:23 GMT
        (envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
        by red.freebsd.org (8.14.4/8.14.4/Submit) id p0112NJf063554;
        Sat, 1 Jan 2011 01:02:23 GMT
        (envelope-from nobody)
Message-Id: <201101010102.p0112NJf063554@red.freebsd.org>
Date: Sat, 1 Jan 2011 01:02:23 GMT
From: Raphael Kubo da Costa
To: freebsd-gnats-submit@FreeBSD.org
Subject: iwn: Network keeps disconnecting when /etc/rc.d/netif restart is run
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         153594
>Category:       kern
>Synopsis:       [iwn] Network keeps disconnecting when /etc/rc.d/netif restart is run
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    bschmidt
>State:          feedback
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Jan 01 01:10:12 UTC 2011
>Closed-Date:
>Last-Modified:  Thu Jan 20 00:50:10 UTC 2011
>Originator:     Raphael Kubo da Costa
>Release:        FreeBSD 8.2-PRERELEASE
>Organization:
>Environment:
FreeBSD gibbon 8.2-PRERELEASE FreeBSD 8.2-PRERELEASE #23: Wed Dec 29 01:41:46 BRST 2010 root@gibbon:/usr/obj/usr/src/sys/GIBBON amd64
>Description:
My wireless network card is an Intel PRO/Wireless 5100, and I'm using the
iwn driver.

/etc/rc.conf contains the following:

  wlans_iwn0="wlan0"
  ifconfig_wlan0="WPA SYNCDHCP"

And /etc/wpa_supplicant.conf has the appropriate settings for some access
points.

When the system boots, the network is established correctly. However,
whenever I restart it via '/etc/rc.d/netif restart' and then ping my access
point, around 10 packets are sent before the network goes down, and
'ifconfig wlan0' shows it is looking for different APs (or even the same AP
on different channels, for example). When a connection to the AP is
established again, it goes down again after a few seconds.

If I do '/etc/rc.d/netif restart' again, the connection stops dropping.
>How-To-Repeat:
/etc/rc.d/netif restart
>Fix:

>Release-Note:
>Audit-Trail:

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Mon Jan 3 19:10:15 UTC 2011
Responsible-Changed-Why:
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=153594

State-Changed-From-To: open->suspended
State-Changed-By: bschmidt
State-Changed-When: Mon Jan 3 19:59:38 UTC 2011
State-Changed-Why:
This is a known issue.

There is a race in devd and our rc-subsystem if wpa_supplicant is involved,
effectively resulting in starting wpa_supplicant twice. Both instances try
to take over the wlan device, which results in what you are seeing.
I have no idea how to fix this right now, so this has to wait until I'm able
to think of a proper fix.

As a workaround, don't use netif restart but
kldunload if_iwn; kldload if_iwn
instead.

Responsible-Changed-From-To: freebsd-net->bschmidt
Responsible-Changed-By: bschmidt
Responsible-Changed-When: Mon Jan 3 19:59:38 UTC 2011
Responsible-Changed-Why:
over to me

http://www.freebsd.org/cgi/query-pr.cgi?pr=153594

From: Eugene Grosbein
To: bug-followup@FreeBSD.ORG
Cc: Bernhard Schmidt
Subject: Re: kern/153594: [iwn] Network keeps disconnecting when /etc/rc.d/netif restart is run
Date: Tue, 04 Jan 2011 14:08:24 +0600

> There is a race in devd and our rc-subsystem if wpa_supplicant is involved,
> effectively resulting in starting wpa_supplicant twice.
> Both instances try to take over the wlan device, which results in what you
> are seeing.
> I have no idea how to fix this right now, so this has to wait until I'm able
> to think of a proper fix.

Perhaps wrapping the wpa_supplicant invocation in "lockf -t0" would help
to eliminate the race?

Eugene Grosbein

From: Bernhard Schmidt
To: Eugene Grosbein
Cc: bug-followup@freebsd.org, freebsd-net@freebsd.org
Subject: Re: kern/153594: [iwn] Network keeps disconnecting when /etc/rc.d/netif restart is run
Date: Tue, 4 Jan 2011 10:06:05 +0100

On Tuesday, January 04, 2011 09:08:24 Eugene Grosbein wrote:
> > There is a race in devd and our rc-subsystem if wpa_supplicant is involved,
> > effectively resulting in starting wpa_supplicant twice. Both instances try
> > to take over the wlan device, which results in what you are seeing.
> > I have no idea how to fix this right now, so this has to wait until I'm
> > able to think of a proper fix.
>
> Perhaps wrapping the wpa_supplicant invocation in "lockf -t0" would help
> to eliminate the race?

Possibly, but I don't think this is the way to go.

Currently wpa_supplicant has this code:
        /*
         * Mark the interface as down to ensure wpa_supplicant has exclusive
         * access to the net80211 state machine, do this before opening the
         * route socket to avoid a false event that the interface disappeared.
         */
        if (getifflags(drv, &flags) == 0)
                (void) setifflags(drv, flags &~ IFF_UP);

This code works such that it will send an event to already running
wpa_supplicant instances which will then terminate. This does indeed work if
there's enough delay between invocations, though if there is just a small
delay (~100ms or something), that event probably doesn't get passed. I think
we should start looking into a possible solution at that point; trying to
figure out why the event doesn't get passed (probably because the interface
is not yet up at that point) will get us closer to a proper solution.

-- 
Bernhard

From: Eugene Grosbein
To: bschmidt@freebsd.org
Cc: bug-followup@freebsd.org, freebsd-net@freebsd.org
Subject: Re: kern/153594: [iwn] Network keeps disconnecting when /etc/rc.d/netif restart is run
Date: Tue, 04 Jan 2011 15:09:15 +0600

On 04.01.2011 15:06, Bernhard Schmidt wrote:

>> Perhaps wrapping the wpa_supplicant invocation in "lockf -t0" would help
>> to eliminate the race?
>
> Possibly, but I don't think this is the way to go.
>
> Currently wpa_supplicant has this code:
>         /*
>          * Mark the interface as down to ensure wpa_supplicant has exclusive
>          * access to the net80211 state machine, do this before opening the
>          * route socket to avoid a false event that the interface disappeared.
>          */
>         if (getifflags(drv, &flags) == 0)
>                 (void) setifflags(drv, flags &~ IFF_UP);
>
> This code works such that it will send an event to already running
> wpa_supplicant instances which will then terminate. This does indeed work if
> there's enough delay between invocations, though if there is just a small
> delay (~100ms or something), that event probably doesn't get passed. I think
> we should start looking into a possible solution at that point; trying to
> figure out why the event doesn't get passed (probably because the interface
> is not yet up at that point) will get us closer to a proper solution.

Proper fine-grained locking has always been a good solution for race
problems :-)
How about using flock(2) in wpa_supplicant source code?

Eugene Grosbein

From: Bernhard Schmidt
To: Eugene Grosbein
Cc: bug-followup@freebsd.org, freebsd-net@freebsd.org
Subject: Re: kern/153594: [iwn] Network keeps disconnecting when /etc/rc.d/netif restart is run
Date: Tue, 4 Jan 2011 10:39:47 +0100

On Tuesday, January 04, 2011 10:09:15 Eugene Grosbein wrote:
> On 04.01.2011 15:06, Bernhard Schmidt wrote:
> >> Perhaps wrapping the wpa_supplicant invocation in "lockf -t0" would help
> >> to eliminate the race?
> >
> > Possibly, but I don't think this is the way to go.
> >
> > Currently wpa_supplicant has this code:
> >         /*
> >          * Mark the interface as down to ensure wpa_supplicant has exclusive
> >          * access to the net80211 state machine, do this before opening the
> >          * route socket to avoid a false event that the interface disappeared.
> >          */
> >         if (getifflags(drv, &flags) == 0)
> >                 (void) setifflags(drv, flags &~ IFF_UP);
> >
> > This code works such that it will send an event to already running
> > wpa_supplicant instances which will then terminate. This does indeed work
> > if there's enough delay between invocations, though if there is just a
> > small delay (~100ms or something), that event probably doesn't get
> > passed. I think we should start looking into a possible solution at that
> > point; trying to figure out why the event doesn't get passed
> > (probably because the interface is not yet up at that point) will get us
> > closer to a proper solution.
>
> Proper fine-grained locking has always been a good solution for race
> problems :-)
> How about using flock(2) in wpa_supplicant source code?

I don't see any flock'able resource shared between instances, do you?

-- 
Bernhard

From: Eugene Grosbein
To: bschmidt@freebsd.org
Cc: bug-followup@freebsd.org, freebsd-net@freebsd.org
Subject: Re: kern/153594: [iwn] Network keeps disconnecting when /etc/rc.d/netif restart is run
Date: Tue, 04 Jan 2011 15:44:36 +0600

On 04.01.2011 15:39, Bernhard Schmidt wrote:

>> Proper fine-grained locking has always been a good solution for race
>> problems :-)
>> How about using flock(2) in wpa_supplicant source code?
>
> I don't see any flock'able resource shared between instances, do you?

Just use pidfile(3) :-)

From: Bernhard Schmidt
To: kubito@gmail.com
Cc: bug-followup@freebsd.org
Subject: Re: kern/153594: [iwn] Network keeps disconnecting when /etc/rc.d/netif restart is run
Date: Mon, 17 Jan 2011 21:27:36 +0100

--Boundary-00=_4YKNNVp7zzG0j0K
Content-Type: Text/Plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

Hi,

can you give attached patch a shot? Just apply it to /etc/devd.conf and
restart devd. This should fix the issue with netif restart.

Thanks.

-- 
Bernhard

--Boundary-00=_4YKNNVp7zzG0j0K
Content-Type: text/x-patch; charset="ISO-8859-1"; name="devd-80211.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="devd-80211.diff"

Index: etc/devd.conf
===================================================================
--- etc/devd.conf (revision 217018)
+++ etc/devd.conf (working copy)
@@ -60,14 +60,6 @@ notify 0 {
 # they have a different media type. We may want
 # to exploit this later.
 #
-detach 0 {
-        media-type "802.11";
-        action "/etc/pccard_ether $device-name stop";
-};
-attach 0 {
-        media-type "802.11";
-        action "/etc/pccard_ether $device-name start";
-};
 notify 0 {
         match "system" "IFNET";
         match "type" "LINK_UP";

--Boundary-00=_4YKNNVp7zzG0j0K--

State-Changed-From-To: suspended->feedback
State-Changed-By: bschmidt
State-Changed-When: Tue Jan 18 10:23:32 UTC 2011
State-Changed-Why:
feedback requested

http://www.freebsd.org/cgi/query-pr.cgi?pr=153594

From: Raphael Kubo da Costa
To: Bernhard Schmidt
Cc: bug-followup@freebsd.org
Subject: Re: kern/153594: [iwn] Network keeps disconnecting when /etc/rc.d/netif restart is run
Date: Tue, 18 Jan 2011 22:41:32 -0200

On 01/17/2011 18:27, Bernhard Schmidt wrote:
> Hi,
>
> can you give attached patch a shot? Just apply it to /etc/devd.conf and
> restart devd. This should fix the issue with netif restart.
>
> Thanks.

Hi,

I applied the patch, then stopped devd and netif (in this order).
After that, I started devd and netif (in this order).

I did not lose packets when pinging a remote host, nor did I lose any
after ~2 netif restarts. The third time, I started losing more
packets than before, and the problem persisted after another restart.

I then stopped devd again, then stopped netif again, started both
again and the problem disappeared. So it seems not to have
completely vanished.

Should I revert the patch?

From: Bernhard Schmidt
To: Raphael Kubo da Costa
Cc: bug-followup@freebsd.org
Subject: Re: kern/153594: [iwn] Network keeps disconnecting when /etc/rc.d/netif restart is run
Date: Wed, 19 Jan 2011 08:14:32 +0100

On Wednesday, January 19, 2011 01:41:32 Raphael Kubo da Costa wrote:
> On 01/17/2011 18:27, Bernhard Schmidt wrote:
> > Hi,
> >
> > can you give attached patch a shot? Just apply it to /etc/devd.conf
> > and restart devd. This should fix the issue with netif restart.
> >
> > Thanks.
>
> Hi,
>
> I applied the patch, then stopped devd and netif (in this order).
> After that, I started devd and netif (in this order).
>
> I did not lose packets when pinging a remote host, nor did I lose any
> after ~2 netif restarts. The third time, I started losing more
> packets than before, and the problem persisted after another restart.
>
> I then stopped devd again, then stopped netif again, started both
> again and the problem disappeared. So it seems not to have
> completely vanished.
>
> Should I revert the patch?

While the 'packet loss' occurs, can you do a 'ps xauw | grep wpa'? If
there aren't 2 instances of wpa_supplicant running, that's a new issue.

-- 
Bernhard

From: Raphael Kubo da Costa
To: bschmidt@freebsd.org
Cc: bug-followup@freebsd.org
Subject: Re: kern/153594: [iwn] Network keeps disconnecting when /etc/rc.d/netif restart is run
Date: Wed, 19 Jan 2011 22:40:22 -0200

On 01/19/2011 05:14, Bernhard Schmidt wrote:
> On Wednesday, January 19, 2011 01:41:32 Raphael Kubo da Costa wrote:
>> On 01/17/2011 18:27, Bernhard Schmidt wrote:
>>> Hi,
>>>
>>> can you give attached patch a shot?
>>> Just apply it to /etc/devd.conf and restart devd. This should fix the
>>> issue with netif restart.
>>>
>>> Thanks.
>>
>> Hi,
>>
>> I applied the patch, then stopped devd and netif (in this order).
>> After that, I started devd and netif (in this order).
>>
>> I did not lose packets when pinging a remote host, nor did I lose any
>> after ~2 netif restarts. The third time, I started losing more
>> packets than before, and the problem persisted after another restart.
>>
>> I then stopped devd again, then stopped netif again, started both
>> again and the problem disappeared. So it seems not to have
>> completely vanished.
>>
>> Should I revert the patch?
>
> While the 'packet loss' occurs, can you do a 'ps xauw | grep wpa'? If
> there aren't 2 instances of wpa_supplicant running, that's a new issue.

Indeed, there are 2 wpa_supplicant instances running when the packet
losses occur. If I stop both devd and netif and start netif, I get one
single wpa_supplicant instance and no packet loss.
>Unformatted:
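
For illustration only, here is a rough sketch of the flock(2) idea raised
earlier in the thread. It is not taken from wpa_supplicant and is not the
fix that was tested above. The helper name acquire_instance_lock() and the
use of a per-interface lock file are assumptions made for this example; the
thread itself notes that the instances share no obviously lockable resource,
so a dedicated lock file (or pidfile(3) from libutil, as suggested) would
have to be introduced.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of wpa_supplicant: try to become the only
 * instance for one interface by taking an exclusive, non-blocking flock(2)
 * on a lock file chosen by the caller.
 *
 * Returns the open descriptor on success; the caller keeps it open for the
 * life of the process, since closing it drops the lock.  Returns -1 with
 * errno set on failure; errno == EWOULDBLOCK means that another open file
 * description (normally another process) already holds the lock.
 */
int
acquire_instance_lock(const char *path)
{
        int fd, saved_errno;

        fd = open(path, O_RDWR | O_CREAT, 0600);
        if (fd == -1)
                return (-1);

        if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
                /* Preserve the flock(2) error across close(2). */
                saved_errno = errno;
                close(fd);
                errno = saved_errno;
                return (-1);
        }
        return (fd);
}
```

Under these assumptions, a daemon would call the helper once at startup with
a path derived from the interface name and exit immediately when it returns
-1 with errno set to EWOULDBLOCK, so that a second instance started by devd
or by the rc scripts never reaches the code that takes over the wlan device.
The kernel releases the lock when the holder exits, so no stale-lock cleanup
is needed.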