From mala@hinterbergen.de  Mon Dec 28 14:25:59 2009
Return-Path: <mala@hinterbergen.de>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0F52B10656C5
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 28 Dec 2009 14:25:59 +0000 (UTC)
	(envelope-from mala@hinterbergen.de)
Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.17.9])
	by mx1.freebsd.org (Postfix) with ESMTP id 9F20D8FC19
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 28 Dec 2009 14:25:58 +0000 (UTC)
Received: from router.hinter.bergen2 (pD95489F0.dip0.t-ipconnect.de [217.84.137.240])
	by mrelayeu.kundenserver.de (node=mrbap1) with ESMTP (Nemesis)
	id 0MeMN5-1NcEDY1kqN-00PaWv; Mon, 28 Dec 2009 15:13:19 +0100
Received: by router.hinter.bergen2 (Postfix, from userid 1000)
	id 8D83B5C04; Mon, 28 Dec 2009 15:13:15 +0100 (CET)
Message-Id: <20091228141315.8D83B5C04@router.hinter.bergen2>
Date: Mon, 28 Dec 2009 15:13:15 +0100 (CET)
From: Manuel Schiller <mala@hinterbergen.de>
Reply-To: Manuel Schiller <mala@hinterbergen.de>
To: FreeBSD-gnats-submit@freebsd.org
Cc: Manuel Schiller <mala@hinterbergen.de>
Subject: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         142102
>Category:       sparc64
>Synopsis:       [nfs] [panic] FreeBSD 8.0 kernel panics on sparc64 when accessing NFS
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-sparc64
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Dec 28 14:30:01 UTC 2009
>Closed-Date:    Mon Dec 12 19:02:27 UTC 2011
>Last-Modified:  Mon Dec 12 19:02:27 UTC 2011
>Originator:     Manuel Schiller
>Release:        FreeBSD 8.0-RELEASE-p1 sparc64
>Organization:
(none)
>Environment:
System: FreeBSD router.hinter.bergen2 8.0-RELEASE-p1 FreeBSD 8.0-RELEASE-p1 #4: Mon Dec 28 11:53:32 CET 2009 root@router.hinter.bergen2:/usr/obj/usr/src/sys/FENCHURCH sparc64


	The machine is a SUN Netra T1 running FreeBSD 8.0. The kernel
	configuration is almost what ships as GENERIC, only a few unneeded NIC
	drivers have been removed, and KDB and DDB have been enabled.

>Description:
	The kernel panics due to an unaligned memory access in the NFS server
	code. I could trace the problem to the statement on line 209 of
	sys/nfsserver/nfs_fha.c which does not satify the alignment requirements
	which sparc64 imposes.

	The backtrace I get from my kernel is below (sorry, kgdb does not work
	properly on sparc64, so I have to use the in-kernel backtrace...):

	panic: trap: memory address not aligned
	cpuid = 0
	KDB: stack backtrace:
	panic() at panic+0x1c8
	trap() at trap+0x4d0
	-- memory address not aligned sfar=0xd6663424 sfsr=0x40029 %o7=0xc051b86c --
	fha_assign() at fha_assign+0x140
	svc_run_internal() at svc_run_internal+0x71c
	svc_thread_start() at svc_thread_start+0x8
	fork_exit() at fork_exit+0x80
	fork_trampoline() at fork_trampoline+0x8
	Uptime: 1m12s
	Dumping 512 MB (2 chunks)
	  chunk at 0: 268435456 bytes |\

	If you need anything or want something tested, please let me know.

>How-To-Repeat:
	Enable the NFS server options in /etc/rc.d on a sparc64 machine
	and try to mount any NFS share from a different machine.

>Fix:
	Replace the assignment on line 209 of sys/nfsserver/nfs_fha.c
	with a call to memcpy, i.e. something like:

	memcpy(&i->fh, &fh.fh_generic.fh_fid.fid_data,
			sizeof(const u_int64_t *));

	This is ugly, but will work for all architectures, regardless of
	alignment constraints imposed. This is likely to be a little
	slower than the original code, but I doubt the difference will be
	noticeable (we're talking about different ways to copy eight
	bytes, so the code would have to be called in a very tight loop
	for things to become noticeably slower).

	The issue with the original assignment is that the structs in
	question do not satisfy the alignment contstraints on sparc64
	for reading a pointer from fid_data. Instead of playing with
	the struct to get the alignment right, it seemed safer to me to
	use a memcpy here, since in future modifications of the struct,
	the alignment constraints are most likely not at the front of the
	mind of the person doing the modification.

	The change has been tested on sparc64 for half a day, and things
	seem to be running smoothly.
>Release-Note:
>Audit-Trail:

From: Manuel Tobias Schiller <mala@hinterbergen.de>
To: bug-followup@FreeBSD.org, mala@hinterbergen.de
Cc:  
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when
 accessing NFS
Date: Mon, 28 Dec 2009 16:40:04 +0100

 Dear all,
 
 the fix I gave in the original problem report appears to be incomplete;
 the kernel still crashes sometimes during NFS access due to improper
 alignment, this time in sys/nfsserver/nfs_srvsubs.c, line 1355. I will
 try to continue debugging the issue and report back with any fixes I
 manage to obtain.
 
 Best regards,
 
 Manuel
 
 -- 
 Homepage: http://www.hinterbergen.de/mala
 OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA)

From: Marius Strobl <marius@alchemy.franken.de>
To: Manuel Tobias Schiller <mala@hinterbergen.de>
Cc: freebsd-sparc64@FreeBSD.org
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS
Date: Mon, 28 Dec 2009 17:28:17 +0100

 On Mon, Dec 28, 2009 at 04:00:09PM +0000, Manuel Tobias Schiller wrote:
 > The following reply was made to PR sparc64/142102; it has been noted by GNATS.
 > 
 > From: Manuel Tobias Schiller <mala@hinterbergen.de>
 > To: bug-followup@FreeBSD.org, mala@hinterbergen.de
 > Cc:  
 > Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when
 >  accessing NFS
 > Date: Mon, 28 Dec 2009 16:40:04 +0100
 > 
 >  Dear all,
 >  
 >  the fix I gave in the original problem report appears to be incomplete;
 >  the kernel still crashes sometimes during NFS access due to improper
 >  alignment, this time in sys/nfsserver/nfs_srvsubs.c, line 1355. I will
 >  try to continue debugging the issue and report back with any fixes I
 >  manage to obtain.
 >  
 
 Note that the original PR is a dupe of PR 140797. Could the second
 panic you're seeing be a result of using an incorrect size parameter
 in your change?
 
 Marius
 

From: Marius Strobl <marius@alchemy.franken.de>
To: Manuel Tobias Schiller <mala@hinterbergen.de>
Cc: bug-followup@FreeBSD.org
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS
Date: Mon, 28 Dec 2009 19:26:47 +0100

 On Mon, Dec 28, 2009 at 06:07:36PM +0100, Manuel Tobias Schiller wrote:
 > On Mon, 28 Dec 2009 17:28:18 +0100
 > Marius Strobl <marius@alchemy.franken.de> wrote:
 > 
 > > On Mon, Dec 28, 2009 at 04:00:09PM +0000, Manuel Tobias Schiller wrote:
 > > > The following reply was made to PR sparc64/142102; it has been noted
 > > > by GNATS.
 > > > 
 > > > From: Manuel Tobias Schiller <mala@hinterbergen.de>
 > > > To: bug-followup@FreeBSD.org, mala@hinterbergen.de
 > > > Cc:  
 > > > Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when
 > > >  accessing NFS
 > > > Date: Mon, 28 Dec 2009 16:40:04 +0100
 > > > 
 > > >  Dear all,
 > > >  
 > > >  the fix I gave in the original problem report appears to be
 > > > incomplete; the kernel still crashes sometimes during NFS access due
 > > > to improper alignment, this time in sys/nfsserver/nfs_srvsubs.c, line
 > > > 1355. I will try to continue debugging the issue and report back with
 > > > any fixes I manage to obtain.
 > > >  
 > > 
 > > Note that the original PR is a dupe of PR 140797. Could the second
 > > panic you're seeing be a result of using an incorrect size parameter
 > > in your change?
 > > 
 > > Marius
 > > 
 > > 
 > Dear Marius,
 > 
 > sorry, I searched, but I failed to spot PR 140797. I don't think that the
 > panic I'm seeing after my change has to do with the size parameter (if
 > it's incorrect though, please let me know). With the change I can read
 > files and do directory lookups just fine (tested with a 45 Gigabyte
 > file). I get the crash the minute I'm trying to write a file to the NFS
 > share.
 > 
 
 Yes, the size parameter in your patch is wrong. I also can't reproduce
 your remaining problem with the stock sources, I haven't checked whether
 it could be releated with using an incorrect size parameter though.
 
 Marius
 

From: Manuel Tobias Schiller <mala@hinterbergen.de>
To: Marius Strobl <marius@alchemy.franken.de>
Cc: bug-followup@FreeBSD.org, Manuel Tobias Schiller <mala@hinterbergen.de>
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when
 accessing NFS
Date: Mon, 28 Dec 2009 20:15:24 +0100

 On Mon, 28 Dec 2009 19:26:47 +0100
 Marius Strobl <marius@alchemy.franken.de> wrote:
 > > Dear Marius,
 > > 
 > > sorry, I searched, but I failed to spot PR 140797. I don't think that
 > > the panic I'm seeing after my change has to do with the size
 > > parameter (if it's incorrect though, please let me know). With the
 > > change I can read files and do directory lookups just fine (tested
 > > with a 45 Gigabyte file). I get the crash the minute I'm trying to
 > > write a file to the NFS share.
 > > 
 > 
 > Yes, the size parameter in your patch is wrong. I also can't reproduce
 > your remaining problem with the stock sources, I haven't checked whether
 > it could be releated with using an incorrect size parameter though.
 > 
 > Marius
 
 Dear Marius,
 
 looking a bit more closely, I also agree that the size parameter is
 incorrect. However, it should not make a difference on sparc64, since
 sizeof(u_int64_t *) = 8 = sizeof(u_int64_t).
 
 I started a new kernel build with the official fix in PR 140797 right
 after I wrote my last e-mail, and with any luck, it should finish in an
 hour or so. I'll let you know if I can reproduce the panic with the
 official fix.
 
 Thanks,
 
 Manuel
 
 -- 
 Homepage: http://www.hinterbergen.de/mala
 OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA)

From: Mark Linimon <linimon@lonesome.com>
To: Manuel Schiller <mala@hinterbergen.de>
Cc: FreeBSD-gnats-submit@FreeBSD.org
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when
	accessing NFS
Date: Mon, 28 Dec 2009 14:26:50 -0600

 Just to add a data point, the Netra T1s in the package build cluster do
 not show this problem, so I'm guessing that you are hitting an edge
 on large file transfers.  We run them pretty hard.
 
 mcl

From: Manuel Tobias Schiller <mala@hinterbergen.de>
To: Mark Linimon <linimon@lonesome.com>
Cc: FreeBSD-gnats-submit@FreeBSD.org, Marius Strobl
 <marius@alchemy.franken.de>, Manuel Tobias Schiller <mala@hinterbergen.de>
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when
 accessing NFS
Date: Mon, 28 Dec 2009 22:06:02 +0100

 On Mon, 28 Dec 2009 14:26:50 -0600
 Mark Linimon <linimon@lonesome.com> wrote:
 
 > Just to add a data point, the Netra T1s in the package build cluster do
 > not show this problem, so I'm guessing that you are hitting an edge
 > on large file transfers.  We run them pretty hard.
 > 
 > mcl
 > 
 Hi Mark, Marius,
 
 thanks for the data point. In the meantime, I managed to test a kernel
 with the "official" fix from PR 140797, and I still get the crash when
 trying to write a moderately sized file (20 Megabytes) to NFS. According
 to the backtrace below, it crashes in line 1355 of
 /usr/src/sys/nfsserver/nfs_srvsubs.c.
 
 Since I'm using the fix from the original problem report and RELENG_8_0
 sources cvsupped on December 20, I guess that means we are either using
 different source trees (i.e. there is something in the sources for the
 kernels used on your machines that helps), or there is some other
 difference which I have not been able to pinpoint yet. Maybe you could
 clarify before I try hacking away or finding other differences...
 
 Thanks,
 
 Manuel
 
 
 ---- backtrack of panic follows ----
 panic: trap: memory address not aligned
 cpuid = 0
 KDB: stack backtrace:
 panic() at panic+0x1c8
 trap() at trap+0x4d0
 -- memory address not aligned sfar=0xfffff800016d08aa sfsr=0x40029
 %o7=0xc0528934 -- nfsm_srvmtofh_xx() at nfsm_srvmtofh_xx+0x24
 fha_assign() at fha_assign+0x12c
 svc_run_internal() at svc_run_internal+0x71c
 svc_thread_start() at svc_thread_start+0x8
 fork_exit() at fork_exit+0x80
 fork_trampoline() at fork_trampoline+0x8
 Uptime: 2m1s
 Dumping 512 MB (2 chunks)
   chunk at 0: 268435456 bytes |
 
 -- 
 Homepage: http://www.hinterbergen.de/mala
 OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA)

From: Mark Linimon <linimon@lonesome.com>
To: Manuel Tobias Schiller <mala@hinterbergen.de>
Cc: FreeBSD-gnats-submit@FreeBSD.org,
	Marius Strobl <marius@alchemy.franken.de>
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when
	accessing NFS
Date: Mon, 28 Dec 2009 15:18:09 -0600

 On Mon, Dec 28, 2009 at 10:06:02PM +0100, Manuel Tobias Schiller wrote:
 > Since I'm using the fix from the original problem report and RELENG_8_0
 > sources cvsupped on December 20, I guess that means we are either using
 > different source trees (i.e. there is something in the sources for the
 > kernels used on your machines that helps), or there is some other
 > difference which I have not been able to pinpoint yet.
 
 It turns out the kernels on the T1s is much older than I thought.
 I know I have been using something pretty close to 8.0R on one of
 my machines at home but I am traveling right now and accessing it
 would be kind of a pain.
 
 All our other sparcs don't use NFS to the same extent the above do
 (they are pxebooted and e.g. /usr will go across NFS.)
 
 mcl

From: Manuel Tobias Schiller <mala@hinterbergen.de>
To: Mark Linimon <linimon@lonesome.com>
Cc: FreeBSD-gnats-submit@FreeBSD.org, Marius Strobl
 <marius@alchemy.franken.de>, Manuel Tobias Schiller <mala@hinterbergen.de>
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when
 accessing NFS
Date: Mon, 28 Dec 2009 22:40:47 +0100

 On Mon, 28 Dec 2009 15:18:09 -0600
 Mark Linimon <linimon@lonesome.com> wrote:
 
 > It turns out the kernels on the T1s is much older than I thought.
 > I know I have been using something pretty close to 8.0R on one of
 > my machines at home but I am traveling right now and accessing it
 > would be kind of a pain.
 > 
 > All our other sparcs don't use NFS to the same extent the above do
 > (they are pxebooted and e.g. /usr will go across NFS.)
 > 
 > mcl
 > 
 
 Hi Mark,
 
 thanks for the clarification. With the situation being what you describe,
 I think I'll try upgrading my sources to the latest RELENG_8_0 ones to be
 on the safe side, check if the problem persists, and if so, I'll try to
 find a solution (unless I hear from someone to try something different).
 
 Another question springs to mind: If /usr goes across NFS, most traffic
 arriving at the NFS server should be reading, right? Because that appears
 to work without a glitch with the fix (at least on my machine). Right now,
 the panic is pushed back to the point when I try to write (i.e. copy a
 new file to the NFS server).
 
 Thanks for the quick reply,
 
 Manuel
 
 -- 
 Homepage: http://www.hinterbergen.de/mala
 OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA)

From: Mark Linimon <linimon@lonesome.com>
To: Manuel Tobias Schiller <mala@hinterbergen.de>
Cc: FreeBSD-gnats-submit@FreeBSD.org,
	Marius Strobl <marius@alchemy.franken.de>
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when
	accessing NFS
Date: Mon, 28 Dec 2009 15:44:21 -0600

 On Mon, Dec 28, 2009 at 10:40:47PM +0100, Manuel Tobias Schiller wrote:
 > Another question springs to mind: If /usr goes across NFS, most traffic
 > arriving at the NFS server should be reading, right?
 
 Yes, as a matter of fact, all the traffic on the Netras is r/o.
 
 However, on the 280R that I have at home, I updated an entire ports
 tree over an NFS-mounted /usr mounted r/w.  That's the machine I'll
 check in a few days.
 
 mcl

From: Marius Strobl <marius@alchemy.franken.de>
To: Manuel Tobias Schiller <mala@hinterbergen.de>
Cc: Mark Linimon <linimon@lonesome.com>, FreeBSD-gnats-submit@FreeBSD.org
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS
Date: Tue, 29 Dec 2009 01:58:29 +0100

 On Mon, Dec 28, 2009 at 10:06:02PM +0100, Manuel Tobias Schiller wrote:
 > On Mon, 28 Dec 2009 14:26:50 -0600
 > Mark Linimon <linimon@lonesome.com> wrote:
 > 
 > > Just to add a data point, the Netra T1s in the package build cluster do
 > > not show this problem, so I'm guessing that you are hitting an edge
 > > on large file transfers.  We run them pretty hard.
 
 I think you guys are talking about different things; AFAIK the
 package build cluster nodes only act as NFS clients but this
 problem is about when using machines with strict alignment
 requirements as an NFS server.
 
 > > 
 > > mcl
 > > 
 > Hi Mark, Marius,
 > 
 > thanks for the data point. In the meantime, I managed to test a kernel
 > with the "official" fix from PR 140797, and I still get the crash when
 > trying to write a moderately sized file (20 Megabytes) to NFS. According
 > to the backtrace below, it crashes in line 1355 of
 > /usr/src/sys/nfsserver/nfs_srvsubs.c.
 > 
 > Since I'm using the fix from the original problem report and RELENG_8_0
 > sources cvsupped on December 20, I guess that means we are either using
 > different source trees (i.e. there is something in the sources for the
 > kernels used on your machines that helps), or there is some other
 > difference which I have not been able to pinpoint yet. Maybe you could
 > clarify before I try hacking away or finding other differences...
 > 
 > Thanks,
 > 
 > Manuel
 > 
 > 
 > ---- backtrack of panic follows ----
 > panic: trap: memory address not aligned
 > cpuid = 0
 > KDB: stack backtrace:
 > panic() at panic+0x1c8
 > trap() at trap+0x4d0
 > -- memory address not aligned sfar=0xfffff800016d08aa sfsr=0x40029
 > %o7=0xc0528934 -- nfsm_srvmtofh_xx() at nfsm_srvmtofh_xx+0x24
 > fha_assign() at fha_assign+0x12c
 > svc_run_internal() at svc_run_internal+0x71c
 > svc_thread_start() at svc_thread_start+0x8
 > fork_exit() at fork_exit+0x80
 > fork_trampoline() at fork_trampoline+0x8
 > Uptime: 2m1s
 > Dumping 512 MB (2 chunks)
 >   chunk at 0: 268435456 bytes |
 > 
 
 I'm using a more or less current HEAD but the NFS code hasn't
 changed that much since 8.0, at least it doesn't contain any
 other alignment fixes I'm aware of.
 I think I got what the problem is but I still haven't managed
 to reproduce it so far. Could you please test whether the
 following patch makes a difference?
 http://people.freebsd.org/~marius/fha_extract_info_realign.diff
 
 Marius
 

From: Manuel Tobias Schiller <mala@hinterbergen.de>
To: Marius Strobl <marius@alchemy.franken.de>
Cc: Mark Linimon <linimon@lonesome.com>, FreeBSD-gnats-submit@FreeBSD.org,
 Manuel Tobias Schiller <mala@hinterbergen.de>
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when
 accessing NFS
Date: Tue, 29 Dec 2009 11:15:16 +0100

 On Tue, 29 Dec 2009 01:58:29 +0100
 Marius Strobl <marius@alchemy.franken.de> wrote:
 
 > I think you guys are talking about different things; AFAIK the
 > package build cluster nodes only act as NFS clients but this
 > problem is about when using machines with strict alignment
 > requirements as an NFS server.
 
 That might explain why Mark does not see panics (assuming the NFS
 client code has had more testing on sparc64 or similar architectures
 which does not seem unlikely (e.g. in the build cluster))...
 
 > I'm using a more or less current HEAD but the NFS code hasn't
 > changed that much since 8.0, at least it doesn't contain any
 > other alignment fixes I'm aware of.
 > I think I got what the problem is but I still haven't managed
 > to reproduce it so far. Could you please test whether the
 > following patch makes a difference?
 > http://people.freebsd.org/~marius/fha_extract_info_realign.diff
 > 
 > Marius
 
 Your patch applies more or less cleanly (a line or two offset here and
 there, but nothing to worry about AFAICT), and make buildkernel is having
 a go at the sources. I'll let you know what comes out.
 
 Thanks,
 
 Manuel
 
 
 -- 
 Homepage: http://www.hinterbergen.de/mala
 OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA)

From: Manuel Tobias Schiller <mala@hinterbergen.de>
To: Marius Strobl <marius@alchemy.franken.de>
Cc: Mark Linimon <linimon@lonesome.com>, FreeBSD-gnats-submit@FreeBSD.org
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when
 accessing NFS
Date: Tue, 29 Dec 2009 16:34:11 +0100

 On Tue, 29 Dec 2009 01:58:29 +0100
 Marius Strobl <marius@alchemy.franken.de> wrote:
 > I'm using a more or less current HEAD but the NFS code hasn't
 > changed that much since 8.0, at least it doesn't contain any
 > other alignment fixes I'm aware of.
 > I think I got what the problem is but I still haven't managed
 > to reproduce it so far. Could you please test whether the
 > following patch makes a difference?
 > http://people.freebsd.org/~marius/fha_extract_info_realign.diff
 > 
 > Marius
 
 I applied the patch, compiled the kernel and rebooted. NFS read access
 seems to work as before. When I try to write to the server, the server
 locks up solid, i.e. I do not even get a kernel panic reported on the
 serial console, so I can't give you a backtrace or anything. I had to
 disconnect the cable from the wall outlet to the power supply to restart
 it. It came up without problems, though.
 
 Manuel
 
 
 -- 
 Homepage: http://www.hinterbergen.de/mala
 OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA)

From: Marius Strobl <marius@alchemy.franken.de>
To: Manuel Tobias Schiller <mala@hinterbergen.de>
Cc: Mark Linimon <linimon@lonesome.com>, FreeBSD-gnats-submit@FreeBSD.org
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS
Date: Tue, 29 Dec 2009 19:53:58 +0100

 On Tue, Dec 29, 2009 at 04:34:11PM +0100, Manuel Tobias Schiller wrote:
 > On Tue, 29 Dec 2009 01:58:29 +0100
 > Marius Strobl <marius@alchemy.franken.de> wrote:
 > > I'm using a more or less current HEAD but the NFS code hasn't
 > > changed that much since 8.0, at least it doesn't contain any
 > > other alignment fixes I'm aware of.
 > > I think I got what the problem is but I still haven't managed
 > > to reproduce it so far. Could you please test whether the
 > > following patch makes a difference?
 > > http://people.freebsd.org/~marius/fha_extract_info_realign.diff
 > > 
 > > Marius
 > 
 > I applied the patch, compiled the kernel and rebooted. NFS read access
 > seems to work as before. When I try to write to the server, the server
 > locks up solid, i.e. I do not even get a kernel panic reported on the
 > serial console, so I can't give you a backtrace or anything. I had to
 > disconnect the cable from the wall outlet to the power supply to restart
 > it. It came up without problems, though.
 > 
 
 Oh, sorry, I had a bug in there, the svc code used a stale
 pointer to the mbuf with this. Could you please re-fetch
 and try again? I can't guarantee that this fixes the hang
 you experienced but at least in theory just re-aligning
 the data can't make the issue with the unaligned access
 any worse. Using the LOM or by breaking into the kernel
 by sending a break you should be able to power-cycle/
 reboot the machine without removing the power cord should
 it hang again though.
 What NFS client and with which mount options are you
 using to trigger is problem?
 
 Marius
 

From: Manuel Tobias Schiller <mala@hinterbergen.de>
To: Marius Strobl <marius@alchemy.franken.de>
Cc: Mark Linimon <linimon@lonesome.com>, FreeBSD-gnats-submit@FreeBSD.org
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when
 accessing NFS
Date: Tue, 29 Dec 2009 20:29:55 +0100

 On Tue, 29 Dec 2009 19:53:58 +0100
 Marius Strobl <marius@alchemy.franken.de> wrote:
 > Oh, sorry, I had a bug in there, the svc code used a stale
 > pointer to the mbuf with this. Could you please re-fetch
 > and try again? I can't guarantee that this fixes the hang
 > you experienced but at least in theory just re-aligning
 > the data can't make the issue with the unaligned access
 > any worse.
 
 No problem, I know what it's like if you try to come up with a fix
 without being able to check things for yourself... I think that I can
 start a rebuild in about two hours or so, so we should have results by
 tomorrow afternoon (tomorrow morning is reserved for my grandparents...).
 At the moment, I'm compiling a "dumb" variant which just uses bcopy for
 all operations in sys/nfs/xdr_subs.h - if that compiles and works (I
 tend to make mistakes when I code, and a make buildkernel takes quite
 some time), we should have a safe version to fall back on, no matter what
 alignment constraints the architecture imposes... It's going to be ugly,
 though, so I'd rather use your solution, if possible. I guess we just
 have to wait and see.
 
 > Using the LOM or by breaking into the kernel
 > by sending a break you should be able to power-cycle/
 > reboot the machine without removing the power cord should
 > it hang again though.
 
 I know about the sending-a-break-over-serial-trick, and of course I tried
 it, but there was no reaction whatsoever.
 
 > What NFS client and with which mount options are you
 > using to trigger is problem?
 > 
 > Marius
 
 Concerning the NFS client machines: They are running Debian lenny, with
 a 2.6.30-bpo.2-amd64 kernel (one client is a powerpc G4, the other one
 is an amd64 machine). Mount options on the Linux clients are
 rw,nosuid,nodev,hard,intr. This combination used to work fine and
 rock-stable with FreeBSD 7.2 (during the month I spent at CERN just
 before Christmas, the machine was up the entire time and never showed
 any sign of instability - admittedly, there was no NFS access during
 that time). I hope we manage to get the FreeBSD 8.0 kernel there as
 well - apart from the NFS issue, what I've seen so far is impressive, as
 usual...
 
 Manuel
 
 -- 
 Homepage: http://www.hinterbergen.de/mala
 OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA)

From: Manuel Tobias Schiller <mala@hinterbergen.de>
To: Marius Strobl <marius@alchemy.franken.de>
Cc: Mark Linimon <linimon@lonesome.com>, FreeBSD-gnats-submit@FreeBSD.org,
 Manuel Tobias Schiller <mala@hinterbergen.de>
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when
 accessing NFS
Date: Wed, 30 Dec 2009 01:25:55 +0100

 On Tue, 29 Dec 2009 19:53:58 +0100
 Marius Strobl <marius@alchemy.franken.de> wrote:
 > Oh, sorry, I had a bug in there, the svc code used a stale
 > pointer to the mbuf with this. Could you please re-fetch
 > and try again? I can't guarantee that this fixes the hang
 > you experienced but at least in theory just re-aligning
 > the data can't make the issue with the unaligned access
 > any worse. Using the LOM or by breaking into the kernel
 > by sending a break you should be able to power-cycle/
 > reboot the machine without removing the power cord should
 > it hang again though.
 > What NFS client and with which mount options are you
 > using to trigger is problem?
 > 
 > Marius
 
 I managed to compile and test a kernel with your updated patch, and things
 seem to work. I'll try to stress-test it over the night by copying a few
 tens of gigabytes over and report back with the results. Also, if I'm
 reading your patch correctly, we should be able to do without the bcopy
 hack from PR 140797, so I'll recompile a kernel without it and try if that
 works as well.
 
 Thanks a lot for all your help and your patience!
 
 Manuel
 
 -- 
 Homepage: http://www.hinterbergen.de/mala
 OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA)

From: Marius Strobl <marius@alchemy.franken.de>
To: Manuel Tobias Schiller <mala@hinterbergen.de>
Cc: Mark Linimon <linimon@lonesome.com>, FreeBSD-gnats-submit@FreeBSD.org
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS
Date: Wed, 30 Dec 2009 02:08:01 +0100

 On Wed, Dec 30, 2009 at 01:25:55AM +0100, Manuel Tobias Schiller wrote:
 > On Tue, 29 Dec 2009 19:53:58 +0100
 > Marius Strobl <marius@alchemy.franken.de> wrote:
 > > Oh, sorry, I had a bug in there, the svc code used a stale
 > > pointer to the mbuf with this. Could you please re-fetch
 > > and try again? I can't guarantee that this fixes the hang
 > > you experienced but at least in theory just re-aligning
 > > the data can't make the issue with the unaligned access
 > > any worse. Using the LOM or by breaking into the kernel
 > > by sending a break you should be able to power-cycle/
 > > reboot the machine without removing the power cord should
 > > it hang again though.
 > > What NFS client and with which mount options are you
 > > using to trigger is problem?
 > > 
 > > Marius
 > 
 > I managed to compile and test a kernel with your updated patch, and things
 > seem to work. I'll try to stress-test it over the night by copying a few
 > tens of gigabytes over and report back with the results. Also, if I'm
 > reading your patch correctly, we should be able to do without the bcopy
 > hack from PR 140797, so I'll recompile a kernel without it and try if that
 > works as well.
 
 I don't think so; nfs_realign() only guarantees 4-byte alignment
 as required by XDR and assumed by nfsm_srvmtofh_xx() further down
 the road but nfsfh_t needs 8-byte alignment.
 
 > 
 > Thanks a lot for all your help and your patience!
 > 
 
 Well, thanks a lot for testing so far!
 
 Marius
 

From: Marius Strobl <marius@alchemy.franken.de>
To: dfr@freebsd.org
Cc: Manuel Tobias Schiller <mala@hinterbergen.de>,
        FreeBSD-gnats-submit@freebsd.org
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS
Date: Wed, 30 Dec 2009 17:36:27 +0100

 On Wed, Dec 30, 2009 at 01:25:55AM +0100, Manuel Tobias Schiller wrote:
 > On Tue, 29 Dec 2009 19:53:58 +0100
 > Marius Strobl <marius@alchemy.franken.de> wrote:
 > > Oh, sorry, I had a bug in there, the svc code used a stale
 > > pointer to the mbuf with this. Could you please re-fetch
 > > and try again? I can't guarantee that this fixes the hang
 > > you experienced but at least in theory just re-aligning
 > > the data can't make the issue with the unaligned access
 > > any worse. Using the LOM or by breaking into the kernel
 > > by sending a break you should be able to power-cycle/
 > > reboot the machine without removing the power cord should
 > > it hang again though.
 > > What NFS client and with which mount options are you
 > > using to trigger is problem?
 > > 
 > > Marius
 > 
 > I managed to compile and test a kernel with your updated patch, and things
 > seem to work. I'll try to stress-test it over the night by copying a few
 > tens of gigabytes over and report back with the results.
 
 Doug,
 
 could you please review the following patch? The problem
 apparently is that nfsm_srvmtofh_xx() assumes the 4-byte
 alignment required by XDR so fha_extract_info() has to
 ensure that the mbuf data is aligned accordingly.
 http://people.freebsd.org/~marius/fha_extract_info_realign.diff
 
 Thanks,
 Marius
 

From: Manuel Tobias Schiller <mala@hinterbergen.de>
To: Marius Strobl <marius@alchemy.franken.de>
Cc: dfr@freebsd.org, FreeBSD-gnats-submit@freebsd.org
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when
 accessing NFS
Date: Wed, 30 Dec 2009 20:32:55 +0100

 On Wed, 30 Dec 2009 17:36:27 +0100
 Marius Strobl <marius@alchemy.franken.de> wrote:
 
 > On Wed, Dec 30, 2009 at 01:25:55AM +0100, Manuel Tobias Schiller wrote:
 > > On Tue, 29 Dec 2009 19:53:58 +0100
 > > Marius Strobl <marius@alchemy.franken.de> wrote:
 > > > Oh, sorry, I had a bug in there, the svc code used a stale
 > > > pointer to the mbuf with this. Could you please re-fetch
 > > > and try again? I can't guarantee that this fixes the hang
 > > > you experienced but at least in theory just re-aligning
 > > > the data can't make the issue with the unaligned access
 > > > any worse. Using the LOM or by breaking into the kernel
 > > > by sending a break you should be able to power-cycle/
 > > > reboot the machine without removing the power cord should
 > > > it hang again though.
 > > > What NFS client and with which mount options are you
 > > > using to trigger is problem?
 > > > 
 > > > Marius
 > > 
 > > I managed to compile and test a kernel with your updated patch, and
 > > things seem to work. I'll try to stress-test it over the night by
 > > copying a few tens of gigabytes over and report back with the results.
 > 
 > Doug,
 > 
 > could you please review the following patch? The problem
 > apparently is that nfsm_srvmtofh_xx() assumes the 4-byte
 > alignment required by XDR so fha_extract_info() has to
 > ensure that the mbuf data is aligned accordingly.
 > http://people.freebsd.org/~marius/fha_extract_info_realign.diff
 > 
 > Thanks,
 > Marius
 
 Marius,
 
 I did a stress-test of the patch by tarring 85 Gigabyte over NFS into a
 pipe, untarring them in a different NFS-mounted directory. I loaded the
 server machine by compiling a kernel in parallel for some time.
 Everything worked perfectly.
 
 Thanks for all your help!
 
 Manuel
 
 -- 
 Homepage: http://www.hinterbergen.de/mala
 OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA)

From: Manuel Tobias Schiller <mala@hinterbergen.de>
To: Marius Strobl <marius@alchemy.franken.de>
Cc: FreeBSD-gnats-submit@FreeBSD.org
Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when
 accessing NFS
Date: Wed, 30 Dec 2009 20:36:31 +0100

 On Wed, 30 Dec 2009 02:08:01 +0100
 Marius Strobl <marius@alchemy.franken.de> wrote:
 
 > On Wed, Dec 30, 2009 at 01:25:55AM +0100, Manuel Tobias Schiller wrote:
 >
 > > I managed to compile and test a kernel with your updated patch, and
 > > things seem to work. I'll try to stress-test it over the night by
 > > copying a few tens of gigabytes over and report back with the
 > > results. Also, if I'm reading your patch correctly, we should be able
 > > to do without the bcopy hack from PR 140797, so I'll recompile a
 > > kernel without it and try if that works as well.
 > 
 > I don't think so; nfs_realign() only guarantees 4-byte alignment
 > as required by XDR and assumed by nfsm_srvmtofh_xx() further down
 > the road but nfsfh_t needs 8-byte alignment.
 
 Of course, you are right. When I tried it, I got my well deserved panic
 again. I should have looked at the code of nfs_realign much more
 carefully...
 
 Anyway, thanks again!
 
 Manuel
 
 -- 
 Homepage: http://www.hinterbergen.de/mala
 OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA)

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: sparc64/142102: commit references a PR
Date: Sat,  9 Jan 2010 15:31:43 +0000 (UTC)

 Author: marius
 Date: Sat Jan  9 15:31:27 2010
 New Revision: 201896
 URL: http://svn.freebsd.org/changeset/base/201896
 
 Log:
   Exclude options COMPAT_FREEBSD4 now that the MD freebsd4_sigreturn()
   is gone since r201396 and which is also in line with the fact that
   FreeBSD 4 didn't supported sparc64.
   
   PR:		142102 (second part)
   MFC after:	1 week
 
 Modified:
   head/sys/nfsserver/nfs.h
   head/sys/nfsserver/nfs_fha.c
   head/sys/nfsserver/nfs_srvkrpc.c
 
 Modified: head/sys/nfsserver/nfs.h
 ==============================================================================
 --- head/sys/nfsserver/nfs.h	Sat Jan  9 14:56:38 2010	(r201895)
 +++ head/sys/nfsserver/nfs.h	Sat Jan  9 15:31:27 2010	(r201896)
 @@ -240,6 +240,7 @@ extern int nfs_debug;
  
  #endif
  
 +void	nfs_realign(struct mbuf **);
  struct mbuf *nfs_rephead(int, struct nfsrv_descript *, int, struct mbuf **,
  	    caddr_t *);
  void	nfsm_srvfattr(struct nfsrv_descript *, struct vattr *,
 
 Modified: head/sys/nfsserver/nfs_fha.c
 ==============================================================================
 --- head/sys/nfsserver/nfs_fha.c	Sat Jan  9 14:56:38 2010	(r201895)
 +++ head/sys/nfsserver/nfs_fha.c	Sat Jan  9 15:31:27 2010	(r201896)
 @@ -158,9 +158,9 @@ SYSUNINIT(nfs_fha, SI_SUB_ROOT_CONF, SI_
  static void
  fha_extract_info(struct svc_req *req, struct fha_info *i)
  {
 -	struct mbuf *md = req->rq_args;
 +	struct mbuf *md;
  	nfsfh_t fh;
 -	caddr_t dpos = mtod(md, caddr_t);
 +	caddr_t dpos;
  	static u_int64_t random_fh = 0;
  	int error;
  	int v3 = (req->rq_vers == 3);
 @@ -201,6 +201,10 @@ fha_extract_info(struct svc_req *req, st
  	    procnum == NFSPROC_NULL)
  		goto out;
  	
 +	nfs_realign(&req->rq_args);
 +	md = req->rq_args;
 +	dpos = mtod(md, caddr_t);
 +
  	/* Grab the filehandle. */
  	error = nfsm_srvmtofh_xx(&fh.fh_generic, v3, &md, &dpos);
  	if (error)
 
 Modified: head/sys/nfsserver/nfs_srvkrpc.c
 ==============================================================================
 --- head/sys/nfsserver/nfs_srvkrpc.c	Sat Jan  9 14:56:38 2010	(r201895)
 +++ head/sys/nfsserver/nfs_srvkrpc.c	Sat Jan  9 15:31:27 2010	(r201896)
 @@ -266,7 +266,7 @@ nfs_rephead(int siz, struct nfsrv_descri
   *	not occur with NFS/UDP and is supposed to only occassionally occur
   *	with TCP.  Use vfs.nfs.realign_count and realign_test to check this.
   */
 -static void
 +void
  nfs_realign(struct mbuf **pm)	/* XXX COMMON */
  {
  	struct mbuf *m;
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: sparc64/142102: commit references a PR
Date: Sat,  9 Jan 2010 15:59:24 +0000 (UTC)

 Author: marius
 Date: Sat Jan  9 15:59:15 2010
 New Revision: 201899
 URL: http://svn.freebsd.org/changeset/base/201899
 
 Log:
   Some style(9) fixes in order to fabricate a commit to denote that
   the commit message for r201896 actually should have read:
   
   As nfsm_srvmtofh_xx() assumes the 4-byte alignment required by XDR
   ensure the mbuf data is aligned accordingly by calling nfs_realign()
   in fha_extract_info(). This fix is orthogonal to the problem solved
   by r199274/r199284.
   
   PR:		142102 (second part)
   MFC after:	1 week
 
 Modified:
   head/sys/nfsserver/nfs.h
   head/sys/nfsserver/nfs_fha.c
   head/sys/nfsserver/nfs_srvkrpc.c
 
 Modified: head/sys/nfsserver/nfs.h
 ==============================================================================
 --- head/sys/nfsserver/nfs.h	Sat Jan  9 15:43:47 2010	(r201898)
 +++ head/sys/nfsserver/nfs.h	Sat Jan  9 15:59:15 2010	(r201899)
 @@ -82,14 +82,13 @@
  #define IO_METASYNC	0
  #endif
  
 -
  /* NFS state flags XXX -Wunused */
  #define	NFSRV_SNDLOCK		0x01000000  /* Send socket lock */
  #define	NFSRV_WANTSND		0x02000000  /* Want above */
  
  /*
 - * Structures for the nfssvc(2) syscall. Not that anyone but nfsd and mount_nfs
 - * should ever try and use it.
 + * Structures for the nfssvc(2) syscall.  Not that anyone but nfsd and
 + * mount_nfs should ever try and use it.
   */
  
  /*
 
 Modified: head/sys/nfsserver/nfs_fha.c
 ==============================================================================
 --- head/sys/nfsserver/nfs_fha.c	Sat Jan  9 15:43:47 2010	(r201898)
 +++ head/sys/nfsserver/nfs_fha.c	Sat Jan  9 15:59:15 2010	(r201899)
 @@ -71,16 +71,17 @@ static struct fha_global {
  	u_long hashmask;
  } g_fha;
  
 -/* 
 - * These are the entries in the filehandle hash. They talk about a specific 
 - * file, requests against which are being handled by one or more nfsds. We keep
 - * a chain of nfsds against the file. We only have more than one if reads are 
 - * ongoing, and then only if the reads affect disparate regions of the file.
 +/*
 + * These are the entries in the filehandle hash.  They talk about a specific
 + * file, requests against which are being handled by one or more nfsds.  We
 + * keep a chain of nfsds against the file. We only have more than one if reads
 + * are ongoing, and then only if the reads affect disparate regions of the
 + * file.
   *
 - * In general, we want to assign a new request to an existing nfsd if it is 
 - * going to contend with work happening already on that nfsd, or if the 
 - * operation is a read and the nfsd is already handling a proximate read. We 
 - * do this to avoid jumping around in the read stream unnecessarily, and to 
 + * In general, we want to assign a new request to an existing nfsd if it is
 + * going to contend with work happening already on that nfsd, or if the
 + * operation is a read and the nfsd is already handling a proximate read.  We
 + * do this to avoid jumping around in the read stream unnecessarily, and to
   * avoid contention between threads over single files.
   */
  struct fha_hash_entry {
 @@ -101,7 +102,7 @@ struct fha_info {
  };
  
  static int fhe_stats_sysctl(SYSCTL_HANDLER_ARGS);
 - 
 +
  static void
  nfs_fha_init(void *foo)
  {
 @@ -136,7 +137,7 @@ nfs_fha_init(void *foo)
  	    &fha_ctls.max_reqs_per_nfsd, 0, "Maximum requests that "
  	    "single nfsd thread should be working on at any time");
  
 -	SYSCTL_ADD_OID(&fha_clist, SYSCTL_STATIC_CHILDREN(_vfs_nfsrv_fha), 
 +	SYSCTL_ADD_OID(&fha_clist, SYSCTL_STATIC_CHILDREN(_vfs_nfsrv_fha),
  	    OID_AUTO, "fhe_stats", CTLTYPE_STRING | CTLFLAG_RD, 0, 0,
  	    fhe_stats_sysctl, "A", "");
  }
 @@ -151,7 +152,7 @@ nfs_fha_uninit(void *foo)
  SYSINIT(nfs_fha, SI_SUB_ROOT_CONF, SI_ORDER_ANY, nfs_fha_init, NULL);
  SYSUNINIT(nfs_fha, SI_SUB_ROOT_CONF, SI_ORDER_ANY, nfs_fha_uninit, NULL);
  
 -/* 
 +/*
   * This just specifies that offsets should obey affinity when within
   * the same 1Mbyte (1<<20) chunk for the file (reads only for now).
   */
 @@ -167,18 +168,18 @@ fha_extract_info(struct svc_req *req, st
  	u_int32_t *tl;
  	rpcproc_t procnum;
  
 -	/* 
 -	 * We start off with a random fh. If we get a reasonable
 -	 * procnum, we set the fh. If there's a concept of offset 
 +	/*
 +	 * We start off with a random fh.  If we get a reasonable
 +	 * procnum, we set the fh.  If there's a concept of offset
  	 * that we're interested in, we set that.
  	 */
  	i->fh = ++random_fh;
  	i->offset = 0;
  	i->locktype = LK_EXCLUSIVE;
 -	
 +
  	/*
  	 * Extract the procnum and convert to v3 form if necessary,
 -	 * taking care to deal with out-of-range procnums. Caller will
 +	 * taking care to deal with out-of-range procnums.  Caller will
  	 * ensure that rq_vers is either 2 or 3.
  	 */
  	procnum = req->rq_proc;
 @@ -188,19 +189,19 @@ fha_extract_info(struct svc_req *req, st
  		procnum = nfsrv_nfsv3_procid[procnum];
  	}
  
 -	/* 
 -	 * We do affinity for most. However, we divide a realm of affinity 
 -	 * by file offset so as to allow for concurrent random access. We 
 -	 * only do this for reads today, but this may change when IFS supports 
 +	/*
 +	 * We do affinity for most.  However, we divide a realm of affinity
 +	 * by file offset so as to allow for concurrent random access.  We
 +	 * only do this for reads today, but this may change when IFS supports
  	 * efficient concurrent writes.
  	 */
  	if (procnum == NFSPROC_FSSTAT ||
  	    procnum == NFSPROC_FSINFO ||
  	    procnum == NFSPROC_PATHCONF ||
 -	    procnum == NFSPROC_NOOP || 
 +	    procnum == NFSPROC_NOOP ||
  	    procnum == NFSPROC_NULL)
  		goto out;
 -	
 +
  	nfs_realign(&req->rq_args);
  	md = req->rq_args;
  	dpos = mtod(md, caddr_t);
 @@ -270,8 +271,8 @@ fha_hash_entry_new(u_int64_t fh)
  	e->num_writes = 0;
  	e->num_threads = 0;
  	LIST_INIT(&e->threads);
 -	
 -	return e;
 +
 +	return (e);
  }
  
  static void
 @@ -296,10 +297,9 @@ fha_hash_entry_lookup(SVCPOOL *pool, u_i
  {
  	struct fha_hash_entry *fhe, *new_fhe;
  
 -	LIST_FOREACH(fhe, &g_fha.hashtable[fh % g_fha.hashmask], link) {
 +	LIST_FOREACH(fhe, &g_fha.hashtable[fh % g_fha.hashmask], link)
  		if (fhe->fh == fh)
  			break;
 -	}
  
  	if (!fhe) {
  		/* Allocate a new entry. */
 @@ -308,25 +308,24 @@ fha_hash_entry_lookup(SVCPOOL *pool, u_i
  		mtx_lock(&pool->sp_lock);
  
  		/* Double-check to make sure we still need the new entry. */
 -		LIST_FOREACH(fhe, &g_fha.hashtable[fh % g_fha.hashmask], link) {
 +		LIST_FOREACH(fhe, &g_fha.hashtable[fh % g_fha.hashmask], link)
  			if (fhe->fh == fh)
  				break;
 -		}
  		if (!fhe) {
  			fhe = new_fhe;
  			LIST_INSERT_HEAD(&g_fha.hashtable[fh % g_fha.hashmask],
  			    fhe, link);
 -		} else {
 +		} else
  			fha_hash_entry_destroy(new_fhe);
 -		}
  	}
  
 -	return fhe;
 +	return (fhe);
  }
  
  static void
  fha_hash_entry_add_thread(struct fha_hash_entry *fhe, SVCTHREAD *thread)
  {
 +
  	LIST_INSERT_HEAD(&fhe->threads, thread, st_alink);
  	fhe->num_threads++;
  }
 @@ -339,7 +338,7 @@ fha_hash_entry_remove_thread(struct fha_
  	fhe->num_threads--;
  }
  
 -/* 
 +/*
   * Account for an ongoing operation associated with this file.
   */
  static void
 @@ -365,7 +364,7 @@ get_idle_thread(SVCPOOL *pool)
  }
  
  
 -/* 
 +/*
   * Get the service thread currently associated with the fhe that is
   * appropriate to handle this operation.
   */
 @@ -387,15 +386,15 @@ fha_hash_entry_choose_thread(SVCPOOL *po
  		/* If there are any writes in progress, use the first thread. */
  		if (fhe->num_writes) {
  #if 0
 -			ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, 
 +			ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO,
  			    "fha: %p(%d)w", thread, req_count);
  #endif
  			return (thread);
  		}
  
 -		/* 
 -		 * Check for read locality, making sure that we won't 
 -		 * exceed our per-thread load limit in the process. 
 +		/*
 +		 * Check for read locality, making sure that we won't
 +		 * exceed our per-thread load limit in the process.
  		 */
  		offset1 = i->offset >> fha_ctls.bin_shift;
  		offset2 = STAILQ_FIRST(&thread->st_reqs)->rq_p3
 @@ -404,21 +403,21 @@ fha_hash_entry_choose_thread(SVCPOOL *po
  			if ((fha_ctls.max_reqs_per_nfsd == 0) ||
  			    (req_count < fha_ctls.max_reqs_per_nfsd)) {
  #if 0
 -				ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, 
 +				ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO,
  				    "fha: %p(%d)r", thread, req_count);
  #endif
  				return (thread);
  			}
  		}
  
 -		/* 
 +		/*
  		 * We don't have a locality match, so skip this thread,
 -		 * but keep track of the most attractive thread in case 
 +		 * but keep track of the most attractive thread in case
  		 * we need to come back to it later.
  		 */
  #if 0
 -		ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, 
 -		    "fha: %p(%d)s off1 %llu off2 %llu", thread, 
 +		ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO,
 +		    "fha: %p(%d)s off1 %llu off2 %llu", thread,
  		    req_count, offset1, offset2);
  #endif
  		if ((min_thread == NULL) || (req_count < min_count)) {
 @@ -427,38 +426,38 @@ fha_hash_entry_choose_thread(SVCPOOL *po
  		}
  	}
  
 -	/* 
 -	 * We didn't find a good match yet. See if we can add 
 +	/*
 +	 * We didn't find a good match yet.  See if we can add
  	 * a new thread to this file handle entry's thread list.
  	 */
 -	if ((fha_ctls.max_nfsds_per_fh == 0) || 
 +	if ((fha_ctls.max_nfsds_per_fh == 0) ||
  	    (fhe->num_threads < fha_ctls.max_nfsds_per_fh)) {
 -		/* 
 -		 * We can add a new thread, so try for an idle thread 
 -		 * first, and fall back to this_thread if none are idle. 
 +		/*
 +		 * We can add a new thread, so try for an idle thread
 +		 * first, and fall back to this_thread if none are idle.
  		 */
  		if (STAILQ_EMPTY(&this_thread->st_reqs)) {
  			thread = this_thread;
  #if 0
 -			ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, 
 +			ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO,
  			    "fha: %p(%d)t", thread, thread->st_reqcount);
  #endif
  		} else if ((thread = get_idle_thread(pool))) {
  #if 0
 -			ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, 
 +			ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO,
  			    "fha: %p(%d)i", thread, thread->st_reqcount);
  #endif
 -		} else { 
 +		} else {
  			thread = this_thread;
  #if 0
 -			ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, 
 +			ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO,
  			    "fha: %p(%d)b", thread, thread->st_reqcount);
  #endif
  		}
  		fha_hash_entry_add_thread(fhe, thread);
  	} else {
 -		/* 
 -		 * We don't want to use any more threads for this file, so 
 +		/*
 +		 * We don't want to use any more threads for this file, so
  		 * go back to the most attractive nfsd we're already using.
  		 */
  		thread = min_thread;
 @@ -467,8 +466,8 @@ fha_hash_entry_choose_thread(SVCPOOL *po
  	return (thread);
  }
  
 -/* 
 - * After getting a request, try to assign it to some thread. Usually we
 +/*
 + * After getting a request, try to assign it to some thread.  Usually we
   * handle it ourselves.
   */
  SVCTHREAD *
 @@ -491,16 +490,16 @@ fha_assign(SVCTHREAD *this_thread, struc
  	pool = req->rq_xprt->xp_pool;
  	fha_extract_info(req, &i);
  
 -	/* 
 -	 * We save the offset associated with this request for later 
 +	/*
 +	 * We save the offset associated with this request for later
  	 * nfsd matching.
  	 */
  	fhe = fha_hash_entry_lookup(pool, i.fh);
  	req->rq_p1 = fhe;
  	req->rq_p2 = i.locktype;
  	req->rq_p3 = i.offset;
 -	
 -	/* 
 +
 +	/*
  	 * Choose a thread, taking into consideration locality, thread load,
  	 * and the number of threads already working on this file.
  	 */
 @@ -511,8 +510,8 @@ fha_assign(SVCTHREAD *this_thread, struc
  	return (thread);
  }
  
 -/* 
 - * Called when we're done with an operation. The request has already
 +/*
 + * Called when we're done with an operation.  The request has already
   * been de-queued.
   */
  void
 
 Modified: head/sys/nfsserver/nfs_srvkrpc.c
 ==============================================================================
 --- head/sys/nfsserver/nfs_srvkrpc.c	Sat Jan  9 15:43:47 2010	(r201898)
 +++ head/sys/nfsserver/nfs_srvkrpc.c	Sat Jan  9 15:59:15 2010	(r201899)
 @@ -187,19 +187,18 @@ nfssvc_nfsserver(struct thread *td, stru
  		}
  		error = nfssvc_addsock(fp, td);
  		fdrop(fp, td);
 -	} else if (uap->flag & NFSSVC_OLDNFSD) {
 +	} else if (uap->flag & NFSSVC_OLDNFSD)
  		error = nfssvc_nfsd(td, NULL);
 -	} else if (uap->flag & NFSSVC_NFSD) {
 -		if (!uap->argp) 
 +	else if (uap->flag & NFSSVC_NFSD) {
 +		if (!uap->argp)
  			return (EINVAL);
  		error = copyin(uap->argp, (caddr_t)&nfsdarg,
  		    sizeof(nfsdarg));
  		if (error)
  			return (error);
  		error = nfssvc_nfsd(td, &nfsdarg);
 -	} else {
 +	} else
  		error = ENXIO;
 -	}
  	return (error);
  }
  
 @@ -447,9 +446,8 @@ nfssvc_addsock(struct file *fp, struct t
  
  	siz = sb_max_adj;
  	error = soreserve(so, siz, siz);
 -	if (error) {
 +	if (error)
  		return (error);
 -	}
  
  	/*
  	 * Steal the socket from userland so that it doesn't close
 @@ -471,7 +469,7 @@ nfssvc_addsock(struct file *fp, struct t
  }
  
  /*
 - * Called by nfssvc() for nfsds. Just loops around servicing rpc requests
 + * Called by nfssvc() for nfsds.  Just loops around servicing rpc requests
   * until it is killed by a signal.
   */
  static int
 @@ -496,9 +494,9 @@ nfssvc_nfsd(struct thread *td, struct nf
  #endif
  
  	/*
 -	 * Only the first nfsd actually does any work. The RPC code
 -	 * adds threads to it as needed. Any extra processes offered
 -	 * by nfsd just exit. If nfsd is new enough, it will call us
 +	 * Only the first nfsd actually does any work.  The RPC code
 +	 * adds threads to it as needed.  Any extra processes offered
 +	 * by nfsd just exit.  If nfsd is new enough, it will call us
  	 * once with a structure that specifies how many threads to
  	 * use.
  	 */
 @@ -522,7 +520,7 @@ nfssvc_nfsd(struct thread *td, struct nf
  			nfsrv_pool->sp_minthreads = 4;
  			nfsrv_pool->sp_maxthreads = 4;
  		}
 -			
 +
  		svc_run(nfsrv_pool);
  
  #ifdef KGSSAPI
 @@ -541,7 +539,7 @@ nfssvc_nfsd(struct thread *td, struct nf
  
  /*
   * Size the NFS server's duplicate request cache at 1/2 the
 - * nmbclusters, floating within a (64, 2048) range. This is to
 + * nmbclusters, floating within a (64, 2048) range.  This is to
   * prevent all mbuf clusters being tied up in the NFS dupreq
   * cache for small values of nmbclusters.
   */
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: sparc64/142102: commit references a PR
Date: Sat, 16 Jan 2010 12:16:53 +0000 (UTC)

 Author: marius
 Date: Sat Jan 16 12:16:38 2010
 New Revision: 202438
 URL: http://svn.freebsd.org/changeset/base/202438
 
 Log:
   MFC: r201896
   
   As nfsm_srvmtofh_xx() assumes the 4-byte alignment required by XDR
   ensure the mbuf data is aligned accordingly by calling nfs_realign()
   in fha_extract_info(). This fix is orthogonal to the problem solved
   by r199274/r199284 (MFC'ed to stable/8 in r199733).
   
   PR:		142102 (second part)
 
 Modified:
   stable/8/sys/nfsserver/nfs.h
   stable/8/sys/nfsserver/nfs_fha.c
   stable/8/sys/nfsserver/nfs_srvkrpc.c
 Directory Properties:
   stable/8/sys/   (props changed)
   stable/8/sys/amd64/include/xen/   (props changed)
   stable/8/sys/cddl/contrib/opensolaris/   (props changed)
   stable/8/sys/contrib/dev/acpica/   (props changed)
   stable/8/sys/contrib/pf/   (props changed)
   stable/8/sys/dev/xen/xenpci/   (props changed)
 
 Modified: stable/8/sys/nfsserver/nfs.h
 ==============================================================================
 --- stable/8/sys/nfsserver/nfs.h	Sat Jan 16 09:52:49 2010	(r202437)
 +++ stable/8/sys/nfsserver/nfs.h	Sat Jan 16 12:16:38 2010	(r202438)
 @@ -240,6 +240,7 @@ extern int nfs_debug;
  
  #endif
  
 +void	nfs_realign(struct mbuf **);
  struct mbuf *nfs_rephead(int, struct nfsrv_descript *, int, struct mbuf **,
  	    caddr_t *);
  void	nfsm_srvfattr(struct nfsrv_descript *, struct vattr *,
 
 Modified: stable/8/sys/nfsserver/nfs_fha.c
 ==============================================================================
 --- stable/8/sys/nfsserver/nfs_fha.c	Sat Jan 16 09:52:49 2010	(r202437)
 +++ stable/8/sys/nfsserver/nfs_fha.c	Sat Jan 16 12:16:38 2010	(r202438)
 @@ -158,9 +158,9 @@ SYSUNINIT(nfs_fha, SI_SUB_ROOT_CONF, SI_
  static void
  fha_extract_info(struct svc_req *req, struct fha_info *i)
  {
 -	struct mbuf *md = req->rq_args;
 +	struct mbuf *md;
  	nfsfh_t fh;
 -	caddr_t dpos = mtod(md, caddr_t);
 +	caddr_t dpos;
  	static u_int64_t random_fh = 0;
  	int error;
  	int v3 = (req->rq_vers == 3);
 @@ -201,6 +201,10 @@ fha_extract_info(struct svc_req *req, st
  	    procnum == NFSPROC_NULL)
  		goto out;
  	
 +	nfs_realign(&req->rq_args);
 +	md = req->rq_args;
 +	dpos = mtod(md, caddr_t);
 +
  	/* Grab the filehandle. */
  	error = nfsm_srvmtofh_xx(&fh.fh_generic, v3, &md, &dpos);
  	if (error)
 
 Modified: stable/8/sys/nfsserver/nfs_srvkrpc.c
 ==============================================================================
 --- stable/8/sys/nfsserver/nfs_srvkrpc.c	Sat Jan 16 09:52:49 2010	(r202437)
 +++ stable/8/sys/nfsserver/nfs_srvkrpc.c	Sat Jan 16 12:16:38 2010	(r202438)
 @@ -266,7 +266,7 @@ nfs_rephead(int siz, struct nfsrv_descri
   *	not occur with NFS/UDP and is supposed to only occassionally occur
   *	with TCP.  Use vfs.nfs.realign_count and realign_test to check this.
   */
 -static void
 +void
  nfs_realign(struct mbuf **pm)	/* XXX COMMON */
  {
  	struct mbuf *m;
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
 
State-Changed-From-To: open->closed 
State-Changed-By: marius 
State-Changed-When: Mon Dec 12 19:00:33 UTC 2011 
State-Changed-Why:  
Close as this PR has been fully dealt with. It was just kept open due to a 
possible release errata, which then turned out to be infeasible. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=142102 
>Unformatted: