From mala@hinterbergen.de Mon Dec 28 14:25:59 2009 Return-Path: Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0F52B10656C5 for ; Mon, 28 Dec 2009 14:25:59 +0000 (UTC) (envelope-from mala@hinterbergen.de) Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.17.9]) by mx1.freebsd.org (Postfix) with ESMTP id 9F20D8FC19 for ; Mon, 28 Dec 2009 14:25:58 +0000 (UTC) Received: from router.hinter.bergen2 (pD95489F0.dip0.t-ipconnect.de [217.84.137.240]) by mrelayeu.kundenserver.de (node=mrbap1) with ESMTP (Nemesis) id 0MeMN5-1NcEDY1kqN-00PaWv; Mon, 28 Dec 2009 15:13:19 +0100 Received: by router.hinter.bergen2 (Postfix, from userid 1000) id 8D83B5C04; Mon, 28 Dec 2009 15:13:15 +0100 (CET) Message-Id: <20091228141315.8D83B5C04@router.hinter.bergen2> Date: Mon, 28 Dec 2009 15:13:15 +0100 (CET) From: Manuel Schiller Reply-To: Manuel Schiller To: FreeBSD-gnats-submit@freebsd.org Cc: Manuel Schiller Subject: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS X-Send-Pr-Version: 3.113 X-GNATS-Notify: >Number: 142102 >Category: sparc64 >Synopsis: [nfs] [panic] FreeBSD 8.0 kernel panics on sparc64 when accessing NFS >Confidential: no >Severity: serious >Priority: medium >Responsible: freebsd-sparc64 >State: closed >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Mon Dec 28 14:30:01 UTC 2009 >Closed-Date: Mon Dec 12 19:02:27 UTC 2011 >Last-Modified: Mon Dec 12 19:02:27 UTC 2011 >Originator: Manuel Schiller >Release: FreeBSD 8.0-RELEASE-p1 sparc64 >Organization: (none) >Environment: System: FreeBSD router.hinter.bergen2 8.0-RELEASE-p1 FreeBSD 8.0-RELEASE-p1 #4: Mon Dec 28 11:53:32 CET 2009 root@router.hinter.bergen2:/usr/obj/usr/src/sys/FENCHURCH sparc64 The machine is a SUN Netra T1 running FreeBSD 8.0. The kernel configuration is almost what ships as GENERIC, only a few unneeded NIC drivers have been removed, and KDB and DDB have been enabled. >Description: The kernel panics due to an unaligned memory access in the NFS server code. I could trace the problem to the statement on line 209 of sys/nfsserver/nfs_fha.c which does not satify the alignment requirements which sparc64 imposes. The backtrace I get from my kernel is below (sorry, kgdb does not work properly on sparc64, so I have to use the in-kernel backtrace...): panic: trap: memory address not aligned cpuid = 0 KDB: stack backtrace: panic() at panic+0x1c8 trap() at trap+0x4d0 -- memory address not aligned sfar=0xd6663424 sfsr=0x40029 %o7=0xc051b86c -- fha_assign() at fha_assign+0x140 svc_run_internal() at svc_run_internal+0x71c svc_thread_start() at svc_thread_start+0x8 fork_exit() at fork_exit+0x80 fork_trampoline() at fork_trampoline+0x8 Uptime: 1m12s Dumping 512 MB (2 chunks) chunk at 0: 268435456 bytes |\ If you need anything or want something tested, please let me know. >How-To-Repeat: Enable the NFS server options in /etc/rc.d on a sparc64 machine and try to mount any NFS share from a different machine. >Fix: Replace the assignment on line 209 of sys/nfsserver/nfs_fha.c with a call to memcpy, i.e. something like: memcpy(&i->fh, &fh.fh_generic.fh_fid.fid_data, sizeof(const u_int64_t *)); This is ugly, but will work for all architectures, regardless of alignment constraints imposed. This is likely to be a little slower than the original code, but I doubt the difference will be noticeable (we're talking about different ways to copy eight bytes, so the code would have to be called in a very tight loop for things to become noticeably slower). The issue with the original assignment is that the structs in question do not satisfy the alignment contstraints on sparc64 for reading a pointer from fid_data. Instead of playing with the struct to get the alignment right, it seemed safer to me to use a memcpy here, since in future modifications of the struct, the alignment constraints are most likely not at the front of the mind of the person doing the modification. The change has been tested on sparc64 for half a day, and things seem to be running smoothly. >Release-Note: >Audit-Trail: From: Manuel Tobias Schiller To: bug-followup@FreeBSD.org, mala@hinterbergen.de Cc: Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Mon, 28 Dec 2009 16:40:04 +0100 Dear all, the fix I gave in the original problem report appears to be incomplete; the kernel still crashes sometimes during NFS access due to improper alignment, this time in sys/nfsserver/nfs_srvsubs.c, line 1355. I will try to continue debugging the issue and report back with any fixes I manage to obtain. Best regards, Manuel -- Homepage: http://www.hinterbergen.de/mala OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA) From: Marius Strobl To: Manuel Tobias Schiller Cc: freebsd-sparc64@FreeBSD.org Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Mon, 28 Dec 2009 17:28:17 +0100 On Mon, Dec 28, 2009 at 04:00:09PM +0000, Manuel Tobias Schiller wrote: > The following reply was made to PR sparc64/142102; it has been noted by GNATS. > > From: Manuel Tobias Schiller > To: bug-followup@FreeBSD.org, mala@hinterbergen.de > Cc: > Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when > accessing NFS > Date: Mon, 28 Dec 2009 16:40:04 +0100 > > Dear all, > > the fix I gave in the original problem report appears to be incomplete; > the kernel still crashes sometimes during NFS access due to improper > alignment, this time in sys/nfsserver/nfs_srvsubs.c, line 1355. I will > try to continue debugging the issue and report back with any fixes I > manage to obtain. > Note that the original PR is a dupe of PR 140797. Could the second panic you're seeing be a result of using an incorrect size parameter in your change? Marius From: Marius Strobl To: Manuel Tobias Schiller Cc: bug-followup@FreeBSD.org Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Mon, 28 Dec 2009 19:26:47 +0100 On Mon, Dec 28, 2009 at 06:07:36PM +0100, Manuel Tobias Schiller wrote: > On Mon, 28 Dec 2009 17:28:18 +0100 > Marius Strobl wrote: > > > On Mon, Dec 28, 2009 at 04:00:09PM +0000, Manuel Tobias Schiller wrote: > > > The following reply was made to PR sparc64/142102; it has been noted > > > by GNATS. > > > > > > From: Manuel Tobias Schiller > > > To: bug-followup@FreeBSD.org, mala@hinterbergen.de > > > Cc: > > > Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when > > > accessing NFS > > > Date: Mon, 28 Dec 2009 16:40:04 +0100 > > > > > > Dear all, > > > > > > the fix I gave in the original problem report appears to be > > > incomplete; the kernel still crashes sometimes during NFS access due > > > to improper alignment, this time in sys/nfsserver/nfs_srvsubs.c, line > > > 1355. I will try to continue debugging the issue and report back with > > > any fixes I manage to obtain. > > > > > > > Note that the original PR is a dupe of PR 140797. Could the second > > panic you're seeing be a result of using an incorrect size parameter > > in your change? > > > > Marius > > > > > Dear Marius, > > sorry, I searched, but I failed to spot PR 140797. I don't think that the > panic I'm seeing after my change has to do with the size parameter (if > it's incorrect though, please let me know). With the change I can read > files and do directory lookups just fine (tested with a 45 Gigabyte > file). I get the crash the minute I'm trying to write a file to the NFS > share. > Yes, the size parameter in your patch is wrong. I also can't reproduce your remaining problem with the stock sources, I haven't checked whether it could be releated with using an incorrect size parameter though. Marius From: Manuel Tobias Schiller To: Marius Strobl Cc: bug-followup@FreeBSD.org, Manuel Tobias Schiller Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Mon, 28 Dec 2009 20:15:24 +0100 On Mon, 28 Dec 2009 19:26:47 +0100 Marius Strobl wrote: > > Dear Marius, > > > > sorry, I searched, but I failed to spot PR 140797. I don't think that > > the panic I'm seeing after my change has to do with the size > > parameter (if it's incorrect though, please let me know). With the > > change I can read files and do directory lookups just fine (tested > > with a 45 Gigabyte file). I get the crash the minute I'm trying to > > write a file to the NFS share. > > > > Yes, the size parameter in your patch is wrong. I also can't reproduce > your remaining problem with the stock sources, I haven't checked whether > it could be releated with using an incorrect size parameter though. > > Marius Dear Marius, looking a bit more closely, I also agree that the size parameter is incorrect. However, it should not make a difference on sparc64, since sizeof(u_int64_t *) = 8 = sizeof(u_int64_t). I started a new kernel build with the official fix in PR 140797 right after I wrote my last e-mail, and with any luck, it should finish in an hour or so. I'll let you know if I can reproduce the panic with the official fix. Thanks, Manuel -- Homepage: http://www.hinterbergen.de/mala OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA) From: Mark Linimon To: Manuel Schiller Cc: FreeBSD-gnats-submit@FreeBSD.org Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Mon, 28 Dec 2009 14:26:50 -0600 Just to add a data point, the Netra T1s in the package build cluster do not show this problem, so I'm guessing that you are hitting an edge on large file transfers. We run them pretty hard. mcl From: Manuel Tobias Schiller To: Mark Linimon Cc: FreeBSD-gnats-submit@FreeBSD.org, Marius Strobl , Manuel Tobias Schiller Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Mon, 28 Dec 2009 22:06:02 +0100 On Mon, 28 Dec 2009 14:26:50 -0600 Mark Linimon wrote: > Just to add a data point, the Netra T1s in the package build cluster do > not show this problem, so I'm guessing that you are hitting an edge > on large file transfers. We run them pretty hard. > > mcl > Hi Mark, Marius, thanks for the data point. In the meantime, I managed to test a kernel with the "official" fix from PR 140797, and I still get the crash when trying to write a moderately sized file (20 Megabytes) to NFS. According to the backtrace below, it crashes in line 1355 of /usr/src/sys/nfsserver/nfs_srvsubs.c. Since I'm using the fix from the original problem report and RELENG_8_0 sources cvsupped on December 20, I guess that means we are either using different source trees (i.e. there is something in the sources for the kernels used on your machines that helps), or there is some other difference which I have not been able to pinpoint yet. Maybe you could clarify before I try hacking away or finding other differences... Thanks, Manuel ---- backtrack of panic follows ---- panic: trap: memory address not aligned cpuid = 0 KDB: stack backtrace: panic() at panic+0x1c8 trap() at trap+0x4d0 -- memory address not aligned sfar=0xfffff800016d08aa sfsr=0x40029 %o7=0xc0528934 -- nfsm_srvmtofh_xx() at nfsm_srvmtofh_xx+0x24 fha_assign() at fha_assign+0x12c svc_run_internal() at svc_run_internal+0x71c svc_thread_start() at svc_thread_start+0x8 fork_exit() at fork_exit+0x80 fork_trampoline() at fork_trampoline+0x8 Uptime: 2m1s Dumping 512 MB (2 chunks) chunk at 0: 268435456 bytes | -- Homepage: http://www.hinterbergen.de/mala OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA) From: Mark Linimon To: Manuel Tobias Schiller Cc: FreeBSD-gnats-submit@FreeBSD.org, Marius Strobl Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Mon, 28 Dec 2009 15:18:09 -0600 On Mon, Dec 28, 2009 at 10:06:02PM +0100, Manuel Tobias Schiller wrote: > Since I'm using the fix from the original problem report and RELENG_8_0 > sources cvsupped on December 20, I guess that means we are either using > different source trees (i.e. there is something in the sources for the > kernels used on your machines that helps), or there is some other > difference which I have not been able to pinpoint yet. It turns out the kernels on the T1s is much older than I thought. I know I have been using something pretty close to 8.0R on one of my machines at home but I am traveling right now and accessing it would be kind of a pain. All our other sparcs don't use NFS to the same extent the above do (they are pxebooted and e.g. /usr will go across NFS.) mcl From: Manuel Tobias Schiller To: Mark Linimon Cc: FreeBSD-gnats-submit@FreeBSD.org, Marius Strobl , Manuel Tobias Schiller Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Mon, 28 Dec 2009 22:40:47 +0100 On Mon, 28 Dec 2009 15:18:09 -0600 Mark Linimon wrote: > It turns out the kernels on the T1s is much older than I thought. > I know I have been using something pretty close to 8.0R on one of > my machines at home but I am traveling right now and accessing it > would be kind of a pain. > > All our other sparcs don't use NFS to the same extent the above do > (they are pxebooted and e.g. /usr will go across NFS.) > > mcl > Hi Mark, thanks for the clarification. With the situation being what you describe, I think I'll try upgrading my sources to the latest RELENG_8_0 ones to be on the safe side, check if the problem persists, and if so, I'll try to find a solution (unless I hear from someone to try something different). Another question springs to mind: If /usr goes across NFS, most traffic arriving at the NFS server should be reading, right? Because that appears to work without a glitch with the fix (at least on my machine). Right now, the panic is pushed back to the point when I try to write (i.e. copy a new file to the NFS server). Thanks for the quick reply, Manuel -- Homepage: http://www.hinterbergen.de/mala OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA) From: Mark Linimon To: Manuel Tobias Schiller Cc: FreeBSD-gnats-submit@FreeBSD.org, Marius Strobl Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Mon, 28 Dec 2009 15:44:21 -0600 On Mon, Dec 28, 2009 at 10:40:47PM +0100, Manuel Tobias Schiller wrote: > Another question springs to mind: If /usr goes across NFS, most traffic > arriving at the NFS server should be reading, right? Yes, as a matter of fact, all the traffic on the Netras is r/o. However, on the 280R that I have at home, I updated an entire ports tree over an NFS-mounted /usr mounted r/w. That's the machine I'll check in a few days. mcl From: Marius Strobl To: Manuel Tobias Schiller Cc: Mark Linimon , FreeBSD-gnats-submit@FreeBSD.org Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Tue, 29 Dec 2009 01:58:29 +0100 On Mon, Dec 28, 2009 at 10:06:02PM +0100, Manuel Tobias Schiller wrote: > On Mon, 28 Dec 2009 14:26:50 -0600 > Mark Linimon wrote: > > > Just to add a data point, the Netra T1s in the package build cluster do > > not show this problem, so I'm guessing that you are hitting an edge > > on large file transfers. We run them pretty hard. I think you guys are talking about different things; AFAIK the package build cluster nodes only act as NFS clients but this problem is about when using machines with strict alignment requirements as an NFS server. > > > > mcl > > > Hi Mark, Marius, > > thanks for the data point. In the meantime, I managed to test a kernel > with the "official" fix from PR 140797, and I still get the crash when > trying to write a moderately sized file (20 Megabytes) to NFS. According > to the backtrace below, it crashes in line 1355 of > /usr/src/sys/nfsserver/nfs_srvsubs.c. > > Since I'm using the fix from the original problem report and RELENG_8_0 > sources cvsupped on December 20, I guess that means we are either using > different source trees (i.e. there is something in the sources for the > kernels used on your machines that helps), or there is some other > difference which I have not been able to pinpoint yet. Maybe you could > clarify before I try hacking away or finding other differences... > > Thanks, > > Manuel > > > ---- backtrack of panic follows ---- > panic: trap: memory address not aligned > cpuid = 0 > KDB: stack backtrace: > panic() at panic+0x1c8 > trap() at trap+0x4d0 > -- memory address not aligned sfar=0xfffff800016d08aa sfsr=0x40029 > %o7=0xc0528934 -- nfsm_srvmtofh_xx() at nfsm_srvmtofh_xx+0x24 > fha_assign() at fha_assign+0x12c > svc_run_internal() at svc_run_internal+0x71c > svc_thread_start() at svc_thread_start+0x8 > fork_exit() at fork_exit+0x80 > fork_trampoline() at fork_trampoline+0x8 > Uptime: 2m1s > Dumping 512 MB (2 chunks) > chunk at 0: 268435456 bytes | > I'm using a more or less current HEAD but the NFS code hasn't changed that much since 8.0, at least it doesn't contain any other alignment fixes I'm aware of. I think I got what the problem is but I still haven't managed to reproduce it so far. Could you please test whether the following patch makes a difference? http://people.freebsd.org/~marius/fha_extract_info_realign.diff Marius From: Manuel Tobias Schiller To: Marius Strobl Cc: Mark Linimon , FreeBSD-gnats-submit@FreeBSD.org, Manuel Tobias Schiller Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Tue, 29 Dec 2009 11:15:16 +0100 On Tue, 29 Dec 2009 01:58:29 +0100 Marius Strobl wrote: > I think you guys are talking about different things; AFAIK the > package build cluster nodes only act as NFS clients but this > problem is about when using machines with strict alignment > requirements as an NFS server. That might explain why Mark does not see panics (assuming the NFS client code has had more testing on sparc64 or similar architectures which does not seem unlikely (e.g. in the build cluster))... > I'm using a more or less current HEAD but the NFS code hasn't > changed that much since 8.0, at least it doesn't contain any > other alignment fixes I'm aware of. > I think I got what the problem is but I still haven't managed > to reproduce it so far. Could you please test whether the > following patch makes a difference? > http://people.freebsd.org/~marius/fha_extract_info_realign.diff > > Marius Your patch applies more or less cleanly (a line or two offset here and there, but nothing to worry about AFAICT), and make buildkernel is having a go at the sources. I'll let you know what comes out. Thanks, Manuel -- Homepage: http://www.hinterbergen.de/mala OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA) From: Manuel Tobias Schiller To: Marius Strobl Cc: Mark Linimon , FreeBSD-gnats-submit@FreeBSD.org Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Tue, 29 Dec 2009 16:34:11 +0100 On Tue, 29 Dec 2009 01:58:29 +0100 Marius Strobl wrote: > I'm using a more or less current HEAD but the NFS code hasn't > changed that much since 8.0, at least it doesn't contain any > other alignment fixes I'm aware of. > I think I got what the problem is but I still haven't managed > to reproduce it so far. Could you please test whether the > following patch makes a difference? > http://people.freebsd.org/~marius/fha_extract_info_realign.diff > > Marius I applied the patch, compiled the kernel and rebooted. NFS read access seems to work as before. When I try to write to the server, the server locks up solid, i.e. I do not even get a kernel panic reported on the serial console, so I can't give you a backtrace or anything. I had to disconnect the cable from the wall outlet to the power supply to restart it. It came up without problems, though. Manuel -- Homepage: http://www.hinterbergen.de/mala OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA) From: Marius Strobl To: Manuel Tobias Schiller Cc: Mark Linimon , FreeBSD-gnats-submit@FreeBSD.org Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Tue, 29 Dec 2009 19:53:58 +0100 On Tue, Dec 29, 2009 at 04:34:11PM +0100, Manuel Tobias Schiller wrote: > On Tue, 29 Dec 2009 01:58:29 +0100 > Marius Strobl wrote: > > I'm using a more or less current HEAD but the NFS code hasn't > > changed that much since 8.0, at least it doesn't contain any > > other alignment fixes I'm aware of. > > I think I got what the problem is but I still haven't managed > > to reproduce it so far. Could you please test whether the > > following patch makes a difference? > > http://people.freebsd.org/~marius/fha_extract_info_realign.diff > > > > Marius > > I applied the patch, compiled the kernel and rebooted. NFS read access > seems to work as before. When I try to write to the server, the server > locks up solid, i.e. I do not even get a kernel panic reported on the > serial console, so I can't give you a backtrace or anything. I had to > disconnect the cable from the wall outlet to the power supply to restart > it. It came up without problems, though. > Oh, sorry, I had a bug in there, the svc code used a stale pointer to the mbuf with this. Could you please re-fetch and try again? I can't guarantee that this fixes the hang you experienced but at least in theory just re-aligning the data can't make the issue with the unaligned access any worse. Using the LOM or by breaking into the kernel by sending a break you should be able to power-cycle/ reboot the machine without removing the power cord should it hang again though. What NFS client and with which mount options are you using to trigger is problem? Marius From: Manuel Tobias Schiller To: Marius Strobl Cc: Mark Linimon , FreeBSD-gnats-submit@FreeBSD.org Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Tue, 29 Dec 2009 20:29:55 +0100 On Tue, 29 Dec 2009 19:53:58 +0100 Marius Strobl wrote: > Oh, sorry, I had a bug in there, the svc code used a stale > pointer to the mbuf with this. Could you please re-fetch > and try again? I can't guarantee that this fixes the hang > you experienced but at least in theory just re-aligning > the data can't make the issue with the unaligned access > any worse. No problem, I know what it's like if you try to come up with a fix without being able to check things for yourself... I think that I can start a rebuild in about two hours or so, so we should have results by tomorrow afternoon (tomorrow morning is reserved for my grandparents...). At the moment, I'm compiling a "dumb" variant which just uses bcopy for all operations in sys/nfs/xdr_subs.h - if that compiles and works (I tend to make mistakes when I code, and a make buildkernel takes quite some time), we should have a safe version to fall back on, no matter what alignment constraints the architecture imposes... It's going to be ugly, though, so I'd rather use your solution, if possible. I guess we just have to wait and see. > Using the LOM or by breaking into the kernel > by sending a break you should be able to power-cycle/ > reboot the machine without removing the power cord should > it hang again though. I know about the sending-a-break-over-serial-trick, and of course I tried it, but there was no reaction whatsoever. > What NFS client and with which mount options are you > using to trigger is problem? > > Marius Concerning the NFS client machines: They are running Debian lenny, with a 2.6.30-bpo.2-amd64 kernel (one client is a powerpc G4, the other one is an amd64 machine). Mount options on the Linux clients are rw,nosuid,nodev,hard,intr. This combination used to work fine and rock-stable with FreeBSD 7.2 (during the month I spent at CERN just before Christmas, the machine was up the entire time and never showed any sign of instability - admittedly, there was no NFS access during that time). I hope we manage to get the FreeBSD 8.0 kernel there as well - apart from the NFS issue, what I've seen so far is impressive, as usual... Manuel -- Homepage: http://www.hinterbergen.de/mala OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA) From: Manuel Tobias Schiller To: Marius Strobl Cc: Mark Linimon , FreeBSD-gnats-submit@FreeBSD.org, Manuel Tobias Schiller Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Wed, 30 Dec 2009 01:25:55 +0100 On Tue, 29 Dec 2009 19:53:58 +0100 Marius Strobl wrote: > Oh, sorry, I had a bug in there, the svc code used a stale > pointer to the mbuf with this. Could you please re-fetch > and try again? I can't guarantee that this fixes the hang > you experienced but at least in theory just re-aligning > the data can't make the issue with the unaligned access > any worse. Using the LOM or by breaking into the kernel > by sending a break you should be able to power-cycle/ > reboot the machine without removing the power cord should > it hang again though. > What NFS client and with which mount options are you > using to trigger is problem? > > Marius I managed to compile and test a kernel with your updated patch, and things seem to work. I'll try to stress-test it over the night by copying a few tens of gigabytes over and report back with the results. Also, if I'm reading your patch correctly, we should be able to do without the bcopy hack from PR 140797, so I'll recompile a kernel without it and try if that works as well. Thanks a lot for all your help and your patience! Manuel -- Homepage: http://www.hinterbergen.de/mala OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA) From: Marius Strobl To: Manuel Tobias Schiller Cc: Mark Linimon , FreeBSD-gnats-submit@FreeBSD.org Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Wed, 30 Dec 2009 02:08:01 +0100 On Wed, Dec 30, 2009 at 01:25:55AM +0100, Manuel Tobias Schiller wrote: > On Tue, 29 Dec 2009 19:53:58 +0100 > Marius Strobl wrote: > > Oh, sorry, I had a bug in there, the svc code used a stale > > pointer to the mbuf with this. Could you please re-fetch > > and try again? I can't guarantee that this fixes the hang > > you experienced but at least in theory just re-aligning > > the data can't make the issue with the unaligned access > > any worse. Using the LOM or by breaking into the kernel > > by sending a break you should be able to power-cycle/ > > reboot the machine without removing the power cord should > > it hang again though. > > What NFS client and with which mount options are you > > using to trigger is problem? > > > > Marius > > I managed to compile and test a kernel with your updated patch, and things > seem to work. I'll try to stress-test it over the night by copying a few > tens of gigabytes over and report back with the results. Also, if I'm > reading your patch correctly, we should be able to do without the bcopy > hack from PR 140797, so I'll recompile a kernel without it and try if that > works as well. I don't think so; nfs_realign() only guarantees 4-byte alignment as required by XDR and assumed by nfsm_srvmtofh_xx() further down the road but nfsfh_t needs 8-byte alignment. > > Thanks a lot for all your help and your patience! > Well, thanks a lot for testing so far! Marius From: Marius Strobl To: dfr@freebsd.org Cc: Manuel Tobias Schiller , FreeBSD-gnats-submit@freebsd.org Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Wed, 30 Dec 2009 17:36:27 +0100 On Wed, Dec 30, 2009 at 01:25:55AM +0100, Manuel Tobias Schiller wrote: > On Tue, 29 Dec 2009 19:53:58 +0100 > Marius Strobl wrote: > > Oh, sorry, I had a bug in there, the svc code used a stale > > pointer to the mbuf with this. Could you please re-fetch > > and try again? I can't guarantee that this fixes the hang > > you experienced but at least in theory just re-aligning > > the data can't make the issue with the unaligned access > > any worse. Using the LOM or by breaking into the kernel > > by sending a break you should be able to power-cycle/ > > reboot the machine without removing the power cord should > > it hang again though. > > What NFS client and with which mount options are you > > using to trigger is problem? > > > > Marius > > I managed to compile and test a kernel with your updated patch, and things > seem to work. I'll try to stress-test it over the night by copying a few > tens of gigabytes over and report back with the results. Doug, could you please review the following patch? The problem apparently is that nfsm_srvmtofh_xx() assumes the 4-byte alignment required by XDR so fha_extract_info() has to ensure that the mbuf data is aligned accordingly. http://people.freebsd.org/~marius/fha_extract_info_realign.diff Thanks, Marius From: Manuel Tobias Schiller To: Marius Strobl Cc: dfr@freebsd.org, FreeBSD-gnats-submit@freebsd.org Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Wed, 30 Dec 2009 20:32:55 +0100 On Wed, 30 Dec 2009 17:36:27 +0100 Marius Strobl wrote: > On Wed, Dec 30, 2009 at 01:25:55AM +0100, Manuel Tobias Schiller wrote: > > On Tue, 29 Dec 2009 19:53:58 +0100 > > Marius Strobl wrote: > > > Oh, sorry, I had a bug in there, the svc code used a stale > > > pointer to the mbuf with this. Could you please re-fetch > > > and try again? I can't guarantee that this fixes the hang > > > you experienced but at least in theory just re-aligning > > > the data can't make the issue with the unaligned access > > > any worse. Using the LOM or by breaking into the kernel > > > by sending a break you should be able to power-cycle/ > > > reboot the machine without removing the power cord should > > > it hang again though. > > > What NFS client and with which mount options are you > > > using to trigger is problem? > > > > > > Marius > > > > I managed to compile and test a kernel with your updated patch, and > > things seem to work. I'll try to stress-test it over the night by > > copying a few tens of gigabytes over and report back with the results. > > Doug, > > could you please review the following patch? The problem > apparently is that nfsm_srvmtofh_xx() assumes the 4-byte > alignment required by XDR so fha_extract_info() has to > ensure that the mbuf data is aligned accordingly. > http://people.freebsd.org/~marius/fha_extract_info_realign.diff > > Thanks, > Marius Marius, I did a stress-test of the patch by tarring 85 Gigabyte over NFS into a pipe, untarring them in a different NFS-mounted directory. I loaded the server machine by compiling a kernel in parallel for some time. Everything worked perfectly. Thanks for all your help! Manuel -- Homepage: http://www.hinterbergen.de/mala OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA) From: Manuel Tobias Schiller To: Marius Strobl Cc: FreeBSD-gnats-submit@FreeBSD.org Subject: Re: sparc64/142102: FreeBSD 8.0 kernel panics on sparc64 when accessing NFS Date: Wed, 30 Dec 2009 20:36:31 +0100 On Wed, 30 Dec 2009 02:08:01 +0100 Marius Strobl wrote: > On Wed, Dec 30, 2009 at 01:25:55AM +0100, Manuel Tobias Schiller wrote: > > > I managed to compile and test a kernel with your updated patch, and > > things seem to work. I'll try to stress-test it over the night by > > copying a few tens of gigabytes over and report back with the > > results. Also, if I'm reading your patch correctly, we should be able > > to do without the bcopy hack from PR 140797, so I'll recompile a > > kernel without it and try if that works as well. > > I don't think so; nfs_realign() only guarantees 4-byte alignment > as required by XDR and assumed by nfsm_srvmtofh_xx() further down > the road but nfsfh_t needs 8-byte alignment. Of course, you are right. When I tried it, I got my well deserved panic again. I should have looked at the code of nfs_realign much more carefully... Anyway, thanks again! Manuel -- Homepage: http://www.hinterbergen.de/mala OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA) From: dfilter@FreeBSD.ORG (dfilter service) To: bug-followup@FreeBSD.org Cc: Subject: Re: sparc64/142102: commit references a PR Date: Sat, 9 Jan 2010 15:31:43 +0000 (UTC) Author: marius Date: Sat Jan 9 15:31:27 2010 New Revision: 201896 URL: http://svn.freebsd.org/changeset/base/201896 Log: Exclude options COMPAT_FREEBSD4 now that the MD freebsd4_sigreturn() is gone since r201396 and which is also in line with the fact that FreeBSD 4 didn't supported sparc64. PR: 142102 (second part) MFC after: 1 week Modified: head/sys/nfsserver/nfs.h head/sys/nfsserver/nfs_fha.c head/sys/nfsserver/nfs_srvkrpc.c Modified: head/sys/nfsserver/nfs.h ============================================================================== --- head/sys/nfsserver/nfs.h Sat Jan 9 14:56:38 2010 (r201895) +++ head/sys/nfsserver/nfs.h Sat Jan 9 15:31:27 2010 (r201896) @@ -240,6 +240,7 @@ extern int nfs_debug; #endif +void nfs_realign(struct mbuf **); struct mbuf *nfs_rephead(int, struct nfsrv_descript *, int, struct mbuf **, caddr_t *); void nfsm_srvfattr(struct nfsrv_descript *, struct vattr *, Modified: head/sys/nfsserver/nfs_fha.c ============================================================================== --- head/sys/nfsserver/nfs_fha.c Sat Jan 9 14:56:38 2010 (r201895) +++ head/sys/nfsserver/nfs_fha.c Sat Jan 9 15:31:27 2010 (r201896) @@ -158,9 +158,9 @@ SYSUNINIT(nfs_fha, SI_SUB_ROOT_CONF, SI_ static void fha_extract_info(struct svc_req *req, struct fha_info *i) { - struct mbuf *md = req->rq_args; + struct mbuf *md; nfsfh_t fh; - caddr_t dpos = mtod(md, caddr_t); + caddr_t dpos; static u_int64_t random_fh = 0; int error; int v3 = (req->rq_vers == 3); @@ -201,6 +201,10 @@ fha_extract_info(struct svc_req *req, st procnum == NFSPROC_NULL) goto out; + nfs_realign(&req->rq_args); + md = req->rq_args; + dpos = mtod(md, caddr_t); + /* Grab the filehandle. */ error = nfsm_srvmtofh_xx(&fh.fh_generic, v3, &md, &dpos); if (error) Modified: head/sys/nfsserver/nfs_srvkrpc.c ============================================================================== --- head/sys/nfsserver/nfs_srvkrpc.c Sat Jan 9 14:56:38 2010 (r201895) +++ head/sys/nfsserver/nfs_srvkrpc.c Sat Jan 9 15:31:27 2010 (r201896) @@ -266,7 +266,7 @@ nfs_rephead(int siz, struct nfsrv_descri * not occur with NFS/UDP and is supposed to only occassionally occur * with TCP. Use vfs.nfs.realign_count and realign_test to check this. */ -static void +void nfs_realign(struct mbuf **pm) /* XXX COMMON */ { struct mbuf *m; _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org" From: dfilter@FreeBSD.ORG (dfilter service) To: bug-followup@FreeBSD.org Cc: Subject: Re: sparc64/142102: commit references a PR Date: Sat, 9 Jan 2010 15:59:24 +0000 (UTC) Author: marius Date: Sat Jan 9 15:59:15 2010 New Revision: 201899 URL: http://svn.freebsd.org/changeset/base/201899 Log: Some style(9) fixes in order to fabricate a commit to denote that the commit message for r201896 actually should have read: As nfsm_srvmtofh_xx() assumes the 4-byte alignment required by XDR ensure the mbuf data is aligned accordingly by calling nfs_realign() in fha_extract_info(). This fix is orthogonal to the problem solved by r199274/r199284. PR: 142102 (second part) MFC after: 1 week Modified: head/sys/nfsserver/nfs.h head/sys/nfsserver/nfs_fha.c head/sys/nfsserver/nfs_srvkrpc.c Modified: head/sys/nfsserver/nfs.h ============================================================================== --- head/sys/nfsserver/nfs.h Sat Jan 9 15:43:47 2010 (r201898) +++ head/sys/nfsserver/nfs.h Sat Jan 9 15:59:15 2010 (r201899) @@ -82,14 +82,13 @@ #define IO_METASYNC 0 #endif - /* NFS state flags XXX -Wunused */ #define NFSRV_SNDLOCK 0x01000000 /* Send socket lock */ #define NFSRV_WANTSND 0x02000000 /* Want above */ /* - * Structures for the nfssvc(2) syscall. Not that anyone but nfsd and mount_nfs - * should ever try and use it. + * Structures for the nfssvc(2) syscall. Not that anyone but nfsd and + * mount_nfs should ever try and use it. */ /* Modified: head/sys/nfsserver/nfs_fha.c ============================================================================== --- head/sys/nfsserver/nfs_fha.c Sat Jan 9 15:43:47 2010 (r201898) +++ head/sys/nfsserver/nfs_fha.c Sat Jan 9 15:59:15 2010 (r201899) @@ -71,16 +71,17 @@ static struct fha_global { u_long hashmask; } g_fha; -/* - * These are the entries in the filehandle hash. They talk about a specific - * file, requests against which are being handled by one or more nfsds. We keep - * a chain of nfsds against the file. We only have more than one if reads are - * ongoing, and then only if the reads affect disparate regions of the file. +/* + * These are the entries in the filehandle hash. They talk about a specific + * file, requests against which are being handled by one or more nfsds. We + * keep a chain of nfsds against the file. We only have more than one if reads + * are ongoing, and then only if the reads affect disparate regions of the + * file. * - * In general, we want to assign a new request to an existing nfsd if it is - * going to contend with work happening already on that nfsd, or if the - * operation is a read and the nfsd is already handling a proximate read. We - * do this to avoid jumping around in the read stream unnecessarily, and to + * In general, we want to assign a new request to an existing nfsd if it is + * going to contend with work happening already on that nfsd, or if the + * operation is a read and the nfsd is already handling a proximate read. We + * do this to avoid jumping around in the read stream unnecessarily, and to * avoid contention between threads over single files. */ struct fha_hash_entry { @@ -101,7 +102,7 @@ struct fha_info { }; static int fhe_stats_sysctl(SYSCTL_HANDLER_ARGS); - + static void nfs_fha_init(void *foo) { @@ -136,7 +137,7 @@ nfs_fha_init(void *foo) &fha_ctls.max_reqs_per_nfsd, 0, "Maximum requests that " "single nfsd thread should be working on at any time"); - SYSCTL_ADD_OID(&fha_clist, SYSCTL_STATIC_CHILDREN(_vfs_nfsrv_fha), + SYSCTL_ADD_OID(&fha_clist, SYSCTL_STATIC_CHILDREN(_vfs_nfsrv_fha), OID_AUTO, "fhe_stats", CTLTYPE_STRING | CTLFLAG_RD, 0, 0, fhe_stats_sysctl, "A", ""); } @@ -151,7 +152,7 @@ nfs_fha_uninit(void *foo) SYSINIT(nfs_fha, SI_SUB_ROOT_CONF, SI_ORDER_ANY, nfs_fha_init, NULL); SYSUNINIT(nfs_fha, SI_SUB_ROOT_CONF, SI_ORDER_ANY, nfs_fha_uninit, NULL); -/* +/* * This just specifies that offsets should obey affinity when within * the same 1Mbyte (1<<20) chunk for the file (reads only for now). */ @@ -167,18 +168,18 @@ fha_extract_info(struct svc_req *req, st u_int32_t *tl; rpcproc_t procnum; - /* - * We start off with a random fh. If we get a reasonable - * procnum, we set the fh. If there's a concept of offset + /* + * We start off with a random fh. If we get a reasonable + * procnum, we set the fh. If there's a concept of offset * that we're interested in, we set that. */ i->fh = ++random_fh; i->offset = 0; i->locktype = LK_EXCLUSIVE; - + /* * Extract the procnum and convert to v3 form if necessary, - * taking care to deal with out-of-range procnums. Caller will + * taking care to deal with out-of-range procnums. Caller will * ensure that rq_vers is either 2 or 3. */ procnum = req->rq_proc; @@ -188,19 +189,19 @@ fha_extract_info(struct svc_req *req, st procnum = nfsrv_nfsv3_procid[procnum]; } - /* - * We do affinity for most. However, we divide a realm of affinity - * by file offset so as to allow for concurrent random access. We - * only do this for reads today, but this may change when IFS supports + /* + * We do affinity for most. However, we divide a realm of affinity + * by file offset so as to allow for concurrent random access. We + * only do this for reads today, but this may change when IFS supports * efficient concurrent writes. */ if (procnum == NFSPROC_FSSTAT || procnum == NFSPROC_FSINFO || procnum == NFSPROC_PATHCONF || - procnum == NFSPROC_NOOP || + procnum == NFSPROC_NOOP || procnum == NFSPROC_NULL) goto out; - + nfs_realign(&req->rq_args); md = req->rq_args; dpos = mtod(md, caddr_t); @@ -270,8 +271,8 @@ fha_hash_entry_new(u_int64_t fh) e->num_writes = 0; e->num_threads = 0; LIST_INIT(&e->threads); - - return e; + + return (e); } static void @@ -296,10 +297,9 @@ fha_hash_entry_lookup(SVCPOOL *pool, u_i { struct fha_hash_entry *fhe, *new_fhe; - LIST_FOREACH(fhe, &g_fha.hashtable[fh % g_fha.hashmask], link) { + LIST_FOREACH(fhe, &g_fha.hashtable[fh % g_fha.hashmask], link) if (fhe->fh == fh) break; - } if (!fhe) { /* Allocate a new entry. */ @@ -308,25 +308,24 @@ fha_hash_entry_lookup(SVCPOOL *pool, u_i mtx_lock(&pool->sp_lock); /* Double-check to make sure we still need the new entry. */ - LIST_FOREACH(fhe, &g_fha.hashtable[fh % g_fha.hashmask], link) { + LIST_FOREACH(fhe, &g_fha.hashtable[fh % g_fha.hashmask], link) if (fhe->fh == fh) break; - } if (!fhe) { fhe = new_fhe; LIST_INSERT_HEAD(&g_fha.hashtable[fh % g_fha.hashmask], fhe, link); - } else { + } else fha_hash_entry_destroy(new_fhe); - } } - return fhe; + return (fhe); } static void fha_hash_entry_add_thread(struct fha_hash_entry *fhe, SVCTHREAD *thread) { + LIST_INSERT_HEAD(&fhe->threads, thread, st_alink); fhe->num_threads++; } @@ -339,7 +338,7 @@ fha_hash_entry_remove_thread(struct fha_ fhe->num_threads--; } -/* +/* * Account for an ongoing operation associated with this file. */ static void @@ -365,7 +364,7 @@ get_idle_thread(SVCPOOL *pool) } -/* +/* * Get the service thread currently associated with the fhe that is * appropriate to handle this operation. */ @@ -387,15 +386,15 @@ fha_hash_entry_choose_thread(SVCPOOL *po /* If there are any writes in progress, use the first thread. */ if (fhe->num_writes) { #if 0 - ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, + ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, "fha: %p(%d)w", thread, req_count); #endif return (thread); } - /* - * Check for read locality, making sure that we won't - * exceed our per-thread load limit in the process. + /* + * Check for read locality, making sure that we won't + * exceed our per-thread load limit in the process. */ offset1 = i->offset >> fha_ctls.bin_shift; offset2 = STAILQ_FIRST(&thread->st_reqs)->rq_p3 @@ -404,21 +403,21 @@ fha_hash_entry_choose_thread(SVCPOOL *po if ((fha_ctls.max_reqs_per_nfsd == 0) || (req_count < fha_ctls.max_reqs_per_nfsd)) { #if 0 - ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, + ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, "fha: %p(%d)r", thread, req_count); #endif return (thread); } } - /* + /* * We don't have a locality match, so skip this thread, - * but keep track of the most attractive thread in case + * but keep track of the most attractive thread in case * we need to come back to it later. */ #if 0 - ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, - "fha: %p(%d)s off1 %llu off2 %llu", thread, + ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, + "fha: %p(%d)s off1 %llu off2 %llu", thread, req_count, offset1, offset2); #endif if ((min_thread == NULL) || (req_count < min_count)) { @@ -427,38 +426,38 @@ fha_hash_entry_choose_thread(SVCPOOL *po } } - /* - * We didn't find a good match yet. See if we can add + /* + * We didn't find a good match yet. See if we can add * a new thread to this file handle entry's thread list. */ - if ((fha_ctls.max_nfsds_per_fh == 0) || + if ((fha_ctls.max_nfsds_per_fh == 0) || (fhe->num_threads < fha_ctls.max_nfsds_per_fh)) { - /* - * We can add a new thread, so try for an idle thread - * first, and fall back to this_thread if none are idle. + /* + * We can add a new thread, so try for an idle thread + * first, and fall back to this_thread if none are idle. */ if (STAILQ_EMPTY(&this_thread->st_reqs)) { thread = this_thread; #if 0 - ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, + ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, "fha: %p(%d)t", thread, thread->st_reqcount); #endif } else if ((thread = get_idle_thread(pool))) { #if 0 - ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, + ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, "fha: %p(%d)i", thread, thread->st_reqcount); #endif - } else { + } else { thread = this_thread; #if 0 - ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, + ITRACE_CURPROC(ITRACE_NFS, ITRACE_INFO, "fha: %p(%d)b", thread, thread->st_reqcount); #endif } fha_hash_entry_add_thread(fhe, thread); } else { - /* - * We don't want to use any more threads for this file, so + /* + * We don't want to use any more threads for this file, so * go back to the most attractive nfsd we're already using. */ thread = min_thread; @@ -467,8 +466,8 @@ fha_hash_entry_choose_thread(SVCPOOL *po return (thread); } -/* - * After getting a request, try to assign it to some thread. Usually we +/* + * After getting a request, try to assign it to some thread. Usually we * handle it ourselves. */ SVCTHREAD * @@ -491,16 +490,16 @@ fha_assign(SVCTHREAD *this_thread, struc pool = req->rq_xprt->xp_pool; fha_extract_info(req, &i); - /* - * We save the offset associated with this request for later + /* + * We save the offset associated with this request for later * nfsd matching. */ fhe = fha_hash_entry_lookup(pool, i.fh); req->rq_p1 = fhe; req->rq_p2 = i.locktype; req->rq_p3 = i.offset; - - /* + + /* * Choose a thread, taking into consideration locality, thread load, * and the number of threads already working on this file. */ @@ -511,8 +510,8 @@ fha_assign(SVCTHREAD *this_thread, struc return (thread); } -/* - * Called when we're done with an operation. The request has already +/* + * Called when we're done with an operation. The request has already * been de-queued. */ void Modified: head/sys/nfsserver/nfs_srvkrpc.c ============================================================================== --- head/sys/nfsserver/nfs_srvkrpc.c Sat Jan 9 15:43:47 2010 (r201898) +++ head/sys/nfsserver/nfs_srvkrpc.c Sat Jan 9 15:59:15 2010 (r201899) @@ -187,19 +187,18 @@ nfssvc_nfsserver(struct thread *td, stru } error = nfssvc_addsock(fp, td); fdrop(fp, td); - } else if (uap->flag & NFSSVC_OLDNFSD) { + } else if (uap->flag & NFSSVC_OLDNFSD) error = nfssvc_nfsd(td, NULL); - } else if (uap->flag & NFSSVC_NFSD) { - if (!uap->argp) + else if (uap->flag & NFSSVC_NFSD) { + if (!uap->argp) return (EINVAL); error = copyin(uap->argp, (caddr_t)&nfsdarg, sizeof(nfsdarg)); if (error) return (error); error = nfssvc_nfsd(td, &nfsdarg); - } else { + } else error = ENXIO; - } return (error); } @@ -447,9 +446,8 @@ nfssvc_addsock(struct file *fp, struct t siz = sb_max_adj; error = soreserve(so, siz, siz); - if (error) { + if (error) return (error); - } /* * Steal the socket from userland so that it doesn't close @@ -471,7 +469,7 @@ nfssvc_addsock(struct file *fp, struct t } /* - * Called by nfssvc() for nfsds. Just loops around servicing rpc requests + * Called by nfssvc() for nfsds. Just loops around servicing rpc requests * until it is killed by a signal. */ static int @@ -496,9 +494,9 @@ nfssvc_nfsd(struct thread *td, struct nf #endif /* - * Only the first nfsd actually does any work. The RPC code - * adds threads to it as needed. Any extra processes offered - * by nfsd just exit. If nfsd is new enough, it will call us + * Only the first nfsd actually does any work. The RPC code + * adds threads to it as needed. Any extra processes offered + * by nfsd just exit. If nfsd is new enough, it will call us * once with a structure that specifies how many threads to * use. */ @@ -522,7 +520,7 @@ nfssvc_nfsd(struct thread *td, struct nf nfsrv_pool->sp_minthreads = 4; nfsrv_pool->sp_maxthreads = 4; } - + svc_run(nfsrv_pool); #ifdef KGSSAPI @@ -541,7 +539,7 @@ nfssvc_nfsd(struct thread *td, struct nf /* * Size the NFS server's duplicate request cache at 1/2 the - * nmbclusters, floating within a (64, 2048) range. This is to + * nmbclusters, floating within a (64, 2048) range. This is to * prevent all mbuf clusters being tied up in the NFS dupreq * cache for small values of nmbclusters. */ _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org" From: dfilter@FreeBSD.ORG (dfilter service) To: bug-followup@FreeBSD.org Cc: Subject: Re: sparc64/142102: commit references a PR Date: Sat, 16 Jan 2010 12:16:53 +0000 (UTC) Author: marius Date: Sat Jan 16 12:16:38 2010 New Revision: 202438 URL: http://svn.freebsd.org/changeset/base/202438 Log: MFC: r201896 As nfsm_srvmtofh_xx() assumes the 4-byte alignment required by XDR ensure the mbuf data is aligned accordingly by calling nfs_realign() in fha_extract_info(). This fix is orthogonal to the problem solved by r199274/r199284 (MFC'ed to stable/8 in r199733). PR: 142102 (second part) Modified: stable/8/sys/nfsserver/nfs.h stable/8/sys/nfsserver/nfs_fha.c stable/8/sys/nfsserver/nfs_srvkrpc.c Directory Properties: stable/8/sys/ (props changed) stable/8/sys/amd64/include/xen/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) stable/8/sys/contrib/dev/acpica/ (props changed) stable/8/sys/contrib/pf/ (props changed) stable/8/sys/dev/xen/xenpci/ (props changed) Modified: stable/8/sys/nfsserver/nfs.h ============================================================================== --- stable/8/sys/nfsserver/nfs.h Sat Jan 16 09:52:49 2010 (r202437) +++ stable/8/sys/nfsserver/nfs.h Sat Jan 16 12:16:38 2010 (r202438) @@ -240,6 +240,7 @@ extern int nfs_debug; #endif +void nfs_realign(struct mbuf **); struct mbuf *nfs_rephead(int, struct nfsrv_descript *, int, struct mbuf **, caddr_t *); void nfsm_srvfattr(struct nfsrv_descript *, struct vattr *, Modified: stable/8/sys/nfsserver/nfs_fha.c ============================================================================== --- stable/8/sys/nfsserver/nfs_fha.c Sat Jan 16 09:52:49 2010 (r202437) +++ stable/8/sys/nfsserver/nfs_fha.c Sat Jan 16 12:16:38 2010 (r202438) @@ -158,9 +158,9 @@ SYSUNINIT(nfs_fha, SI_SUB_ROOT_CONF, SI_ static void fha_extract_info(struct svc_req *req, struct fha_info *i) { - struct mbuf *md = req->rq_args; + struct mbuf *md; nfsfh_t fh; - caddr_t dpos = mtod(md, caddr_t); + caddr_t dpos; static u_int64_t random_fh = 0; int error; int v3 = (req->rq_vers == 3); @@ -201,6 +201,10 @@ fha_extract_info(struct svc_req *req, st procnum == NFSPROC_NULL) goto out; + nfs_realign(&req->rq_args); + md = req->rq_args; + dpos = mtod(md, caddr_t); + /* Grab the filehandle. */ error = nfsm_srvmtofh_xx(&fh.fh_generic, v3, &md, &dpos); if (error) Modified: stable/8/sys/nfsserver/nfs_srvkrpc.c ============================================================================== --- stable/8/sys/nfsserver/nfs_srvkrpc.c Sat Jan 16 09:52:49 2010 (r202437) +++ stable/8/sys/nfsserver/nfs_srvkrpc.c Sat Jan 16 12:16:38 2010 (r202438) @@ -266,7 +266,7 @@ nfs_rephead(int siz, struct nfsrv_descri * not occur with NFS/UDP and is supposed to only occassionally occur * with TCP. Use vfs.nfs.realign_count and realign_test to check this. */ -static void +void nfs_realign(struct mbuf **pm) /* XXX COMMON */ { struct mbuf *m; _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org" State-Changed-From-To: open->closed State-Changed-By: marius State-Changed-When: Mon Dec 12 19:00:33 UTC 2011 State-Changed-Why: Close as this PR has been fully dealt with. It was just kept open due to a possible release errata, which then turned out to be infeasible. http://www.freebsd.org/cgi/query-pr.cgi?pr=142102 >Unformatted: