From nobody@FreeBSD.org Tue Mar 2 09:31:35 2004
Return-Path:
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 6790516A4CE
    for ; Tue, 2 Mar 2004 09:31:35 -0800 (PST)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 4AEDB43D2D
    for ; Tue, 2 Mar 2004 09:31:35 -0800 (PST)
    (envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
    by www.freebsd.org (8.12.10/8.12.10) with ESMTP id i22HVZ72062240
    for ; Tue, 2 Mar 2004 09:31:35 -0800 (PST)
    (envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
    by www.freebsd.org (8.12.10/8.12.10/Submit) id i22HVZ0J062239;
    Tue, 2 Mar 2004 09:31:35 -0800 (PST)
    (envelope-from nobody)
Message-Id: <200403021731.i22HVZ0J062239@www.freebsd.org>
Date: Tue, 2 Mar 2004 09:31:35 -0800 (PST)
From: Edmond Baroud
To: freebsd-gnats-submit@FreeBSD.org
Subject: nfsd crashes system
X-Send-Pr-Version: www-2.3

>Number: 63649
>Category: kern
>Synopsis: [nfs] nfsd crashes system
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: closed
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Tue Mar 02 09:40:11 PST 2004
>Closed-Date: Sun Sep 05 07:25:26 GMT 2004
>Last-Modified: Sun Sep 05 07:25:26 GMT 2004
>Originator: Edmond Baroud
>Release: 5.2.1-REL
>Organization: N/A
>Environment:
FreeBSD xxx.xxx.xxx 5.2.1-RELEASE FreeBSD 5.2.1-RELEASE #0: Wed Feb 25 15:38:11 EST 2004 root@xxx.xxx.xxx:/usr/src/sys/i386/compile/NEO i386
>Description:
shared /cdrom in /etc/exports:
/cdrom -ro -mapall=nobody

ran:
#nfsd -u -t -n 4
#mountd -r
#/usr/sbin/rpcbind
#/usr/sbin/rpc.statd
#/usr/sbin/rpc.lockd

now from another box, when I try to nfs mount:
#mount -t nfs myhost:/cdrom /some_dir

my system hangs for 5-10 secs then reboots.
Here's the kernel dump:

Mar 2 09:34:08 neo kernel: cd9660: Joliet Extension (Level 1)
Mar 2 09:37:00 neo syslogd: kernel boot file is /boot/kernel/kernel
Mar 2 09:37:00 neo kernel: fhtovp: lbn exceed volume space 0
Mar 2 09:37:00 neo kernel:
Mar 2 09:37:00 neo kernel:
Mar 2 09:37:00 neo kernel: Fatal trap 12: page fault while in kernel mode
Mar 2 09:37:00 neo kernel: cpuid = 0; apic id = 00
Mar 2 09:37:00 neo kernel: fault virtual address = 0x1c
Mar 2 09:37:00 neo kernel: fault code = supervisor write, page not present
Mar 2 09:37:00 neo kernel: instruction pointer = 0x8:0xc05889f5
Mar 2 09:37:00 neo kernel: stack pointer = 0x10:0xe767b8ac
Mar 2 09:37:00 neo kernel: frame pointer = 0x10:0xe767b8bc
Mar 2 09:37:00 neo kernel: code segment = base 0x0, limit 0xfffff, type 0x1b
Mar 2 09:37:00 neo kernel: = DPL 0, pres 1, def32 1, gran 1
Mar 2 09:37:00 neo kernel: processor eflags = interrupt enabled, resume, IOPL = 0
Mar 2 09:37:00 neo kernel: current process = 418 (nfsd)
Mar 2 09:37:00 neo kernel: trap number = 12
Mar 2 09:37:00 neo kernel: panic: page fault
Mar 2 09:37:00 neo kernel: cpuid = 0;
Mar 2 09:37:00 neo kernel:
Mar 2 09:37:00 neo kernel: syncing disks, buffers remaining... 5101 5101 5098 5098 5098 5098 5098 5098 5098 5098 5098 5098 5098 5098 5098 5098 5098 5098 5098 5098 5098 5098
Mar 2 09:37:00 neo kernel: Copyright (c) 1992-2004 The FreeBSD Project.
Mar 2 09:37:00 neo kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Mar 2 09:37:00 neo kernel: The Regents of the University of California. All rights reserved.
Mar 2 09:37:00 neo kernel: FreeBSD 5.2.1-RELEASE #0: Wed Feb 25 15:38:11 EST 2004

My PC is an HP D530, supposedly with hyperthreading, so I built an SMP kernel, but it looks like hyperthreading is deactivated or not functioning on my system. It's running an old BIOS version, and the BIOS is locked too (company policy).
I also have ACPI disabled (in /boot/), since it caused me some problems with my bge onboard network interface (it keeps resetting).
>How-To-Repeat:
mount -t nfs myhost:/cdrom /some_dir
>Fix:
>Release-Note:
>Audit-Trail:

From: Edmond Baroud
To: freebsd-gnats-submit@FreeBSD.org
Cc:
Subject: Re: kern/63649: nfsd crashes system
Date: Tue, 2 Mar 2004 14:28:20 -0500

I have rebooted my box to test other stuff and gave this nfs mount a try, and
I couldn't reproduce the problem. The only change I can see was
in /etc/exports:
/cdrom -ro
instead of:
/cdrom -ro -mapall=nobody

If you guys need more info/logs let me know.

Edmond.

State-Changed-From-To: open->feedback
State-Changed-By: kris
State-Changed-When: Sat Mar 6 00:53:27 PST 2004
State-Changed-Why:
Traceback requested

http://www.freebsd.org/cgi/query-pr.cgi?pr=63649

From: Kris Kennaway
To: Edmond Baroud
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/63649: nfsd crashes system
Date: Sat, 6 Mar 2004 00:52:47 -0800

On Tue, Mar 02, 2004 at 11:30:18AM -0800, Edmond Baroud wrote:
> The following reply was made to PR kern/63649; it has been noted by GNATS.
>
> From: Edmond Baroud
> To: freebsd-gnats-submit@FreeBSD.org
> Cc:
> Subject: Re: kern/63649: nfsd crashes system
> Date: Tue, 2 Mar 2004 14:28:20 -0500
>
> I have rebooted my box to test other stuff and gave this nfs mount a try, and
> I couldn't reproduce the problem. The only change I can see was
> in /etc/exports:
> /cdrom -ro
> instead of:
> /cdrom -ro -mapall=nobody
>
> If you guys need more info/logs let me know.

Please verify that you do not have stale kernel modules installed, and
obtain a debugging traceback as described in

http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html

If you're unable to reproduce this, the PR should probably just be closed.
Kris

From: Edmond Baroud
To: Kris Kennaway
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/63649: nfsd crashes system
Date: Mon, 8 Mar 2004 12:49:16 -0500

Hi Kris,

I was able to reproduce the problem, but I'm no developer, so when it comes to
using gdb I suck :) I'm sure nobody wants me to send 2 x 1G core files, so
could you please tell me what gdb options you want me to submit?

Updates on debugging/investigation:
- The crash only happens when mounting Joliet extensions Level "1". Tried
  Level 3 and Rock Ridge and both worked well, no crash!
- I have 2 core files now, one from (nfsd), and the other from (nfsd) AND
  (g_up)? See below:

--8<--cut-here--8<--

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x1c
fault code = supervisor write, page not present
instruction pointer = 0x8:0xc05889f5
stack pointer = 0x10:0xdf5848ac
frame pointer = 0x10:0xdf5848bc
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 427 (nfsd)
trap number = 12
panic: page fault
cpuid = 0;

syncing disks, buffers remaining...
246 246 239
panic: bremfree: removing a buffer not on a queue
cpuid = 0;
Uptime: 4m30s
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x24
fault code = supervisor read, page not present
instruction pointer = 0x8:0xc05b4ade
stack pointer = 0x10:0xde2f5ab8
frame pointer = 0x10:0xde2f5adc
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = 3 (g_up)
trap number = 12
panic: page fault
cpuid = 0;
Uptime: 4m30s
Dumping 1016 MB
16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 320 336 352 368 384 400 416 432 448 464 480 496 512 528 544 560 576 592 608 624 640 656 672 688 704 720 736 752 768 784 800 816 832 848 864 880 896 912 928 944 960 976 992 1008
---
Reading symbols from /boot/kernel/fade_saver.ko...(no debugging symbols found)...done.
Loaded symbols for /boot/kernel/fade_saver.ko
Reading symbols from /usr/src/sys/i386/compile/NEO/modules/usr/src/sys/modules/linux/linux.ko.debug...done.
Loaded symbols for /usr/src/sys/i386/compile/NEO/modules/usr/src/sys/modules/linux/linux.ko.debug
#0 0xc059113b in doadump ()
(kgdb) list *0xc05b4ade
No source file for address 0xc05b4ade.
(kgdb) up 10
#10 0xc0591a0e in panic ()
(kgdb) q

--8<--cut-here--8<--

Some debugging I tried from the gdb instruction page:

1)
root@neo:src/sys/NEO/ > pwd
/usr/obj/usr/src/sys/NEO
root@neo:src/sys/NEO/ > gdb -k kernel.debug /var/crash/vmcore.0
GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
/var/crash/vmcore.0: Unknown error: 0.
(kgdb) q

2)
root@neo:src/sys/NEO/ > gdb -k /boot/kernel/kernel /var/crash/vmcore.0
GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...(no debugging symbols found)...
panic: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x1c
fault code = supervisor write, page not present
instruction pointer = 0x8:0xc05889f5
stack pointer = 0x10:0xdf5878ac
frame pointer = 0x10:0xdf5878bc
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 427 (nfsd)
trap number = 12
panic: page fault
cpuid = 0;

syncing disks, buffers remaining... 7068 7068 7065 7064 7064 7064 7064 7064 7064 7064 7064 7064 7064 7064 7064 7064 7064 7064 7064 7064 7064 7064 7064
giving up on 6944 buffers
Uptime: 3m45s
Dumping 1016 MB
16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 320 336 352 368 384 400 416 432 448 464 480 496 512 528 544 560 576 592 608 624 640 656 672 688 704 720 736 752 768 784 800 816 832 848 864 880 896 912 928 944 960 976 992 1008
---
Reading symbols from /boot/kernel/fade_saver.ko...(no debugging symbols found)...done.
Loaded symbols for /boot/kernel/fade_saver.ko
Reading symbols from /usr/src/sys/i386/compile/NEO/modules/usr/src/sys/modules/linux/linux.ko.debug...done.
Loaded symbols for /usr/src/sys/i386/compile/NEO/modules/usr/src/sys/modules/linux/linux.ko.debug
#0 0xc059113b in doadump ()
(kgdb) list *0xc05889f5
No source file for address 0xc05889f5.
(kgdb) backtrace
#0 0xc059113b in doadump ()
#1 0xc0591697 in boot ()
#2 0xc0591a0e in panic ()
#3 0xc07437dc in trap_fatal ()
#4 0xc0743482 in trap_pfault ()
#5 0xc07430ad in trap ()
#6 0xc07307c8 in calltrap ()
#7 0xc05e9856 in vput ()
#8 0xc06ac8dc in nfsrv_readdirplus ()
#9 0xc06b1bba in nfssvc_nfsd ()
#10 0xc06b158d in nfssvc ()
#11 0xc0743b20 in syscall ()
#12 0xc073081d in Xint0x80_syscall ()
---Can't read userspace from dump, or kernel process---

(kgdb) up 10
#10 0xc06b158d in nfssvc ()
(kgdb) list
1 {standard input}: No such file or directory.
in {standard input}
(kgdb) q

On March 6, 2004 03:52 am, Kris Kennaway wrote:
> On Tue, Mar 02, 2004 at 11:30:18AM -0800, Edmond Baroud wrote:
> > The following reply was made to PR kern/63649; it has been noted by
> > GNATS.
> >
> > From: Edmond Baroud
> > To: freebsd-gnats-submit@FreeBSD.org
> > Cc:
> > Subject: Re: kern/63649: nfsd crashes system
> > Date: Tue, 2 Mar 2004 14:28:20 -0500
> >
> > I have rebooted my box to test other stuff and gave this nfs mount a
> > try, and I couldn't reproduce the problem. The only change I can see was
> > in /etc/exports:
> > /cdrom -ro
> > instead of:
> > /cdrom -ro -mapall=nobody
> >
> > If you guys need more info/logs let me know.
>
> Please verify that you do not have stale kernel modules installed, and
> obtain a debugging traceback as described in
>
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
>
> If you're unable to reproduce this, the PR should probably just be closed.
>
> Kris

From: Kris Kennaway
To: Edmond Baroud
Cc: Kris Kennaway , freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/63649: nfsd crashes system
Date: Mon, 8 Mar 2004 13:40:20 -0800

On Mon, Mar 08, 2004 at 12:49:16PM -0500, Edmond Baroud wrote:
> Hi Kris,
>
> I was able to reproduce the problem, but I'm no developer, so when it comes to
> using gdb I suck :) I'm sure nobody wants me to send 2 x 1G core files, so
> could you please tell me what gdb options you want me to submit?

I gave you the URL that explains how to compile your kernel with
debugging symbols and extract a full traceback with gdb -k.

Kris

From: Kris Kennaway
To: Edmond Baroud
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/63649: nfsd crashes system
Date: Tue, 9 Mar 2004 13:14:03 -0800

On Tue, Mar 09, 2004 at 08:58:19AM -0500, Edmond Baroud wrote:
> My kernel is already compiled with "makeoptions DEBUG=-g"
> the only info I could extract is what I sent in the last email.
> I'm unable to get more "traceback" from gdb.

Which kernel did you run gdb against: the installed one (which has the
debug symbols stripped out and won't work), or the kernel.debug in
your kernel build directory?

Kris

P.S. Please don't top-post, and don't drop the CC list or the
correspondence won't be archived in your PR so that others can help
you with it as well.

From: Edmond Baroud
To: Kris Kennaway
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/63649: nfsd crashes system
Date: Wed, 10 Mar 2004 10:49:55 -0500

Against both. The stripped one doesn't show any symbols, of course, but when I
run it on the kernel.debug one I get this, which I sent in my last email:

10:42 root@neo:src/sys/NEO/ > pwd
/usr/obj/usr/src/sys/NEO
10:42 root@neo:src/sys/NEO/ > gdb -k kernel.debug /var/crash/vmcore.0
GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
/var/crash/vmcore.0: Unknown error: 0.
(kgdb) where
No stack.
(kgdb) backtrace
No stack.

On March 9, 2004 04:14 pm, Kris Kennaway wrote:
> On Tue, Mar 09, 2004 at 08:58:19AM -0500, Edmond Baroud wrote:
> > My kernel is already compiled with "makeoptions DEBUG=-g"
> > the only info I could extract is what I sent in the last email.
> > I'm unable to get more "traceback" from gdb.
>
> Which kernel did you run gdb against: the installed one (which has the
> debug symbols stripped out and won't work), or the kernel.debug in
> your kernel build directory?
>
> Kris
>
> P.S. Please don't top-post, and don't drop the CC list or the
> correspondence won't be archived in your PR so that others can help
> you with it as well.

State-Changed-From-To: feedback->closed
State-Changed-By: tjr
State-Changed-When: Sun Sep 5 07:25:01 GMT 2004
State-Changed-Why:
Duplicate of kern/63446.

http://www.freebsd.org/cgi/query-pr.cgi?pr=63649
>Unformatted: