From nobody@FreeBSD.org Tue Oct 21 21:31:31 2008
Return-Path:
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id BB478106566B
	for ; Tue, 21 Oct 2008 21:31:31 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id AA8BE8FC17
	for ; Tue, 21 Oct 2008 21:31:31 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id m9LLVVbm029888
	for ; Tue, 21 Oct 2008 21:31:31 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id m9LLVVk3029887;
	Tue, 21 Oct 2008 21:31:31 GMT
	(envelope-from nobody)
Message-Id: <200810212131.m9LLVVk3029887@www.freebsd.org>
Date: Tue, 21 Oct 2008 21:31:31 GMT
From: Dwayne Hart
To: freebsd-gnats-submit@FreeBSD.org
Subject: system failure on removing two drives
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         128282
>Category:       kern
>Synopsis:       [mpt] system failure on removing two drives
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    gavin
>State:          feedback
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Oct 21 21:40:00 UTC 2008
>Closed-Date:
>Last-Modified:  Sun Aug 14 23:40:04 UTC 2011
>Originator:     Dwayne Hart
>Release:        Production (Legacy) Release 6.3
>Organization:
SSI Micro Ltd.
>Environment:
FreeBSD eggo.ssimicro.com 6.3-RELEASE FreeBSD 6.3-RELEASE #0: Wed Jan 16 04:45:45 UTC 2008 root@dessler.cse.buffalo.edu:/usr/obj/usr/src/sys/SMP i386
>Description:
While testing new hardware, I removed two hot-swappable drives from a running 6.3 server. One drive was part of a software RAID array, while the other was not in use by the system. As a test of the hardware controller, I put each drive in the other's slot. The machine became unresponsive.
I had to use the reset key to bring the system back on its feet. The OS performed a file system check of the various drives and started rebuilding the software RAID array, marking the removed drive that had been part of a functioning array as 'dirty', which was to be expected. I'm not sure whether the problem lies with the mpt0 driver.
>How-To-Repeat:
Remove two disks from an operational system at the same time.
>Fix:

>Release-Note:
>Audit-Trail:

State-Changed-From-To: open->feedback
State-Changed-By: gavin
State-Changed-When: Thu Oct 23 15:10:52 UTC 2008
State-Changed-Why:
To submitter: could you please provide a dmesg from this system?

Responsible-Changed-From-To: freebsd-i386->gavin
Responsible-Changed-By: gavin
Responsible-Changed-When: Thu Oct 23 15:10:52 UTC 2008
Responsible-Changed-Why:
Track

http://www.freebsd.org/cgi/query-pr.cgi?pr=128282

From: Gavin Atkinson
To: bug-followup@FreeBSD.org
Cc:
Subject: Re: kern/128282: [mpt] system failure on removing two drives
Date: Mon, 23 Feb 2009 16:45:32 +0000

-------- Forwarded Message --------
From: Dwayne Hart
Subject: Re: kern/128282: [mpt] system failure on removing two drives

Hi,

As per your request, I've attached a verbose copy of the machine's dmesg.

Cheers,
Dwayne

plain text document attachment (dmesg.23-oct-2008)
Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.3-RELEASE #0: Wed Jan 16 04:45:45 UTC 2008
    root@dessler.cse.buffalo.edu:/usr/obj/usr/src/sys/SMP
acpi_alloc_wakeup_handler: can't alloc wake memory
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU E5405 @ 2.00GHz (2000.08-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x10676 Stepping = 6
  Features=0xbfebfbff
  Features2=0xce33d>
  AMD Features=0x20100000
  AMD Features2=0x1
  Cores per package: 4
real memory = 3488940032 (3327 MB)
avail memory = 3409944576 (3251 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 2
 cpu3 (AP): APIC ID: 3
 cpu4 (AP): APIC ID: 4
 cpu5 (AP): APIC ID: 5
 cpu6 (AP): APIC ID: 6
 cpu7 (AP): APIC ID: 7
ioapic0 irqs 0-23 on motherboard
ioapic1 irqs 24-47 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
hptrr: HPT RocketRAID controller driver v1.1 (Jan 16 2008 04:43:12)
acpi0: on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
acpi_hpet0: iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
cpu0: on acpi0
acpi_throttle0: on cpu0
cpu1: on acpi0
acpi_throttle1: on cpu1
acpi_throttle1: failed to attach P_CNT
device_attach: acpi_throttle1 attach returned 6
cpu2: on acpi0
acpi_throttle2: on cpu2
acpi_throttle2: failed to attach P_CNT
device_attach: acpi_throttle2 attach returned 6
cpu3: on acpi0
acpi_throttle3: on cpu3
acpi_throttle3: failed to attach P_CNT
device_attach: acpi_throttle3 attach returned 6
cpu4: on acpi0
acpi_throttle4: on cpu4
acpi_throttle4: failed to attach P_CNT
device_attach: acpi_throttle4 attach returned 6
cpu5: on acpi0
acpi_throttle5: on cpu5
acpi_throttle5: failed to attach P_CNT
device_attach: acpi_throttle5 attach returned 6
cpu6: on acpi0
acpi_throttle6: on cpu6
acpi_throttle6: failed to attach P_CNT
device_attach: acpi_throttle6 attach returned 6
cpu7: on acpi0
acpi_throttle7: on cpu7
acpi_throttle7: failed to attach P_CNT
device_attach: acpi_throttle7 attach returned 6
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pcib1: at device 2.0 on pci0
pci1: on pcib1
pcib2: irq 16 at device 0.0 on pci1
pci2: on pcib2
pcib3: irq 16 at device 0.0 on pci2
pci3: on pcib3
pcib4: at device 0.0 on pci3
pci4: on pcib4
pcib5: at device 0.2 on pci3
pci5: on pcib5
pcib6: irq 18 at device 2.0 on pci2
pci6: on pcib6
em0: port 0x2000-0x201f mem 0xd8220000-0xd823ffff,0xd8200000-0xd821ffff irq 18 at device 0.0 on pci6
em0: Ethernet address: 00:30:48:c4:cb:7a
em1: port 0x2020-0x203f mem 0xd8260000-0xd827ffff,0xd8240000-0xd825ffff irq 19 at device 0.1 on pci6
em1: Ethernet address: 00:30:48:c4:cb:7b
pcib7: at device 0.3 on pci1
pci7: on pcib7
pcib8: at device 4.0 on pci0
pci8: on pcib8
pcib9: at device 6.0 on pci0
pci9: on pcib9
mpt0: port 0x3000-0x30ff mem 0xd8410000-0xd8413fff,0xd8400000-0xd840ffff irq 18 at device 0.0 on pci9
mpt0: [GIANT-LOCKED]
mpt0: MPI Version=1.5.14.0
mpt0: mpt_cam_event: 0x16
mpt0: Unhandled Event Notify Frame. Event 0x16 (ACK not required).
mpt0: mpt_cam_event: 0x12
mpt0: Unhandled Event Notify Frame. Event 0x12 (ACK not required).
mpt0: mpt_cam_event: 0x12
mpt0: Unhandled Event Notify Frame. Event 0x12 (ACK not required).
mpt0: mpt_cam_event: 0x12
mpt0: Unhandled Event Notify Frame. Event 0x12 (ACK not required).
mpt0: mpt_cam_event: 0x12
mpt0: Unhandled Event Notify Frame. Event 0x12 (ACK not required).
mpt0: mpt_cam_event: 0x16
mpt0: Unhandled Event Notify Frame. Event 0x16 (ACK not required).
mpt0: mpt_cam_event: 0x16
mpt0: Unhandled Event Notify Frame. Event 0x16 (ACK not required).
mpt0: mpt_cam_event: 0x16
mpt0: Unhandled Event Notify Frame. Event 0x16 (ACK not required).
pci0: at device 8.0 (no driver attached)
pcib10: irq 17 at device 28.0 on pci0
pci10: on pcib10
uhci0: port 0x1800-0x181f irq 17 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0x1820-0x183f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: port 0x1840-0x185f irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
ehci0: mem 0xd8600000-0xd86003ff irq 17 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
usb3: EHCI version 1.0
usb3: companion controllers, 2 ports each: usb0 usb1 usb2
usb3: on ehci0
usb3: USB revision 2.0
uhub3: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub3: 6 ports with 6 removable, self powered
pcib11: at device 30.0 on pci0
pci11: on pcib11
pci11: at device 1.0 (no driver attached)
isab0: at device 31.0 on pci0
isa0: on isab0
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1860-0x186f at device 31.1 on pci0
ata0: on atapci0
ata1: on atapci0
atapci1: port 0x18a0-0x18a7,0x1874-0x1877,0x1878-0x187f,0x1870-0x1873,0x1880-0x189f mem 0xd8600400-0xd86007ff irq 19 at device 31.2 on pci0
atapci1: AHCI Version 01.10 controller with 6 ports detected
ata2: on atapci1
ata3: on atapci1
ata4: on atapci1
ata5: on atapci1
ata6: on atapci1
ata7: on atapci1
pci0: at device 31.3 (no driver attached)
acpi_button0: on acpi0
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
ppc0: port 0x378-0x37f,0x778-0x77f irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/9 bytes threshold
ppbus0: on ppc0
plip0: on ppbus0
lpt0: on ppbus0
lpt0: Interrupt-driven port
ppi0: on ppbus0
pmtimer0 on isa0
orm0: at iomem 0xc0000-0xcafff,0xcb000-0xcd7ff on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
hptrr: no controller detected.
acd0: DVDROM at ata0-slave UDMA33
ad4: 76319MB at ata2-master SATA300
da0 at mpt0 bus 0 target 31 lun 0
da0: Fixed Direct Access SCSI-5 device
da0: 300.000MB/s transfers, Tagged Queueing Enabled
da0: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
da1 at mpt0 bus 0 target 32 lun 0
da1: Fixed Direct Access SCSI-5 device
da1: 300.000MB/s transfers, Tagged Queueing Enabled
da1: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
da2 at mpt0 bus 0 target 33 lun 0
da2: Fixed Direct Access SCSI-5 device
da2: 300.000MB/s transfers, Tagged Queueing Enabled
da2: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
da3 at mpt0 bus 0 target 34 lun 0
da3: Fixed Direct Access SCSI-5 device
da3: 300.000MB/s transfers, Tagged Queueing Enabled
da3: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #6 Launched!
SMP: AP CPU #4 Launched!
SMP: AP CPU #7 Launched!
SMP: AP CPU #5 Launched!
GEOM_MIRROR: Device gm0 created (id=2855202746).
GEOM_MIRROR: Device gm0: provider da0 detected.
GEOM_MIRROR: Device gm0: provider da1 detected.
GEOM_MIRROR: Device gm0: provider da1 activated.
GEOM_MIRROR: Device gm0: provider mirror/gm0 launched.
GEOM_MIRROR: Device gm0: rebuilding provider da0.
GEOM_MIRROR: Cannot add disk da3 to gm0 (error=17).
Trying to mount root from ufs:/dev/ad4s1a
em0: link state changed to UP
GEOM_MIRROR: Device gm0: rebuilding provider da0 finished.
GEOM_MIRROR: Device gm0: provider da0 activated.
GEOM_MIRROR: Device gm0: provider da0 destroyed.
GEOM_MIRROR: Device gm0: provider da0 detected.
GEOM_MIRROR: Device gm0: rebuilding provider da0.

State-Changed-From-To: feedback->feedback
State-Changed-By: gavin
State-Changed-When: Sun Mar 1 18:52:58 UTC 2009
State-Changed-Why:
To submitter: do you know if this is reproducible, or if it was a one-off? Also, is there any chance you can test the same actions on a 7.x system and see if the issue has already been fixed?

http://www.freebsd.org/cgi/query-pr.cgi?pr=128282

From: Marius Strobl
To: bug-followup@FreeBSD.org, dwayneh@ssimicro.com
Cc:
Subject: Re: kern/128282: [mpt] system failure on removing two drives
Date: Mon, 15 Aug 2011 01:39:15 +0200

Chances are that this was fixed by r224494, which was MFC'ed to stable/8 in r224820 and to stable/7 in r224821. Could you please re-test with one of these revisions?

Marius
>Unformatted: