From mharo@bitsurf.net Sat Dec 29 19:25:53 2007
Return-Path:
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 8E11616A417
    for ; Sat, 29 Dec 2007 19:25:53 +0000 (UTC)
    (envelope-from mharo@bitsurf.net)
Received: from coffee.bitsurf.net (coffee.bitsurf.net [66.92.2.80])
    by mx1.freebsd.org (Postfix) with ESMTP id 2792513C458
    for ; Sat, 29 Dec 2007 19:25:53 +0000 (UTC)
    (envelope-from mharo@bitsurf.net)
Received: from zfsserver.mtv.bitsurf.net (cowabunga.bitsurf.net [66.92.2.81])
    by coffee.bitsurf.net (8.13.8/8.13.8) with ESMTP id lBTJQKLO091842
    for ; Sat, 29 Dec 2007 11:26:20 -0800 (PST)
    (envelope-from mharo@bitsurf.net)
Received: from zfsserver.mtv.bitsurf.net (localhost [127.0.0.1])
    by zfsserver.mtv.bitsurf.net (8.14.2/8.14.2) with ESMTP id lBTJPpb0008096
    for ; Sat, 29 Dec 2007 11:25:51 -0800 (PST)
    (envelope-from mharo@zfsserver.mtv.bitsurf.net)
Received: (from mharo@localhost)
    by zfsserver.mtv.bitsurf.net (8.14.2/8.14.2/Submit) id lBTJPpEO008095;
    Sat, 29 Dec 2007 11:25:51 -0800 (PST)
    (envelope-from mharo)
Message-Id: <200712291925.lBTJPpEO008095@zfsserver.mtv.bitsurf.net>
Date: Sat, 29 Dec 2007 11:25:51 -0800 (PST)
From: Michael Haro
Reply-To: Michael Haro
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: Kernel panic with sata drive and dma problem
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number: 119140
>Category: kern
>Synopsis: [ata] [panic] Kernel panic with sata drive and dma problem
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sat Dec 29 19:30:01 UTC 2007
>Closed-Date:
>Last-Modified: Tue May 12 04:45:32 UTC 2009
>Originator: Michael Haro
>Release: FreeBSD 7.0-PRERELEASE i386
>Organization:
>Environment:
System: FreeBSD zfsserver.mtv.bitsurf.net 7.0-PRERELEASE FreeBSD 7.0-PRERELEASE #4: Sun Dec 23 16:46:49 PST 2007
root@zfsserver.mtv.bitsurf.net:/usr/obj/usr/src/sys/KERNEL i386

dmesg:
Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.0-PRERELEASE #4: Sun Dec 23 16:46:49 PST 2007
    root@zfsserver.mtv.bitsurf.net:/usr/obj/usr/src/sys/KERNEL
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) XP 2000+ (1659.61-MHz 686-class CPU)
  Origin = "AuthenticAMD" Id = 0x680 Stepping = 0
  Features=0x383f9ff
  AMD Features=0xc0400800
real memory = 1073676288 (1023 MB)
avail memory = 1036881920 (988 MB)
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
hptrr: HPT RocketRAID controller driver v1.1 (Dec 23 2007 16:46:09)
acpi0: on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
agp0: on hostb0
pcib1: at device 1.0 on pci0
pci1: on pcib1
isab0: at device 2.0 on pci0
isa0: on isab0
ohci0: mem 0xcfbdd000-0xcfbddfff irq 12 at device 2.2 on pci0
ohci0: [GIANT-LOCKED]
ohci0: [ITHREAD]
usb0: OHCI version 1.0, legacy support
usb0: on ohci0
usb0: USB revision 1.0
uhub0: on usb0
uhub0: 3 ports with 3 removable, self powered
ohci1: mem 0xcfbde000-0xcfbdefff irq 10 at device 2.3 on pci0
ohci1: [GIANT-LOCKED]
ohci1: [ITHREAD]
usb1: OHCI version 1.0, legacy support
usb1: on ohci1
usb1: USB revision 1.0
uhub1: on usb1
uhub1: 3 ports with 3 removable, self powered
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xff00-0xff0f at device 2.5 on pci0
ata0: on atapci0
ata0: [ITHREAD]
ata1: on atapci0
ata1: [ITHREAD]
pci0: at device 2.7 (no driver attached)
sis0: port 0xc800-0xc8ff mem 0xcfbdc000-0xcfbdcfff irq 10 at device 3.0 on pci0
miibus0: on sis0
rlphy0: PHY 1 on miibus0
rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
sis0: Ethernet address: 00:0a:e6:8a:3b:e3
sis0: [ITHREAD]
atapci1: port 0xdc00-0xdc3f,0xd800-0xd80f,0xd400-0xd47f mem 0xcfbdf000-0xcfbdffff,0xcfba0000-0xcfbbffff irq 5 at device 13.0 on pci0
atapci1: [ITHREAD]
atapci1: [ITHREAD]
ata2: on atapci1
ata2: [ITHREAD]
ata3: on atapci1
ata3: [ITHREAD]
ata4: on atapci1
ata4: [ITHREAD]
atapci2: port 0xc400-0xc47f,0xc000-0xc0ff mem 0xcfbdb000-0xcfbdbfff,0xcfb60000-0xcfb7ffff irq 12 at device 15.0 on pci0
atapci2: [ITHREAD]
atapci2: [ITHREAD]
ata5: on atapci2
ata5: [ITHREAD]
ata6: on atapci2
ata6: [ITHREAD]
ata7: on atapci2
ata7: [ITHREAD]
ata8: on atapci2
ata8: [ITHREAD]
vgapci0: mem 0xcfc00000-0xcfffffff,0xcfbf0000-0xcfbfffff,0xcf400000-0xcf7fffff irq 11 at device 17.0 on pci0
uhci0: port 0xb800-0xb81f irq 11 at device 19.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb2: on uhci0
usb2: USB revision 1.0
uhub2: on usb2
uhub2: 2 ports with 2 removable, self powered
uhci1: port 0xbc00-0xbc1f irq 11 at device 19.1 on pci0
uhci1: [GIANT-LOCKED]
uhci1: [ITHREAD]
usb3: on uhci1
usb3: USB revision 1.0
uhub3: on usb3
uhub3: 2 ports with 2 removable, self powered
ehci0: mem 0xcfbdaf00-0xcfbdafff irq 5 at device 19.2 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb4: EHCI version 0.95
usb4: companion controllers, 2 ports each: usb2 usb3
usb4: on ehci0
usb4: USB revision 2.0
uhub4: on usb4
uhub4: 4 ports with 4 removable, self powered
acpi_button0: on acpi0
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
fdc0: port 0x3f2-0x3f3,0x3f4-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FILTER]
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio0: [FILTER]
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
sio1: [FILTER]
pmtimer0 on isa0
orm0: at iomem 0xc0000-0xc7fff,0xcb800-0xd07ff,0xd0800-0xd87ff pnpid ORM0000 on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
ppc0: at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppbus0: on ppc0
plip0: on ppbus0
lpt0: on ppbus0
lpt0: Interrupt-driven port
ppi0: on ppbus0
ppc0: [GIANT-LOCKED]
ppc0: [ITHREAD]
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
WARNING: ZFS is considered to be an experimental feature in FreeBSD.
Timecounter "TSC" frequency 1659614133 Hz quality 800
Timecounters tick every 1.000 msec
hptrr: no controller detected.
ZFS filesystem version 6
ZFS storage pool version 6
ad0: 114473MB at ata0-master UDMA100
ad4: 715404MB at ata2-master SATA150
ad8: 239372MB at ata4-master UDMA133
ad10: 238475MB at ata5-master SATA150
ad12: 238475MB at ata6-master SATA150
ad14: 238475MB at ata7-master SATA150
ad16: 238475MB at ata8-master SATA150
Trying to mount root from zfs:tank

zfsserver# pciconf -vl
hostb0@pci0:0:0:0: class=0x060000 card=0x00000000 chip=0x07351039 rev=0x01 hdr=0x00
    vendor = 'Silicon Integrated Systems (SiS)'
    device = 'SiS 735 Host-to-PCI Bridge'
    class = bridge
    subclass = HOST-PCI
pcib1@pci0:0:1:0: class=0x060400 card=0x00000000 chip=0x00011039 rev=0x00 hdr=0x01
    vendor = 'Silicon Integrated Systems (SiS)'
    device = 'SiS730 Virtual PCI-to-PCI bridge (AGP)'
    class = bridge
    subclass = PCI-PCI
isab0@pci0:0:2:0: class=0x060100 card=0x00000000 chip=0x00081039 rev=0x00 hdr=0x00
    vendor = 'Silicon Integrated Systems (SiS)'
    device = 'SiS PCI to ISA Bridge (LPC Bridge)'
    class = bridge
    subclass = PCI-ISA
ohci0@pci0:0:2:2: class=0x0c0310 card=0x70011039 chip=0x70011039 rev=0x07 hdr=0x00
    vendor = 'Silicon Integrated Systems (SiS)'
    device = 'SiS5597/8 Universal Serial Bus Controller'
    class = serial bus
    subclass = USB
ohci1@pci0:0:2:3: class=0x0c0310 card=0x70011039 chip=0x70011039 rev=0x07 hdr=0x00
    vendor = 'Silicon Integrated Systems (SiS)'
    device = 'SiS5597/8 Universal Serial Bus Controller'
    class = serial bus
    subclass = USB
atapci0@pci0:0:2:5: class=0x010180 card=0x55131039 chip=0x55131039 rev=0xd0 hdr=0x00
    vendor = 'Silicon Integrated Systems (SiS)'
    device = 'SiS5513 EIDE Controller (A,B step)'
    class = mass storage
    subclass = ATA
none0@pci0:0:2:7: class=0x040100 card=0x030013f6 chip=0x70121039 rev=0xa0 hdr=0x00
    vendor = 'Silicon Integrated Systems (SiS)'
    device = 'SiS7012 PCI Audio Accelerator'
    class = multimedia
    subclass = audio
sis0@pci0:0:3:0: class=0x020000 card=0x09001039 chip=0x09001039 rev=0x90 hdr=0x00
    vendor = 'Silicon Integrated Systems (SiS)'
    device = 'SiS900 sis 900 and integrated lan'
    class = network
    subclass = ethernet
atapci1@pci0:0:13:0: class=0x018000 card=0x3375105a chip=0x3375105a rev=0x02 hdr=0x00
    vendor = 'Promise Technology Inc'
    device = 'PDC20375(??) FastTrak SATA150 TX2plus Controller'
    class = mass storage
atapci2@pci0:0:15:0: class=0x018000 card=0x3d18105a chip=0x3d18105a rev=0x02 hdr=0x00
    vendor = 'Promise Technology Inc'
    device = 'Promise SATAII150 518 (tm) IDE Controller'
    class = mass storage
vgapci0@pci0:0:17:0: class=0x030000 card=0x00000000 chip=0x96601023 rev=0xd3 hdr=0x00
    vendor = 'Trident Microsystems'
    device = 'TGUI9660XGi/968x/938x GUI Accelerator'
    class = display
    subclass = VGA
uhci0@pci0:0:19:0: class=0x0c0300 card=0x12340925 chip=0x30381106 rev=0x50 hdr=0x00
    vendor = 'VIA Technologies Inc'
    device = 'VT83C572, VT6202 VIA Rev 5 or later USB Universal Host Controller'
    class = serial bus
    subclass = USB
uhci1@pci0:0:19:1: class=0x0c0300 card=0x12340925 chip=0x30381106 rev=0x50 hdr=0x00
    vendor = 'VIA Technologies Inc'
    device = 'VT83C572, VT6202 VIA Rev 5 or later USB Universal Host Controller'
    class = serial bus
    subclass = USB
ehci0@pci0:0:19:2: class=0x0c0320 card=0x12340925 chip=0x31041106 rev=0x51 hdr=0x00
    vendor = 'VIA Technologies Inc'
    device = 'VT6202/12 USB 2.0 Enhanced Host Controller'
    class = serial bus
    subclass = USB

zfsserver# atacontrol list
ATA channel 0:
    Master: ad0 ATA/ATAPI revision 5
    Slave: no device present
ATA channel 1:
    Master: no device present
    Slave: no device present
ATA channel 2:
    Master: ad4 Serial ATA v1.0
    Slave: no device present
ATA channel 3:
    Master: no device present
    Slave: no device present
ATA channel 4:
    Master: ad8 ATA/ATAPI revision 7
    Slave: no device present
ATA channel 5:
    Master: ad10 Serial ATA v1.0
    Slave: no device present
ATA channel 6:
    Master: ad12 Serial ATA v1.0
    Slave: no device present
ATA channel 7:
    Master: ad14 Serial ATA v1.0
    Slave: no device present
ATA channel 8:
    Master: ad16 Serial ATA v1.0
    Slave: no device present

zfsserver# smartctl -l error /dev/ad14
Error 4 occurred at disk power-on lifetime: 14449 hours (602 days + 1 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 b2 07 c7 e0  Error: ICRC, ABRT at LBA = 0x00c707b2 = 13043634

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ---------------- --------------------
  25 00 40 73 f7 c6 ec 00  1d+23:59:15.200  READ DMA EXT
  c6 00 10 00 00 00 e0 00  1d+23:59:15.200  SET MULTIPLE MODE
  ef 02 00 00 00 00 e0 00  1d+23:59:15.200  SET FEATURES [Enable write cache]
  ef aa 00 00 00 00 e0 00  1d+23:59:15.200  SET FEATURES [Enable read look-ahead]
  ef 03 45 00 00 00 e0 00  1d+23:59:15.200  SET FEATURES [Set transfer mode]

Error 3 occurred at disk power-on lifetime: 14448 hours (602 days + 0 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 0c bd c2 ec  Error: ICRC, ABRT at LBA = 0x0cc2bd0c = 214088972

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ---------------- --------------------
  35 00 80 8d bc c2 e0 00  1d+23:31:28.300  WRITE DMA EXT
  35 00 80 0d bc c2 e0 00  1d+23:31:28.300  WRITE DMA EXT
  35 00 80 8d bb c2 e0 00  1d+23:31:28.300  WRITE DMA EXT
  35 00 80 0d bb c2 e0 00  1d+23:31:28.200  WRITE DMA EXT
  35 00 80 8d ba c2 e0 00  1d+23:31:28.200  WRITE DMA EXT

Error 4 is the one that resulted in a panic. Error 3 is the one that
resulted in the drive going away and requiring the reboot. Errors 1 and 2
are the same as error 3, and all happened yesterday.

Yesterday I moved the computer into a different case. Prior to that, a
different drive (same model) was occasionally having the same problem. This
leads me to believe that it's not a hard drive issue, but as they are all
the same model and were purchased at the same time, I can't say that for sure.

When this happened before, I tried moving the drive onto my other SATA
controller and had the same results. Both are made by Promise, so it's
possible that it wasn't a useful test for determining whether it is a
driver issue.
>Description:
ad14 had disappeared, as shown by the following in /var/log/messages:

Dec 29 01:57:21 zfsserver kernel: ad14: FAILURE - device detached
Dec 29 01:57:21 zfsserver kernel: subdisk14: detached
Dec 29 01:57:21 zfsserver kernel: ad14: detached
Dec 29 01:57:22 zfsserver root: ZFS: vdev failure, zpool=data type=vdev.open_failed

I tried doing an atacontrol reinit ata7 to rediscover the drive, but that
didn't find it, so I rebooted to bring it back. Then I ran a zpool scrub to
check that the data was all happy. A couple of minutes into it, the kernel
panicked.

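The SMART error-log entries quoted in the Environment section can be cross-checked by hand. The sketch below is not part of the original report: it assumes the usual ATA task-file layout (low nibble of DH holds LBA bits 24-27, then CH, CL, SN) and decodes only a subset of the error-register bits. The helper names decode_error and lba28 are hypothetical.

```python
# Subset of ATA error-register (ER) bits, named as smartctl prints them.
# Assumption: only the bits relevant to this report are listed here.
ERROR_BITS = [(0x80, "ICRC"), (0x40, "UNC"), (0x10, "IDNF"), (0x04, "ABRT")]


def decode_error(er):
    """Return the names of the known error bits set in the ER register."""
    return [name for mask, name in ERROR_BITS if er & mask]


def lba28(sn, cl, ch, dh):
    """Assemble a 28-bit LBA from the SN, CL, CH and DH register values."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


if __name__ == "__main__":
    # Error 4: ER=84, SN=b2 CL=07 CH=c7 DH=e0
    print(decode_error(0x84), lba28(0xB2, 0x07, 0xC7, 0xE0))
    # Error 3: SN=0c CL=bd CH=c2 DH=ec
    print(lba28(0x0C, 0xBD, 0xC2, 0xEC))
    # READ DMA EXT listed before error 4 (SN=73 CL=f7 CH=c6 DH=ec) compared
    # with the low 28 bits of LBA=482801523 from the kernel's READ_DMA48 log.
    print(lba28(0x73, 0xF7, 0xC6, 0xEC) == (482801523 & 0x0FFFFFFF))
```

Under these assumptions the two error entries decode to the LBAs smartctl printed (13043634 and 214088972), and the READ DMA EXT command recorded before error 4 matches the low 28 bits of the LBA=482801523 READ_DMA48 timeout in /var/log/messages. That is consistent with both logs describing the same request, but it is not proof, since only 28 bits are compared.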
ad14 is connected to "Promise SATAII150 518 (tm) IDE Controller"

Last few lines from /var/log/messages:

Dec 29 02:24:08 zfsserver kernel: ad14: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Dec 29 02:24:12 zfsserver kernel: ad14: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Dec 29 02:24:16 zfsserver kernel: ad14: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
Dec 29 02:24:20 zfsserver kernel: ad14: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
Dec 29 02:24:24 zfsserver kernel: ad14: WARNING - SET_MULTI taskqueue timeout - completing request directly
Dec 29 02:24:24 zfsserver kernel: ad14: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=482801523
Dec 29 02:24:24 zfsserver kernel: ad14: WARNING - READ_DMA48 UDMA ICRC error (retrying request) LBA=482801523
Dec 29 02:24:24 zfsserver root: ZFS: checksum mismatch, zpool=data path=/dev/ad14 offset=247190218240 size=32768
Dec 29 02:24:29 zfsserver kernel: ad14: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=482801651

# kgdb /boot/kernel/kernel.symbols /var/crash/vmcore.25
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:
ad14: FAILURE - device detached
subdisk14: detached
ad14: detached

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x2c
fault code = supervisor write, page not present
instruction pointer = 0x20:0xc0632e75
stack pointer = 0x28:0xef33bc5c
frame pointer = 0x28:0xef33bc70
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 3 (g_up)
trap number = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper(c0984eeb,ef33baf8,c063f33f,c09a366c,0,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c09a366c,0,c09649c3,ef33bb04,0,...) at kdb_backtrace+0x29
panic(c09649c3,c09a4913,c3f544d0,1,1,...) at panic+0x10f
trap_fatal(c0a65020,0,2,8,dd313180,...) at trap_fatal+0x333
trap_pfault(c0a64ac8,ef33bb90,c066d3dd,ef33bbb4,c,...) at trap_pfault+0x250
trap(ef33bc1c) at trap+0x3c6
calltrap() at calltrap+0x6
--- trap 0xc, eip = 0xc0632e75, esp = 0xef33bc5c, ebp = 0xef33bc70 ---
_mtx_lock_flags(1c,0,c0bf6e0d,1d8,c0bec2a0,...) at _mtx_lock_flags+0x15
vdev_geom_io_intr(c4e4f7bc,c0a17e04,0,0,0) at vdev_geom_io_intr+0x44
biodone(c4e4f7bc,c0a64a28,24c,c097d445,64,...) at biodone+0xad
g_io_schedule_up(c3f0dc60,4c,c097e119,5b,0,...) at g_io_schedule_up+0x7f
g_up_procbody(0,ef33bd38,0,ffffffff,ffffffff,...) at g_up_procbody+0x6c
fork_exit(c05eea20,0,ef33bd38) at fork_exit+0x97
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xef33bd70, ebp = 0 ---
Uptime: 8m0s
Physical memory: 1011 MB
Dumping 258 MB: 243 227 211 195 179 163 147 131 115 99 83 67 51 35 19 3

I'm not sure what else to report.
>How-To-Repeat:
I can't reproduce it. :-(
>Fix:
unknown
>Release-Note:
>Audit-Trail:

Responsible-Changed-From-To: freebsd-bugs->sos
Responsible-Changed-By: remko
Responsible-Changed-When: Mon Dec 31 07:43:54 UTC 2007
Responsible-Changed-Why:
Hi Soren, this might be something for you.

http://www.freebsd.org/cgi/query-pr.cgi?pr=119140

Responsible-Changed-From-To: sos->freebsd-bugs
Responsible-Changed-By: linimon
Responsible-Changed-When: Tue May 12 04:45:24 UTC 2009
Responsible-Changed-Why:
sos@ is not actively working on ATA-related PRs.

http://www.freebsd.org/cgi/query-pr.cgi?pr=119140
>Unformatted: