From markk@knigma.org Sun Sep 8 14:35:58 2002
Return-Path:
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 9D35737B400
    for ; Sun, 8 Sep 2002 14:35:58 -0700 (PDT)
Received: from shrewd.knigma.org (shrewd.demon.co.uk [212.229.151.45])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 8F6B343E72
    for ; Sun, 8 Sep 2002 14:35:57 -0700 (PDT)
    (envelope-from markk@knigma.org)
Received: from shrewd.lan.knigma.org (localhost [127.0.0.1])
    by shrewd.knigma.org (8.12.6/8.12.6) with ESMTP id g88LZtKV001655
    for ; Sun, 8 Sep 2002 22:35:56 +0100 (BST)
    (envelope-from mkn@shrewd.lan.knigma.org)
Received: (from mkn@localhost)
    by shrewd.lan.knigma.org (8.12.6/8.12.6/Submit) id g88LZtaF001654;
    Sun, 8 Sep 2002 22:35:55 +0100 (BST)
    (envelope-from mkn)
Message-Id: <200209082135.g88LZtaF001654@shrewd.lan.knigma.org>
Date: Sun, 8 Sep 2002 22:35:55 +0100 (BST)
From: Mark Knight
Reply-To: Mark Knight
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: ATA Tagged Queuing wedges -STABLE
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number: 42563
>Category: kern
>Synopsis: ATA Tagged Queuing wedges -STABLE
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: sos
>State: closed
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sun Sep 08 14:40:04 PDT 2002
>Closed-Date: Mon Aug 11 12:45:48 PDT 2003
>Last-Modified: Mon Aug 11 12:45:48 PDT 2003
>Originator: Mark Knight
>Release: FreeBSD 4.7-PRERELEASE i386
>Organization:
>Environment:
System: FreeBSD shrewd.lan.knigma.org 4.7-PRERELEASE FreeBSD 4.7-PRERELEASE #0: Sun Sep 8 13:07:18 BST 2002 root@shrewd.lan.knigma.org:/slave/usr/obj/usr/src/sys/SHREWD i386
>Description:
ad0: READ command timeout tag=0 serv=0 - resetting
ad0: invalidating queued requests
ata0: resetting devices ..
ad0: invalidating queued requests
ad0: DMA limited to UDMA33, non-ATA66 cable or device
ad1: invalidating queued requests
done
ad0: timeout waiting for READY
ad0: invalidating queued requests
- resetting
ata0: resetting devices .. ad0: invalidating queued requests
ad0: DMA limited to UDMA33, non-ATA66 cable or device
ad1: invalidating queued requests
ad0: no request for tag=0

... until the system wedges solid

Further information: http://www.knigma.org/freebsd/tagdeath/info.txt
>How-To-Repeat:
Occurred twice. I was using dump from ad0 -> a file on ad1. At the same time, I was browsing some large .jpg's on ad0 using xzgv. On both occasions the dump had been running for several hours before the failure. The simultaneous xzgv appeared to be the trigger.

Before setting hw.ata.tags="1" today, to test 4.7-PRERELEASE (first time I've ever tried tagged queuing), the system was very stable.
>Fix:
Don't set hw.ata.tags="1".
>Release-Note:
>Audit-Trail:

Mail from Sean Chittenden

Not to "me too" this bug, but in the absence of voting for bugs, let me chime in with what I've learned about ata tags being broken. While performing an egrep -r'ing over NFS or even a local buildworld while Sup'ing ports on an IBM IDE drive (UDMA100, though I don't think this matters at all), I get a ton of warnings from the kernel about ata tag queuing:

Apr 23 20:16:27 elaine /kernel: ad0: READ command timeout tag=0 serv=1 - resetting
Apr 23 20:16:58 elaine /kernel: ad0: invalidating queued requests
Apr 23 20:16:58 elaine /kernel: ata0: resetting devices .. ad0: invalidating queued requests
Apr 23 20:16:58 elaine /kernel: done
Apr 23 20:16:58 elaine /kernel: ad0: WRITE command timeout tag=1 serv=0 - resetting
Apr 23 20:16:58 elaine /kernel: ad0: invalidating queued requests
Apr 23 20:16:58 elaine /kernel: ata0: resetting devices ..
ad0: invalidating queued requests
Apr 23 20:16:58 elaine /kernel: done
Apr 23 20:16:59 elaine /kernel: ad0: READ command timeout tag=0 serv=0 - resetting
Apr 23 20:16:59 elaine /kernel: ad0: invalidating queued requests
Apr 23 20:16:59 elaine /kernel: ata0: resetting devices .. ad0: invalidating queued requests
Apr 23 20:16:59 elaine /kernel: done
Apr 23 20:16:59 elaine /kernel: ad0: timeout waiting for READY
Apr 23 20:16:59 elaine /kernel: ad0: invalidating queued requests
Apr 23 20:16:59 elaine /kernel: ad0: timeout sending command=00 s=d0 e=04
Apr 23 20:16:59 elaine /kernel: ad0: flush queue failed
Apr 23 20:16:59 elaine /kernel: - resetting

Lovely. This is a headless server so I haven't been able to catch a panic, but it's clear that tag queuing is broken. At the very least tag queuing needs to be documented as unreliable at the moment and ata(4) and tuning(7) need to be adjusted accordingly. I plan on updating docs for -STABLE in 72hrs unless I hear otherwise.

Responsible-Changed-From-To: freebsd-bugs->sos
Responsible-Changed-By: kris
Responsible-Changed-When: Mon Jul 14 02:28:32 PDT 2003
Responsible-Changed-Why:
Assign to ATA maintainer

http://www.freebsd.org/cgi/query-pr.cgi?pr=42563

State-Changed-From-To: open->closed
State-Changed-By: sos
State-Changed-When: Mon Aug 11 12:43:37 PDT 2003
State-Changed-Why:
Try to upgrade to 4.8. If that doesn't help, disable tags. I've decided not to support tags anymore due to the endless problems with it, and only the former IBM did make disks with it.

http://www.freebsd.org/cgi/query-pr.cgi?pr=42563
>Unformatted:
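
For anyone trying to judge how often a machine hits these resets, a rough sketch of a log scan follows. It is not part of the original report. The regular expression is modeled only on the timeout lines quoted in this PR; the names TIMEOUT_RE, count_timeouts and SAMPLE are hypothetical, and other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Pattern modeled on the timeout lines quoted in this report, e.g.
#   "ad0: READ command timeout tag=0 serv=1 - resetting"
# Assumption: other releases may format the message differently.
TIMEOUT_RE = re.compile(
    r"\b(?P<dev>ad\d+): (?P<op>READ|WRITE) command timeout "
    r"tag=(?P<tag>\d+) serv=(?P<serv>\d+)"
)


def count_timeouts(lines):
    """Count tagged-queuing command timeouts per (device, operation)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group("dev"), match.group("op"))] += 1
    return counts


# Sample lines copied from the audit trail above.
SAMPLE = [
    "Apr 23 20:16:27 elaine /kernel: ad0: READ command timeout tag=0 serv=1 - resetting",
    "Apr 23 20:16:58 elaine /kernel: ad0: invalidating queued requests",
    "Apr 23 20:16:58 elaine /kernel: ad0: WRITE command timeout tag=1 serv=0 - resetting",
    "Apr 23 20:16:59 elaine /kernel: ad0: READ command timeout tag=0 serv=0 - resetting",
    "Apr 23 20:16:59 elaine /kernel: ad0: flush queue failed",
]

if __name__ == "__main__":
    for (dev, op), n in sorted(count_timeouts(SAMPLE).items()):
        print(f"{dev} {op} timeouts: {n}")
```

To scan a real log, pass an open file object instead of SAMPLE, for example count_timeouts(open("/var/log/messages")); the path is an assumption about where kernel messages are logged on the machine in question.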