From milt@moth.vicor-nb.com Mon Apr 19 15:09:25 1999 Return-Path: Received: from moth.vicor-nb.com (moth.vicor-nb.com [208.206.78.46]) by hub.freebsd.org (Postfix) with ESMTP id 4D90515577 for ; Mon, 19 Apr 1999 15:09:11 -0700 (PDT) (envelope-from milt@moth.vicor-nb.com) Received: (from milt@localhost) by moth.vicor-nb.com (8.8.5/8.8.5) id PAA05235; Mon, 19 Apr 1999 15:08:24 -0700 (PDT) Message-Id: <199904192208.PAA05235@moth.vicor-nb.com> Date: Mon, 19 Apr 1999 15:08:24 -0700 (PDT) From: milt To: FreeBSD-gnats-submit@freebsd.org Cc: daver@moth.vicor-nb.com, jpl@moth.vicor-nb.com Subject: Invalid files on disk after fsync >Number: 11226 >Category: kern >Synopsis: Invalid files on disk after fsync >Confidential: no >Severity: critical >Priority: high >Responsible: alfred >State: closed >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Mon Apr 19 15:10:00 PDT 1999 >Closed-Date: Mon Jun 26 11:34:51 PDT 2000 >Last-Modified: Mon Jun 26 11:36:19 PDT 2000 >Originator: milt@vicor-nb.com >Release: FreeBSD 2.2.6-STABLE >Organization: vicor >Environment: We are running a very busy server which includes scsi drives and lots of nfs mounts. df and dmesg are: 52. ? df Filesystem 1K-blocks Used Avail Capacity Mounted on /dev/sd0s1a 59471 21463 33251 39% / /dev/sd0s1f 8202004 2540061 5005783 34% /usr /dev/sd0s1e 98479 4681 85920 5% /var /dev/sd1c 8621381 1 7931670 0% /disk2 procfs 4 4 0 100% /proc opshome:/usr/opshome 8241012 4715979 2865753 62% /usr/opshome oos0a:/raid1 60821938 10291873 45664310 18% /raid1 oos0a:/raid2 60821938 11946475 44009708 21% /raid2 oos0a:/raid3 60821938 73528 55882655 0% /raid3 oos0a:/usr/env1/data/lb 1843066 847170 848451 50% /usr/env1/data/lb oos0a:/usr/env5/data/lb 1843066 847170 848451 50% /usr/env5/data/lb oos0a:/usr/env3/data/lb 1843066 847170 848451 50% /usr/env3/data/lb oos0a:/usr/env4/data/lb 1843066 847170 848451 50% /usr/env4/data/lb Copyright (c) 1992-1998 FreeBSD Inc. Copyright (c) 1982, 1986, 1989, 1991, 1993 The Regents of the University of California. All rights reserved. FreeBSD 2.2.6-STABLE #0: Tue Apr 28 15:08:30 PDT 1998 root@ipm0.lbxrich.vicor-nb.com:/disk2/src.patched/sys/compile/WSS CPU: Pentium (199.43-MHz 586-class CPU) Origin = "GenuineIntel" Id = 0x544 Stepping=4 Features=0x8001bf real memory = 134217728 (131072K bytes) avail memory = 129515520 (126480K bytes) Probing for devices on PCI bus 0: chip0 rev 1 on pci0:0:0 chip1 rev 1 on pci0:7:0 chip2 rev 1 on pci0:7:1 chip3 rev 1 int d irq 11 on pci0:7:2 chip4 rev 1 on pci0:7:3 ahc0 rev 0 int a irq 10 on pci0:9:0 ahc0: aic7870 Single Channel, SCSI Id=7, 16 SCBs ahc0 waiting for scsi devices to settle ahc0: target 0 Tagged Queuing Device (ahc0:0:0): "SEAGATE ST19171N 0023" type 0 fixed SCSI 2 sd0(ahc0:0:0): Direct-Access 8683MB (17783112 512 byte sectors) ahc0: target 1 Tagged Queuing Device (ahc0:1:0): "SEAGATE ST19171N 0023" type 0 fixed SCSI 2 sd1(ahc0:1:0): Direct-Access 8683MB (17783112 512 byte sectors) (ahc0:3:0): "HP C1533A A708" type 1 removable SCSI 2 st0(ahc0:3:0): Sequential-Access density code 0x24, variable blocks, write-enabled fxp0 rev 1 int a irq 9 on pci0:10:0 fxp0: Ethernet address 00:a0:c9:27:93:fd vga0 rev 0 int a irq 12 on pci0:11:0 Probing for devices on the ISA bus: sc0 at 0x60-0x6f irq 1 on motherboard sc0: VGA color <16 virtual consoles, flags=0x0> sio0 at 0x3f8-0x3ff irq 4 on isa sio0: type 16550A sio1 at 0x2f8-0x2ff irq 3 on isa sio1: type 16550A psm0 not found at 0x60 fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa fdc0: FIFO enabled, 8 bytes threshold fd0: 1.44MB 3.5in npx0 on motherboard npx0: INT 16 interface Intel Pentium F00F detected, installing workaround >Description: What we notice is invalid file contents. The sequence is: 1. Our Qft utility writes a file by using a open,write/write/...,fsync,close sequence: int outFd = open (inPath, O_WRONLY | O_CREAT | O_TRUNC, 0664); if (outFd <= 0) ReportError ("Unable to open " << inPath << " for writing") while (... ... receive 8192 bytes from tcp if (write (fd, fileBuffer, sizeof (fileBuffer)) != sizeof (fileBuffer)) ReportError ("Error Writing File " << inPath); ... the last write may be less than 8192 bytes. if (fsync (outFd)) ReportError ("fsync failure on file " << inPath) close (outFd); 2. Some program (maybe a backup utility, maybe Qft) reads the file and receives valid contents. 3. Some program reads the file and receives invalid contents. *. At this point, the data on disk is permanently invalid. *. The problem consists of 8192 wrong bytes which start on a page boundary in the middle of the file. The first time this came up, the bad data consisted of two short files which had existed some weeks prior to the problem. The second time this came up, the bad data looked like part of an executeable and could have been old or new. *. The file date/time modified (from ls -l) remains the time of the initial file write from step 1. THAT IS: we write a file, we read it and find it good, we read it and find it invalid even though the date hasn't changed. Probably you will think I am making this up, confused, or incompetent. Sounds pretty phenomenal. Tell me how one could accidentally change the file contents without changing the access date? Also, the valid read is happening within 5 to 20 minutes of the initial file write and may well be seeing data which is in cache but never makes it to disk. (The invalid read is happening hours or days later.) >How-To-Repeat: I wish I could. I spent all week end trying. The cause is going to have to include some unusual interaction or other or I would have duplicated it by now. We have seen two incidents one month apart on two different hosts at the same site (our busiest site). >Fix: I wish I knew. Fortunately it doesn't come up often. >Release-Note: >Audit-Trail: From: "Ted Mittelstaedt" To: , Cc: Subject: Re: kern/11226: Invalid files on disk after fsync Date: Sun, 12 Sep 1999 00:20:39 -0700 Hi All, I was struck by the similarities between this and what happens with our news server. Once every 2 or 3 months I come in to find the server rebooted itself during the night - and I'm sure it's due to a SCSI error because the news spool is always scrambled. Usually, multiple invocations of fsck will eventually clear the garbage out but I've had to regenerate the spools once because of it. The only one time that I have ever seen anything in the message log regarding this is a rather cryptic log entry: Aug 11 06:12:59 herald /kernel: biodone: buffer already done This was made seconds before the system rebooted itself during one of those times. The server has done up to 2 million articles in a day and I'm sure that it's during the peak times that this has happened. I suspect there is some sequencing bug or other within the Adaptec SCSI driver - of course I'm running with all go-fast options turned on including async mounts on the spools. Output of dmesg follows: # dmesg Copyright (c) 1992-1998 FreeBSD Inc. Copyright (c) 1982, 1986, 1989, 1991, 1993 The Regents of the University of California. All rights reserved. FreeBSD 2.2.8-RELEASE #0: Mon Jan 25 00:50:14 PST 1999 tedm@herald.ipinc.net:/usr/src/sys/compile/HERALD CPU: Pentium/P54C (166.19-MHz 586-class CPU) Origin = "GenuineIntel" Id = 0x52c Stepping=12 Features=0x1bf real memory = 134217728 (131072K bytes) avail memory = 129216512 (126188K bytes) Probing for devices on PCI bus 0: chip0 rev 3 on pci0:0:0 ahc0 rev 1 int a irq 9 on pci0:2:0 ahc0: aic7860 Single Channel, SCSI Id=7, 3/8 SCBs ahc0 waiting for scsi devices to settle ahc0: target 0 Tagged Queuing Device (ahc0:0:0): "SEAGATE ST51080N 0958" type 0 fixed SCSI 2 sd0(ahc0:0:0): Direct-Access 1030MB (2109840 512 byte sectors) ahc1 rev 1 int a irq 9 on pci0:3:0 ahc1: aic7880 Wide Channel, SCSI Id=7, 16/255 SCBs ahc1 waiting for scsi devices to settle Sending SDTR!! ahc1: target 4 Tagged Queuing Device (ahc1:4:0): "IBM OEM DCHS09F 2222" type 0 fixed SCSI 2 sd1(ahc1:4:0): Direct-Access 8689MB (17796077 512 byte sectors) Sending SDTR!! ahc1: target 5 Tagged Queuing Device (ahc1:5:0): "IBM OEM DCHS09F 2222" type 0 fixed SCSI 2 sd2(ahc1:5:0): Direct-Access 8689MB (17796077 512 byte sectors) Sending SDTR!! ahc1: target 6 Tagged Queuing Device (ahc1:6:0): "IBM OEM DCHS09F 2222" type 0 fixed SCSI 2 sd3(ahc1:6:0): Direct-Access 8689MB (17796077 512 byte sectors) ahc2 rev 1 int a irq 10 on pci0:4:0 ahc2: aic7880 Wide Channel, SCSI Id=7, 16/255 SCBs ahc2 waiting for scsi devices to settle ahc2: target 0 Tagged Queuing Device (ahc2:0:0): "SEAGATE ST39102LW 0005" type 0 fixed SCSI 2 sd4(ahc2:0:0): Direct-Access 8683MB (17783240 512 byte sectors) ahc2: target 1 Tagged Queuing Device (ahc2:1:0): "SEAGATE ST39102LW 0005" type 0 fixed SCSI 2 sd5(ahc2:1:0): Direct-Access 8683MB (17783240 512 byte sectors) ahc2: target 2 Tagged Queuing Device (ahc2:2:0): "SEAGATE ST39102LW 0005" type 0 fixed SCSI 2 sd6(ahc2:2:0): Direct-Access 8683MB (17783240 512 byte sectors) chip1 rev 1 on pci0:7:0 chip2 rev 0 on pci0:7:1 vga0 rev 0 on pci0:15:0 Probing for devices on the ISA bus: sc0 at 0x60-0x6f irq 1 on motherboard sc0: VGA color <16 virtual consoles, flags=0x0> ed0 at 0x300-0x31f irq 5 maddr 0xdc000 msize 16384 on isa ed0: address 00:00:c0:1e:eb:23, type WD8013W (16 bit) sio0 at 0x3f8-0x3ff irq 4 on isa sio0: type 16550A lpt0 at 0x378-0x37f irq 7 on isa lpt0: Interrupt-driven port lp0: TCP/IP capable interface psm0 at 0x60-0x64 irq 12 on motherboard psm0: model Generic PS/2 mouse, device ID 0 fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa fdc0: FIFO enabled, 8 bytes threshold fd0: 1.44MB 3.5in wdc0 at 0x1f0-0x1f7 irq 14 on isa wdc0: unit 0 (atapi): <685A/8.4D>, removable, dma, iordy wcd0: 1171KB/sec, 120KB cache, audio play, 255 volume levels, ejectable tray wcd0: no disc inside, unlocked npx0 flags 0x1 on motherboard npx0: INT 16 interface Intel Pentium F00F detected, installing workaround ccd0-1: Concatenated disk drivers Ted From: milt@vicor-nb.com (milt) To: freebsd-gnats-submit@freebsd.org, milt@moth.vicor-nb.com, tedm@toybox.placo.com Cc: Subject: Re: kern/11226: Invalid files on disk after fsync Date: Sun, 12 Sep 1999 14:10:10 -0700 (PDT) HIYA > I was struck by the similarities between this and what happens with our > news server. > > Once every 2 or 3 months I come in to find the server rebooted itself > during the night - and I'm sure it's due to a SCSI error because the news > spool is always scrambled. Usually, multiple invocations of fsck will > eventually clear the garbage out but I've had to regenerate the spools once > because of it. > > The only one time that I have ever seen anything in the message log > regarding this is a rather cryptic log entry: > > Aug 11 06:12:59 herald /kernel: biodone: buffer already done > > This was made seconds before the system rebooted itself during one of those > times. > > The server has done up to 2 million articles in a day and I'm sure that it's > during the peak times that this has happened. > > I suspect there is some sequencing bug or other within the Adaptec SCSI > driver - of course I'm running with all go-fast options turned on including > async mounts on the spools. Output of dmesg follows: This problem has been one of the most frustrating I have encountered. I think, Ted, that you are the only one in the world who ever really believed my description. What I don't see in the current problem report is that we eventually realized that the problem was always accompanied by an invalid disk address in an inode. We found that the disk fragment listed for file address 0xA000 for one or more files had one or both of the bits in 0x18000 cleared. That still got us nowhere. It did explain why fsck was going nuts and destroying all sorts of things when this happened. Two things make this frustrating: 1. I tried for weeks to make it happen in a test environment. Got a few failures, but never in any controlled manner so I was never able to reproduce the failure reliably. 2. We managed to get out of the soup without understanding or fixing the problem. (I hate when that happens!) a. We now mount all file systems with noatime. Since most of our accesses are reads, noatime cuts the inode writes down quite a bit and this change reduced the problem from about 6 a week to about 1 every two weeks. (I don't have very good numbers on the frequency, the problems were spread out over 20 different hosts in 3 cities.) (It's rw,noatime in fstab - see man mount) It's one of the speedup things so probably you already have it. b. The problem has dissappeared. We haven't had trouble in the past coupla months. I can't prove why, but I think its because we are using NFS less these days. (Some perl scripts running on the offensive hosts are now using non-NFS methods to get data from other hosts - NFS traffic is reduced but not yet eliminated.) You sure have my sympathy. I bet it ain't fun watching files dissappear as you run fsck multiple times on your 50+ gig system! State-Changed-From-To: open->feedback State-Changed-By: alfred State-Changed-When: Tue Jun 6 03:05:16 PDT 2000 State-Changed-Why: Very old PR, waiting for feedback if 4.0 addresses the issue. Responsible-Changed-From-To: freebsd-bugs->alfred Responsible-Changed-By: alfred Responsible-Changed-When: Tue Jun 6 03:05:16 PDT 2000 Responsible-Changed-Why: Cleaning PRs, would like feedback. http://www.freebsd.org/cgi/query-pr.cgi?pr=11226 State-Changed-From-To: feedback->closed State-Changed-By: alfred State-Changed-When: Mon Jun 26 11:34:51 PDT 2000 State-Changed-Why: no feedback and no recent problems reported that are the same. http://www.freebsd.org/cgi/query-pr.cgi?pr=11226 >Unformatted: Reply-To: milt@vicor-nb.com X-send-pr-version: 3.2