From dillon@apollo.backplane.com Mon Feb 15 11:43:59 1999
Received: from apollo.backplane.com (apollo.backplane.com [209.157.86.2])
          by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id LAA20791
          for ; Mon, 15 Feb 1999 11:43:58 -0800 (PST)
          (envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
          by apollo.backplane.com (8.9.3/8.9.1) id LAA18818;
          Mon, 15 Feb 1999 11:43:57 -0800 (PST)
          (envelope-from dillon)
Message-Id: <199902151943.LAA18818@apollo.backplane.com>
Date: Mon, 15 Feb 1999 11:43:57 -0800 (PST)
From: Matthew Dillon
Reply-To: dillon@apollo.backplane.com
To: FreeBSD-gnats-submit@freebsd.org
Subject: inode vs exec_map interlock
X-Send-Pr-Version: 3.2

>Number:         10107
>Category:       kern
>Synopsis:       interlock situation with exec_map and a program binary inode
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    remko
>State:          closed
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Feb 15 11:50:01 PST 1999
>Closed-Date:    Thu Nov 16 08:31:14 GMT 2006
>Last-Modified:  Thu Nov 16 08:31:14 GMT 2006
>Originator:     Matthew Dillon
>Release:        FreeBSD 4.0-CURRENT i386
>Organization:
none
>Environment:

    Heavily loaded test machine artificially limited to 16MB of main
    memory, NFS swap, running a buildworld -j10.

>Description:

    I found an interesting interlock situation between what I believe
    to be a program binary's inode and the exec_map.

    The machine locked up trying to exec new programs. This was running
    a make -j10 buildworld on a machine with 16MB of RAM configured,
    while testing my new VM system. I don't think the lockup is due to
    my VM system, though. It took 7 hours of extremely heavy paging
    before it locked up.

    When I broke the machine out into DDB and did a ps, all of the cc's
    were stuck in 'inode' wait, while a single ld program was stuck in
    'thrd_sleep'. I tracked 'thrd_sleep' down to a vm_map lock and the
    map down to the exec_map. I tracked down the inode lock to the 'cc'
    program binary.
    The inode had one shared lock and 7 waiters. The exec_map appears
    to own one shared lock with 6 waiters (but most of the waiters are
    due to me trying to run other programs before breaking into DDB).

    I am guessing that there is an interlock situation with exec_map
    and a program inode where one process locks exec_map followed by
    the program inode, and another locks the program inode followed by
    exec_map. But I'm not familiar with that section of the code so I
    would appreciate any help.

>How-To-Repeat:

    The problem was found by running a make -j10 buildworld on a
    machine artificially limited to 16MB of main memory, with NFS swap.
    The problem occurred approximately 7 hours into the buildworld, so
    it is presumably difficult to recreate and represents a small
    window somewhere.

>Fix:

    Unknown as yet.

>Release-Note:
>Audit-Trail:

Responsible-Changed-From-To: freebsd-bugs->dillon
Responsible-Changed-By: johan
Responsible-Changed-When: Thu Aug 10 23:43:57 PDT 2000
Responsible-Changed-Why:
Let Matt handle his own PRs.
http://www.freebsd.org/cgi/query-pr.cgi?pr=10107

Responsible-Changed-From-To: dillon->freebsd-bugs
Responsible-Changed-By: keramida
Responsible-Changed-When: Sat Feb 22 18:13:46 PST 2003
Responsible-Changed-Why:
Back to the free pool.
http://www.freebsd.org/cgi/query-pr.cgi?pr=10107

State-Changed-From-To: open->feedback
State-Changed-By: remko
State-Changed-When: Sun Nov 12 08:34:14 UTC 2006
State-Changed-Why:
Hello Matthew, Can you tell me whether this got solved during the
last 'time'.
http://www.freebsd.org/cgi/query-pr.cgi?pr=10107

Responsible-Changed-From-To: freebsd-bugs->remko
Responsible-Changed-By: remko
Responsible-Changed-When: Sun Nov 12 08:35:33 UTC 2006
Responsible-Changed-Why:
I'll take it.
http://www.freebsd.org/cgi/query-pr.cgi?pr=10107

State-Changed-From-To: feedback->closed
State-Changed-By: remko
State-Changed-When: Thu Nov 16 08:31:10 UTC 2006
State-Changed-Why:
Matthew reports that this problem is likely to be fixed already after
6 years.
http://www.freebsd.org/cgi/query-pr.cgi?pr=10107
>Unformatted:
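
    Illustration of the suspected lock-order inversion. The report only
    guesses at the cause: one process takes exec_map and then the
    program inode, while another takes the inode and then exec_map.
    The sketch below is a user-space analogy in Python, not FreeBSD
    kernel code. The names exec_map_lock, inode_lock, LOCK_ORDER and
    acquire_in_order are hypothetical, and the choice of "exec_map
    before inode" as the global order is an assumption made only for
    illustration. It shows the usual remedy for this class of bug:
    every path acquires the two locks in one fixed order, whatever
    order the caller names them in.

```python
import threading

# Hypothetical stand-ins for the two locks named in the report.
# These are ordinary user-space mutexes, not kernel primitives.
exec_map_lock = threading.Lock()
inode_lock = threading.Lock()

# Assumed global order for illustration: exec_map first, then the inode.
LOCK_ORDER = {id(exec_map_lock): 0, id(inode_lock): 1}


def acquire_in_order(*locks):
    """Acquire locks sorted by rank, however the caller listed them."""
    ordered = sorted(locks, key=lambda lock: LOCK_ORDER[id(lock)])
    for lock in ordered:
        lock.acquire()
    return ordered


def release_all(ordered):
    """Release in reverse order of acquisition."""
    for lock in reversed(ordered):
        lock.release()


completed = 0


def worker(first, second, rounds):
    """Repeatedly take both locks; 'first'/'second' is the caller's view."""
    global completed
    for _ in range(rounds):
        held = acquire_in_order(first, second)
        try:
            completed += 1  # safe: both locks are held here
        finally:
            release_all(held)


def run_demo(rounds=2000):
    """Run two threads that name the locks in opposite orders."""
    global completed
    completed = 0
    t1 = threading.Thread(target=worker,
                          args=(exec_map_lock, inode_lock, rounds))
    t2 = threading.Thread(target=worker,
                          args=(inode_lock, exec_map_lock, rounds))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return completed


if __name__ == "__main__":
    print(run_demo())
```

    If worker() instead called first.acquire() and then
    second.acquire() directly, the two threads could each hold one lock
    while waiting for the other, which is the pattern guessed at in the
    description. Whether that is what actually happened on the test
    machine was never confirmed in this PR.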