From nobody@FreeBSD.org Mon Jan 2 08:15:01 2012
Return-Path: 
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 1BC861065672
	for ; Mon, 2 Jan 2012 08:15:01 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id 002FA8FC12
	for ; Mon, 2 Jan 2012 08:15:01 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.4/8.14.4) with ESMTP id q028F0lX062586
	for ; Mon, 2 Jan 2012 08:15:00 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.4/8.14.4/Submit) id q028F07t062585;
	Mon, 2 Jan 2012 08:15:00 GMT (envelope-from nobody)
Message-Id: <201201020815.q028F07t062585@red.freebsd.org>
Date: Mon, 2 Jan 2012 08:15:00 GMT
From: Boris Lytochkin
To: freebsd-gnats-submit@FreeBSD.org
Subject: LOR between zfs&syncer + vnlru leading to ZFS deadlock
X-Send-Pr-Version: www-3.1
X-GNATS-Notify: 

>Number:         163770
>Category:       kern
>Synopsis:       [zfs] [hang] LOR between zfs&syncer + vnlru leading to ZFS deadlock
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    freebsd-fs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jan 02 08:20:08 UTC 2012
>Closed-Date:
>Last-Modified:  Mon Jan 02 13:39:25 UTC 2012
>Originator:     Boris Lytochkin
>Release:        RELENG_8
>Organization:
Yandex
>Environment:
FreeBSD skylla.yandex.net 8.2-STABLE FreeBSD 8.2-STABLE #7: Tue Dec 27 19:54:33 MSK 2011 lytboris@skylla.yandex.net:/usr/obj/usr/src/sys/CACTI amd64
>Description:
Deadlocks are seen periodically; most of them can be triggered with a low kern.maxvnodes while /etc/periodic/security/100.chksetuid runs (it performs a big find over the filesystem).
The LOR itself, as seen a couple of minutes after the server boots into multiuser:

--- syscall (5, FreeBSD ELF64, open), rip = 0x800f8666c, rsp = 0x7fffffffe8d8, rbp = 0x1b0 ---
lock order reversal:
 1st 0xffffff00508c1098 syncer (syncer) @ /usr/src/sys/kern/vfs_subr.c:1737
 2nd 0xffffff108f8abba8 zfs (zfs) @ /usr/src/sys/kern/vfs_subr.c:2137
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
_witness_debugger() at _witness_debugger+0x2c
witness_checkorder() at witness_checkorder+0x651
__lockmgr_args() at __lockmgr_args+0xb98
vop_stdlock() at vop_stdlock+0x39
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x52
_vn_lock() at _vn_lock+0x47
vget() at vget+0x56
vfs_msync() at vfs_msync+0xa5
sync_fsync() at sync_fsync+0x12a
VOP_FSYNC_APV() at VOP_FSYNC_APV+0x4a
sync_vnode() at sync_vnode+0x157
sched_sync() at sched_sync+0x1b1
fork_exit() at fork_exit+0x11d
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff9b0b285d00, rbp = 0 ---

When the deadlock appears, the syncer gets stuck in a zilog sync:

procstat: 20 100179 vnlru - mi_switch sleepq_timedwait _sleep zfs_zget zfs_get_data zil_commit zfs_freebsd_write VOP_WRITE_APV vnode_pager_generic_putpages VOP_PUTPAGES_APV vnode_pager_putpages vm_pageout_flush vm_object_page_collect_flush vm_object_page_clean vm_object_terminate vnode_destroy_vobject zfs_freebsd_reclaim VOP_RECLAIM_APV
procstat: 21 100180 syncer - mi_switch sleepq_wait _cv_wait zil_commit zfs_sync sync_fsync VOP_FSYNC_APV sync_vnode sched_sync fork_exit fork_trampoline
procstat: 63429 101808 rrdtool - mi_switch sleepq_catch_signals sleepq_wait_sig _sleep pipe_read dofileread kern_readv read syscallenter syscall Xfast_syscall
procstat: 63430 101562 rrdtool - mi_switch sleepq_catch_signals sleepq_wait_sig _sleep pipe_read dofileread kern_readv read syscallenter syscall Xfast_syscall
ps: 0 20 0 0 46 0 0 16 zcolli DL ?? 1:36.89 [vnlru]
ps: 0 21 0 0 51 0 0 16 zilog- DL ?? 151:27.35 [syncer]
ps: 80 63429 63419 0 48 0 13280 2032 piperd I ?? 2:36.48 /usr/local/bin/rrdtool -
ps: 80 63430 63419 0 76 0 13280 1712 piperd I ?? 0:00.01 /usr/local/bin/rrdtool -

At this moment these locks are held:

db> show alllocks
Process 69468 (sshd) thread 0xffffff0f49769460 (102462)
exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xffffff131329a0f8) locked @ /usr/src/sys/kern/uipc_sockbuf.c:148
Process 63419 (php) thread 0xffffff115a574000 (102314)
shared lockmgr zfs (zfs) r = 0 (0xffffff0e7822e448) locked @ /usr/src/sys/kern/vfs_subr.c:2137
Process 21 (syncer) thread 0xffffff0027d96000 (100180)
exclusive lockmgr syncer (syncer) r = 0 (0xffffff0045d94098) locked @ /usr/src/sys/kern/vfs_subr.c:1737
Process 20 (vnlru) thread 0xffffff0027d96460 (100179)
exclusive lockmgr zfs (zfs) r = 0 (0xffffff140637f620) locked @ /usr/src/sys/kern/vfs_subr.c:2249

Excerpts from vfs_subr.c:

1732:		vdrop(vp);
		mtx_lock(&sync_mtx);
		return (*bo == LIST_FIRST(slp));
	}
	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
	(void) VOP_FSYNC(vp, MNT_LAZY, td);
	VOP_UNLOCK(vp, 0);
	vn_finished_write(mp);
	BO_LOCK(*bo);
	if (((*bo)->bo_flag & BO_ONWORKLST) != 0) {

2132:	if ((flags & LK_INTERLOCK) == 0)
		VI_LOCK(vp);
	vholdl(vp);
	if ((error = vn_lock(vp, flags | LK_INTERLOCK)) != 0) {
		vdrop(vp);
		CTR2(KTR_VFS, "%s: impossible to lock vnode %p",
		    __func__, vp);
		return (error);
	}

2244:	 */
	vp->v_iflag |= VI_OWEINACT;
	switch (func) {
	case VPUTX_VRELE:
		error = vn_lock(vp, LK_EXCLUSIVE | LK_INTERLOCK);
		VI_LOCK(vp);
		break;
	case VPUTX_VPUT:
		if (VOP_ISLOCKED(vp) != LK_EXCLUSIVE) {
			error = VOP_LOCK(vp, LK_UPGRADE | LK_INTERLOCK |

The deadlock is observed per filesystem: other ZFS filesystems within the pool that contains the deadlocked one keep working.
>How-To-Repeat:
Create a ZFS filesystem with tons (~300k) of RRD files and update them periodically.
>Fix:
It seems that setting kern.maxvnodes to a higher value copes with this hang; at least 100.chksetuid can then be executed without triggering the deadlock.
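The reversal reported by witness is the classic ABBA pattern: the syncer thread takes the syncer lock and then a zfs vnode lock (vfs_msync/vget), while vnlru's reclaim path ends up needing them in the opposite order. As a minimal userland sketch of the pattern, the following uses Python threading locks in place of the kernel lockmgr locks; the lock names mirror the report, but this is an illustrative model, not FreeBSD code:

```python
import threading

# Two locks standing in for the "syncer" and "zfs" lockmgr locks.
syncer_lock = threading.Lock()
zfs_lock = threading.Lock()

def try_in_order(first, second, timeout=0.2):
    """Acquire first then second; use a timeout on the inner
    acquire so the sketch backs off instead of deadlocking."""
    with first:
        ok = second.acquire(timeout=timeout)
        if ok:
            second.release()
        return ok

if __name__ == "__main__":
    # Uncontended, the syncer-style order succeeds.
    print(try_in_order(syncer_lock, zfs_lock))
    # If another thread already holds the zfs lock (as vnlru does
    # while reclaiming), the same order can no longer complete.
    zfs_lock.acquire()
    result = []
    t = threading.Thread(
        target=lambda: result.append(
            try_in_order(syncer_lock, zfs_lock, timeout=0.1)))
    t.start()
    t.join()
    zfs_lock.release()
    print(result)
```

In the kernel there is no timeout to back off with, so once both threads block on each other's lock the filesystem hangs, which is what the procstat output above shows.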
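The reproduction workload can be approximated without rrdtool: what matters is churning many vnodes (create lots of small files, then stat every one of them the way 100.chksetuid's find does). A hedged sketch, with file names and counts purely illustrative and scaled down so it runs quickly:

```python
import os
import pathlib
import tempfile

# Hypothetical stand-in for the reported workload: the PR used
# ~300k RRD files updated periodically; scale N up on a real box.
N = 1000

def make_files(root, n=N):
    """Create n small placeholder files under root."""
    root = pathlib.Path(root)
    for i in range(n):
        (root / f"f{i:06d}.rrd").write_bytes(b"\0" * 512)
    return n

def walk_like_find(root):
    """stat() every file, roughly what 100.chksetuid's find does,
    touching (and on a loaded system, recycling) one vnode per file."""
    count = 0
    for dirpath, _, names in os.walk(root):
        for name in names:
            os.stat(os.path.join(dirpath, name))
            count += 1
    return count

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(make_files(d), walk_like_find(d))
```

On the affected system the trigger was this traversal racing periodic rrdtool updates while vnlru recycled vnodes near the kern.maxvnodes limit.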
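The workaround can be applied as a sysctl config fragment; the value 400000 below is illustrative, not a figure from this PR, so pick one suited to the machine's RAM and compare against vfs.numvnodes:

```shell
# Inspect the current limit and usage (FreeBSD):
sysctl kern.maxvnodes vfs.numvnodes

# Raise the limit at runtime (illustrative value):
sysctl kern.maxvnodes=400000

# Persist across reboots:
echo 'kern.maxvnodes=400000' >> /etc/sysctl.conf
```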
>Release-Note:
>Audit-Trail:

Responsible-Changed-From-To: freebsd-bugs->freebsd-fs
Responsible-Changed-By: linimon
Responsible-Changed-When: Mon Jan 2 13:39:00 UTC 2012
Responsible-Changed-Why: 
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=163770
>Unformatted: