
Re: ZFS panic with concurrent recv and read-heavy workload



Nathaniel Filardo
I just got this on another machine, no heavy workload needed, just booting
and starting some jails.  Of interest, perhaps, both this and the machine
triggering the below panic are SMP V240s with 1.5GHz CPUs (though I will
confess that the machine in the original report may have had bad RAM).  I
have run a UP 1.2GHz V240 for months and never seen this panic.

This time the kernel is
> FreeBSD 9.0-CURRENT #9: Fri Jun  3 02:32:13 EDT 2011
csup'd immediately before building.  The full panic this time is

> panic: Lock buf_hash_table.ht_locks[i].ht_lock not exclusively locked @
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:4659
>
> cpuid = 1
> KDB: stack backtrace:
> panic() at panic+0x1c8
> _sx_assert() at _sx_assert+0xc4
> _sx_xunlock() at _sx_xunlock+0x98
> l2arc_feed_thread() at l2arc_feed_thread+0xeac
> fork_exit() at fork_exit+0x9c
> fork_trampoline() at fork_trampoline+0x8
>
> SC Alert: SC Request to send Break to host.
> KDB: enter: Line break on console
> [ thread pid 27 tid 100121 ]
> Stopped at      kdb_enter+0x80: ta              %xcc, 1
> db> reset
> ttiimmeeoouutt  sshhuuttttiinngg  ddoowwnn  CCPPUUss..
Half of the memory in this machine is new (well, came with the machine) and
half is from the aforementioned UP V240 which seemed to work fine (I was
attempting an upgrade when this happened); none of it (or indeed any of the
hardware save the disk controller and disks) is common between this and the
machine reporting below.

Thoughts?  Any help would be greatly appreciated.
Thanks.
--nwf;

On Wed, Apr 06, 2011 at 04:00:43AM -0400, Nathaniel W Filardo wrote:

>[...]
> panic: Lock buf_hash_table.ht_locks[i].ht_lock not exclusively locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1869
>
> cpuid = 1
> KDB: stack backtrace:
> panic() at panic+0x1c8
> _sx_assert() at _sx_assert+0xc4
> _sx_xunlock() at _sx_xunlock+0x98
> arc_evict() at arc_evict+0x614
> arc_get_data_buf() at arc_get_data_buf+0x360
> arc_buf_alloc() at arc_buf_alloc+0x94
> dmu_buf_will_fill() at dmu_buf_will_fill+0xfc
> dmu_write() at dmu_write+0xec
> dmu_recv_stream() at dmu_recv_stream+0x8a8
> zfs_ioc_recv() at zfs_ioc_recv+0x354
> zfsdev_ioctl() at zfsdev_ioctl+0xe0
> devfs_ioctl_f() at devfs_ioctl_f+0xe8
> kern_ioctl() at kern_ioctl+0x294
> ioctl() at ioctl+0x198
> syscallenter() at syscallenter+0x270
> syscall() at syscall+0x74
> -- syscall (54, FreeBSD ELF64, ioctl) %o7=0x40c13e24 --
> userland() at 0x40e72cc8
> user trace: trap %o7=0x40c13e24
> pc 0x40e72cc8, sp 0x7fdffff4641
> pc 0x40c158f4, sp 0x7fdffff4721
> pc 0x40c1e878, sp 0x7fdffff47f1
> pc 0x40c1ce54, sp 0x7fdffff8b01
> pc 0x40c1dbe0, sp 0x7fdffff9431
> pc 0x40c1f718, sp 0x7fdffffd741
> pc 0x10731c, sp 0x7fdffffd831
> pc 0x10c90c, sp 0x7fdffffd8f1
> pc 0x103ef0, sp 0x7fdffffe1d1
> pc 0x4021aff4, sp 0x7fdffffe291
> done
>[...]


Marius Strobl
On Fri, Jun 03, 2011 at 03:03:56AM -0400, Nathaniel W Filardo wrote:

> I just got this on another machine, no heavy workload needed, just booting
> and starting some jails.  Of interest, perhaps, both this and the machine
> triggering the below panic are SMP V240s with 1.5GHz CPUs (though I will
> confess that the machine in the original report may have had bad RAM).  I
> have run a UP 1.2GHz V240 for months and never seen this panic.
>
> This time the kernel is
> > FreeBSD 9.0-CURRENT #9: Fri Jun  3 02:32:13 EDT 2011
> csup'd immediately before building.  The full panic this time is
> > panic: Lock buf_hash_table.ht_locks[i].ht_lock not exclusively locked @
> > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:4659
> >
> > cpuid = 1
> > KDB: stack backtrace:
> > panic() at panic+0x1c8
> > _sx_assert() at _sx_assert+0xc4
> > _sx_xunlock() at _sx_xunlock+0x98
> > l2arc_feed_thread() at l2arc_feed_thread+0xeac
> > fork_exit() at fork_exit+0x9c
> > fork_trampoline() at fork_trampoline+0x8
> >
> > SC Alert: SC Request to send Break to host.
> > KDB: enter: Line break on console
> > [ thread pid 27 tid 100121 ]
> > Stopped at      kdb_enter+0x80: ta              %xcc, 1
> > db> reset
> > ttiimmeeoouutt  sshhuuttttiinngg  ddoowwnn  CCPPUUss..
>
> Half of the memory in this machine is new (well, came with the machine) and
> half is from the aforementioned UP V240 which seemed to work fine (I was
> attempting an upgrade when this happened); none of it (or indeed any of the
> hardware save the disk controller and disks) is common between this and the
> machine reporting below.
>
> Thoughts?  Any help would be greatly appreciated.
> Thanks.
> --nwf;
>
> On Wed, Apr 06, 2011 at 04:00:43AM -0400, Nathaniel W Filardo wrote:
> >[...]
> > panic: Lock buf_hash_table.ht_locks[i].ht_lock not exclusively locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1869
> >
> > cpuid = 1
> > KDB: stack backtrace:
> > panic() at panic+0x1c8
> > _sx_assert() at _sx_assert+0xc4
> > _sx_xunlock() at _sx_xunlock+0x98
> > arc_evict() at arc_evict+0x614
> > arc_get_data_buf() at arc_get_data_buf+0x360
> > arc_buf_alloc() at arc_buf_alloc+0x94
> > dmu_buf_will_fill() at dmu_buf_will_fill+0xfc
> > dmu_write() at dmu_write+0xec
> > dmu_recv_stream() at dmu_recv_stream+0x8a8
> > zfs_ioc_recv() at zfs_ioc_recv+0x354
> > zfsdev_ioctl() at zfsdev_ioctl+0xe0
> > devfs_ioctl_f() at devfs_ioctl_f+0xe8
> > kern_ioctl() at kern_ioctl+0x294
> > ioctl() at ioctl+0x198
> > syscallenter() at syscallenter+0x270
> > syscall() at syscall+0x74
> > -- syscall (54, FreeBSD ELF64, ioctl) %o7=0x40c13e24 --
> > userland() at 0x40e72cc8
> > user trace: trap %o7=0x40c13e24
> > pc 0x40e72cc8, sp 0x7fdffff4641
> > pc 0x40c158f4, sp 0x7fdffff4721
> > pc 0x40c1e878, sp 0x7fdffff47f1
> > pc 0x40c1ce54, sp 0x7fdffff8b01
> > pc 0x40c1dbe0, sp 0x7fdffff9431
> > pc 0x40c1f718, sp 0x7fdffffd741
> > pc 0x10731c, sp 0x7fdffffd831
> > pc 0x10c90c, sp 0x7fdffffd8f1
> > pc 0x103ef0, sp 0x7fdffffe1d1
> > pc 0x4021aff4, sp 0x7fdffffe291
> > done
> >[...]

Apparently this is a locking issue in the ARC code; the ZFS people should
be able to help you.

Marius

_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-sparc64
To unsubscribe, send any mail to "[hidden email]"