Avoid lazy umount MNT_DETACH for NFS mounts as this causes system hang #47

rhmcruiser · 2020-03-05T08:01:00Z

In src/oci-umount.c

Lazy unmount of NFS mounts causes serious problems, up to and including holding up the reboot process. Tasks doing NFS I/O hang, and we see them in the vmcore in uninterruptible (UN) state. In some cases the blocked NFS I/O tasks prevent even the reboot/shutdown task from progressing: the shutdown task itself blocks in UN state, waiting for the inodes of the NFS superblock to be synced. The NFS tasks do not progress because they are blocked waiting for their I/O to complete. Although the lazy umount removes the mount from the mount table, the superblock's s_count and s_active fields still show it in use, with references held by several tasks doing NFS I/O.

MNT_DETACH does not actually unmount a file-system which is in-use; it just detaches the mount from the visible filesystem tree, and makes it impossible to see what processes are still using the mount. This prevents normal shutdown of systems, due to continued access to the mount.

The issue has been confirmed to happen only where Docker/container environments are in use. The two notable places where lazy umount is used are in oci-umount.c.
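The idea of skipping MNT_DETACH for NFS can be sketched as follows. This is a minimal illustration, not the actual change in this pull request: the helper names `umount_flags_for_fstype` and `unmount_avoiding_lazy_nfs` are hypothetical, and detecting NFS via `statfs()` and `NFS_SUPER_MAGIC` is an assumption about one possible way to do it. Note also that `statfs()` may itself block if the NFS server is unresponsive.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mount.h>   /* umount2(), MNT_DETACH */
#include <sys/vfs.h>     /* statfs() */
#include <linux/magic.h> /* NFS_SUPER_MAGIC */

/* Hypothetical helper: pick umount2() flags from the filesystem type.
 * NFS gets a plain (non-lazy) unmount; everything else stays lazy. */
static int umount_flags_for_fstype(long f_type)
{
    return (f_type == NFS_SUPER_MAGIC) ? 0 : MNT_DETACH;
}

/* Hypothetical helper: unmount `path`, avoiding MNT_DETACH on NFS.
 * Returns 0 on success, -1 on failure with errno set. */
static int unmount_avoiding_lazy_nfs(const char *path)
{
    struct statfs sfs;

    assert(path != NULL);
    if (statfs(path, &sfs) != 0)
        return -1;
    return umount2(path, umount_flags_for_fstype((long)sfs.f_type));
}
```

With flags set to 0, `umount2()` fails with EBUSY while the mount is still in use, so the busy state stays visible to the caller instead of being hidden behind a detached mount.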

== Details, snippet from vmcore analysis:
The output below shows the NFSv4 superblock still holding a reference count although it is no longer in the mount table.

crash> mount | grep ffff9a0018ae2000
crash> << although mount is removed from filesystem tree due to lazy umount

the superblock fields have references and tasks wait for NFS IO onto this superblock

crash> p ((struct super_block*)0xffff9a0018ae2000)->s_op
$13 = (const struct super_operations *) 0xffffffffc0901b60 <nfs4_sops>

crash> p ((struct super_block*)0xffff9a0018ae2000)->s_count
$12 = 2 << usage count is still positive

crash> p ((struct super_block*)0xffff9a0018ae2000)->s_active
$14 = {
counter = 4 << 4 blocked tasks holding reference to this nfs share
}
The 4 blocked tasks on this lazily unmounted NFS superblock were:
crash> ps -m | grep UN
[0 00:10:44.049] [UN] PID: 2136 TASK: ffff99f3fef64f10 CPU: 17 COMMAND: "poweroff" => blocked performing sync_inodes_sb( )
[0 00:10:55.262] [UN] PID: 31589 TASK: ffff99ff78ba0000 CPU: 7 COMMAND: "java" => blocked for nfs_file_write( )
[0 00:10:55.345] [UN] PID: 62574 TASK: ffff99f44baa0000 CPU: 17 COMMAND: "java" => blocked for nfs_file_write( )
[0 00:11:02.028] [UN] PID: 63909 TASK: ffff99ed7cffaf70 CPU: 10 COMMAND: "prometheus" => blocked for nfs_file_write( )

Signed-off-by: Ronald Monthero [email protected]


rhmcruiser and others added 2 commits March 5, 2020 17:52

cleaned up my earlier commit's mistake

5956b37
