Avoid lazy umount MNT_DETACH for NFS mounts as this causes system hang #47

Open
wants to merge 2 commits into base: main


rhmcruiser

In src/oci-umount.c

Lazy unmounting of NFS mounts causes serious problems, up to and including stalling the reboot process. Tasks performing NFS I/O hang, and in the vmcore we see them in the uninterruptible (UN) state. In some cases the blocked NFS I/O tasks prevent the reboot/shutdown task from making progress: shutdown itself enters the UN state while waiting for the inodes of the NFS superblock to be synced, but the NFS tasks never progress because they are blocked waiting for their I/O to complete. Although the lazy umount removes the mount from the mount table, the superblock's s_count and s_active fields still show it as in use, holding references from several tasks doing NFS I/O.

MNT_DETACH does not actually unmount a filesystem that is in use; it only detaches the mount from the visible filesystem tree, which makes it impossible to see which processes are still using the mount. Because those processes continue to access the mount, the system cannot shut down normally.

The issue has been confirmed to occur only when a Docker/container environment is in use. The two notable places where a lazy umount is performed are in oci-umount.c.
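To illustrate the idea in the title, here is a minimal sketch of choosing umount2(2) flags by filesystem type. It is not the code from this PR: the helper names is_nfs_fstype() and umount_flags_for() are hypothetical, and it assumes the caller already knows the filesystem type string as reported in /proc/mounts ("nfs" or "nfs4" for NFS).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/mount.h>   /* MNT_DETACH (Linux) */

/* Hypothetical helper: true if fstype names an NFS filesystem,
 * using the type strings that appear in /proc/mounts. */
bool is_nfs_fstype(const char *fstype)
{
    assert(fstype != NULL);
    return strcmp(fstype, "nfs") == 0 || strcmp(fstype, "nfs4") == 0;
}

/* Hypothetical helper: flags to pass to umount2(2).
 * NFS mounts get a regular unmount (flags == 0), which fails with
 * EBUSY while the mount is in use instead of hiding it; every other
 * filesystem type keeps the lazy MNT_DETACH behaviour. */
int umount_flags_for(const char *fstype)
{
    return is_nfs_fstype(fstype) ? 0 : MNT_DETACH;
}
```

A caller would then use something like umount2(path, umount_flags_for(fstype)) and handle EBUSY for NFS mounts, so that a busy NFS mount stays visible in the mount table rather than being silently detached.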

== Details, snippet from vmcore analysis:
The output below shows the NFSv4 superblock still holding references although it is no longer in the mount table.

crash> mount | grep ffff9a0018ae2000
crash>                 << no output: the mount was removed from the filesystem tree by the lazy umount

The superblock fields still show references, and tasks are waiting for NFS I/O on this superblock:

crash> p ((struct super_block*)0xffff9a0018ae2000)->s_op
$13 = (const struct super_operations *) 0xffffffffc0901b60 <nfs4_sops>

crash> p ((struct super_block*)0xffff9a0018ae2000)->s_count
$12 = 2             << usage count is still positive

crash> p ((struct super_block*)0xffff9a0018ae2000)->s_active
$14 = {
  counter = 4       << 4 blocked tasks holding a reference to this NFS share
}

The 4 blocked tasks on this lazily unmounted NFS superblock were:

crash> ps -m | grep UN
[0 00:10:44.049] [UN]  PID: 2136   TASK: ffff99f3fef64f10  CPU: 17  COMMAND: "poweroff"    => blocked in sync_inodes_sb()
[0 00:10:55.262] [UN]  PID: 31589  TASK: ffff99ff78ba0000  CPU: 7   COMMAND: "java"        => blocked in nfs_file_write()
[0 00:10:55.345] [UN]  PID: 62574  TASK: ffff99f44baa0000  CPU: 17  COMMAND: "java"        => blocked in nfs_file_write()
[0 00:11:02.028] [UN]  PID: 63909  TASK: ffff99ed7cffaf70  CPU: 10  COMMAND: "prometheus"  => blocked in nfs_file_write()
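On a live system, without a vmcore, the same kind of blocked task can be spotted from /proc/<pid>/stat, where the field after the parenthesised command name is the task state and "D" means uninterruptible sleep. The following is only a sketch under that assumption; the function names are hypothetical and not part of oci-umount.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: return the state character from one
 * /proc/<pid>/stat line, or '\0' if the line is malformed.
 * The command name may itself contain spaces or ')' characters,
 * so the state is located after the LAST ')' in the line. */
char task_state_from_stat(const char *stat_line)
{
    assert(stat_line != NULL);
    const char *close_paren = strrchr(stat_line, ')');
    if (close_paren == NULL || close_paren[1] != ' ' || close_paren[2] == '\0')
        return '\0';
    return close_paren[2];
}

/* 'D' is the uninterruptible sleep state, shown as UN by crash. */
bool is_uninterruptible(const char *stat_line)
{
    return task_state_from_stat(stat_line) == 'D';
}
```

Reading each /proc/<pid>/stat file and passing its single line to is_uninterruptible() gives a rough equivalent of the "ps -m | grep UN" step above, although it cannot show which superblock a task is blocked on.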

Signed-off-by: Ronald Monthero [email protected]

rhmcruiser and others added 2 commits March 5, 2020 17:52