Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid resetting the VF if the netns does not exist #313

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
44 changes: 28 additions & 16 deletions cmd/sriov/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -270,6 +270,29 @@ func cmdDel(args *skel.CmdArgs) error {

sm := sriov.NewSriovManager()

netns, err := ns.GetNS(args.Netns)
if err != nil {
// according to:
// https://github.com/kubernetes/kubernetes/issues/43014#issuecomment-287164444
// if provided path does not exist (e.x. when node was restarted)
// plugin should silently return with success after releasing
// IPAM resources
_, ok := err.(ns.NSPathNotExistErr)
if ok {
logging.Debug("Exiting as the network namespace does not exists anymore",
"func", "cmdDel",
"netConf.DeviceID", netConf.DeviceID,
"args.Netns", args.Netns)
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this case dont we want to skip to L315 and mark pci address as released ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better not:

The scenario I'm trying to solve here is when a CmdDelete is called when the device has already been allocated to another Pod. If we call DeleteAllocatedPCI(...), we are freeing up a device on a running pod.

WDYT?

Copy link
Contributor

@adrianchiris adrianchiris Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

took another look at the codebase, can this really happen ? we use the pci allocator to check if the device is allocated in CmdAdd, see [1]

so even if NS was deleted and device was re-assigned to a new pod, its CNI call would fail on ADD until we released the device.

if my understanding of the above is true then why dont we want to reset Vf config in this case ?

[1]

isAllocated, err := allocator.IsAllocated(n.DeviceID)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right; in the scenario I just described, it probably can't happen.
What we saw happening is related to a very long execution of ipam.ExecDel(...).
This scenario includes 3 sriov-cni calls:
a. first_cniDEL starts and call ipam.ExecDel(...). this is going to take a very long time (~2 minutes)
b. second_cniDEL is invoked and, for some reason, it completes quite quickly (ipam deletion, vlan/mac reset, ...). PIC is deallocated, netns is deleted.
c. cniADD is invoked. It assigns the VF to a new pod, configuring VLAN, MAC, ...
d. first_cniDEL continues its execution, as the ipam.ExecDel finished. Here, what I want to avoid is resetting the VF MAC, VLAN and so on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe a better solution is to use lock file to handle on a single cmd ADD/DEL at a time for a given ns, network, interface ?
that way second cni DEL will fail and prevent freeing up the PCI in the allocator ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking at the code, we save PCI device at the end of the cmd add call.
in this case we want to "lock" the call for the ns/network/interface at the beginning.
so if we want to use the same file we need to be careful (save at the beginning and release in case of failure)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right so maybe having a tmp file for ns/network/interface is the right way going forward but we must be sure we clean that file.

and we cover a case of unexpected reboot so maybe putting this lock file in /tmp/ so on reboot it will get clean?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of /tmp, we should go through all the files we have put on the fs when we start up the operator controller, and ensure our local db (files we put on the file system) is clean. There is no guarantee that /tmp will be cleared.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can flock the pci file if it works. just need to make sure we release when we exit ADD and DEL cmds.

so , according to flock docs[1] it would release the lock if the process exits so in case of reboot we would not be stuck in a deadlock (it would be released when process exits)

[1] https://man7.org/linux/man-pages/man2/flock.2.html

}

return fmt.Errorf("failed to open netns %s: %q", netns, err)
}
defer netns.Close()

logging.Debug("Reset VF configuration",
"func", "cmdDel",
"netConf.DeviceID", netConf.DeviceID)
/* ResetVFConfig resets a VF administratively. We must run ResetVFConfig
before ReleaseVF because some drivers will error out if we try to
reset netdev VF with trust off. So, reset VF MAC address via PF first.
Expand All @@ -279,22 +302,11 @@ func cmdDel(args *skel.CmdArgs) error {
}

if !netConf.DPDKMode {
netns, err := ns.GetNS(args.Netns)
if err != nil {
// according to:
// https://github.com/kubernetes/kubernetes/issues/43014#issuecomment-287164444
// if provided path does not exist (e.x. when node was restarted)
// plugin should silently return with success after releasing
// IPAM resources
_, ok := err.(ns.NSPathNotExistErr)
if ok {
return nil
}

return fmt.Errorf("failed to open netns %s: %q", netns, err)
}
defer netns.Close()

logging.Debug("Release the VF",
"func", "cmdDel",
"netConf.DeviceID", netConf.DeviceID,
"args.Netns", args.Netns,
"args.IfName", args.IfName)
if err = sm.ReleaseVF(netConf, args.IfName, netns); err != nil {
return err
}
Expand Down
Loading