Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean Systemd files on exit #792

Closed

Conversation

souleb
Copy link
Contributor

@souleb souleb commented Oct 14, 2024

We have run into an issue where removing the operator leaves behind the systemd files.

If implemented all systemD related files will be cleaned up on exit on non openshift distributions.

Copy link

Thanks for your PR,
To run vendors CIs, Maintainers can use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs, Maintainers can use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

signal.Notify(sigc, syscall.SIGTERM)

errChan := make(chan error)
defer close(errChan)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used the method used here to close channels. But I believe there is a race in general, on exit (once main goes away, or if os.Exit is called) the goroutines are killed off immediately. So the goroutines listening on this channel will never get the chance to run their intended actions.

@souleb
Copy link
Contributor Author

souleb commented Oct 14, 2024

cc @e0ne

@coveralls
Copy link

coveralls commented Oct 14, 2024

Pull Request Test Coverage Report for Build 11341895365

Details

  • 10 of 52 (19.23%) changed or added relevant lines in 2 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.05%) to 44.967%

Changes Missing Coverage Covered Lines Changed/Added Lines %
pkg/daemon/daemon.go 10 15 66.67%
cmd/sriov-network-config-daemon/start.go 0 37 0.0%
Files with Coverage Reduction New Missed Lines %
controllers/drain_controller.go 1 68.06%
Totals Coverage Status
Change from base Build 11294509064: -0.05%
Covered Lines: 6669
Relevant Lines: 14831

💛 - Coveralls

If implemented all systemD related files will be cleaned up on exit.
@@ -207,6 +207,14 @@ func (dn *Daemon) Run(stopCh <-chan struct{}, exitCh <-chan error) error {
for {
select {
case <-stopCh:
// clean files from host if we are running in systemd mode
if vars.UsingSystemdMode {
err := systemd.CleanSriovFilesFromHost(vars.ClusterType == consts.ClusterTypeOpenshift)
Copy link
Collaborator

@adrianchiris adrianchiris Oct 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC when the daemon is stopped we delete systemd files.

if the node reboots (given pods get stop signal) wont it delete the files ? which in turn mean the node will not have SR-IOV configured on startup ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you say is correct. We should have a way to identify whether a reboot has been requested. In that case the cleanUp has to be a best effort because we can have an ongoing reboot and also a spec change in the daemonSet.

@souleb souleb closed this Oct 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants