Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Remove CudaContextManager and use cudaSetDevice #34

Merged

Conversation

laurendeaumatthieu
Copy link

No description provided.

@SimonRit
Copy link
Collaborator

SimonRit commented Feb 9, 2024

Thanks for the contribution. I think you are actually cleaning up legacy code which is no longer relevant since the introduction of a primary context in 2015... See Cuda 7 release notes.
I would suggest further cleaning up the code:

  • remove the ContextManager class
  • replace all calls to cuSetSetCurrent to calls to cudasetDevice following what is here.

Should I handle it or do you want to give it a try?

@laurendeaumatthieu
Copy link
Author

I can give a try.
What do you mean by "replace all calls to cuSetSetCurrent to calls to cudasetDevice following what is here" ? I have removed all the cuCtxSetCurrent, you would like I replace them by CUDA_CHECK(cudaSetDevice(0)); ?

@SimonRit
Copy link
Collaborator

SimonRit commented Feb 9, 2024

Sorry for the typo. Yes, that's what I mean and what the link suggests. It shouldn't be 0 but the number of the selected device. This selection could be done in the constructor.

@laurendeaumatthieu
Copy link
Author

I made the changes. I let you run the tests. Please let me know if everything is ok for you

Matthieu

@SimonRit
Copy link
Collaborator

Thanks looks good. The CI indicates that you didn't commit the CMakeLists.txt file: https://my.cdash.org/build/2487527/configure. Can you also squash the changes in one commit only?
Be careful to have a commit subject "below 72 c, ideally 50" (see here). And it would be nice to have a bit more substance in the commit message to explain why we don't need a Cuda context anymore and why we think we need to reinitialize the device before every access (with the links I provided).

@laurendeaumatthieu laurendeaumatthieu changed the title BUG: modify CudaContextManager() with cudaSetDevice() and remove GetCurrentContext() BUG: Remove CudaContextManager and use cudaSetDevice Feb 12, 2024
@laurendeaumatthieu
Copy link
Author

Should be ok now

@SimonRit
Copy link
Collaborator

Thanks. I'm surprised that it compiles locally because there are still references to the context manager. At least the CI doesn't build https://my.cdash.org/viewBuildError.php?buildid=2488764, can you fix it?

I also think the commit body is wrong as is (since the Cuda context was not destroyed) + the list of CHANGES is not necessary so I suggest the following message for the commit body:

Use the primary context with cudaSetDevice() introduced by Cuda 7 (https://developer.download.nvidia.com/compute/cuda/7_0/Prod/doc/CUDA_Toolkit_Release_Notes.pdf) instead of a new one. cudaSetDevice is called before every memory transfer between the CPU and GPU to be sure that the context is set for the current thread, see https://developer.nvidia.com/blog/cuda-pro-tip-always-set-current-device-avoid-multithreading-bugs/.

@@ -16,15 +16,13 @@
*
*=========================================================================*/

#include "itkCudaDataManager.h"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sorry

@@ -24,7 +24,6 @@ namespace itk
// constructor
CudaDataManager::CudaDataManager()
{
m_ContextManager = CudaContextManager::GetInstance();

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove this long comment too? I think it was already obsolete. Sorry, I should have caught this before...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Use the primary context with cudaSetDevice() introduced by Cuda 7 (https://developer.download.nvidia.com/compute/cuda/7_0/Prod/doc/CUDA_Toolkit_Release_Notes.pdf) instead of a new one. cudaSetDevice is called before every memory transfer between the CPU and GPU to be sure that the context is set for the current thread, see https://developer.nvidia.com/blog/cuda-pro-tip-always-set-current-device-avoid-multithreading-bugs/.
@SimonRit
Copy link
Collaborator

Thanks. Failing python packages are being solved in a separate issue, I'm merging this one.

@SimonRit SimonRit merged commit 09a9645 into RTKConsortium:master Feb 14, 2024
7 of 18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants