Hipify various dependencies to enable AMD Face Enhancer
Summary:
This diff extends several targets to be HIP-compatible and fixes a few silly hipification issues in those targets.

After these changes, all dependencies needed for the face enhancer compile for AMD.

A few silly issues that I had to hack around; maybe we could improve hipification to avoid similar issues in the future:
* Some of the dependencies kept sources in `src/cuda/**.cu`. Hipification tried to rename "cuda" to "hip" and broke the paths. I'm not sure where that rename happens, so I just changed the directory from "cuda" to "gpu" to avoid the issue.
* One header import, `THCAtomics.cuh`, was incorrectly being renamed to `THHAtomics.cuh`, which doesn't exist. Fortunately, an equivalent import without the name issue was available (see the sketch below).
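
For illustration, a rough sketch of that second class of fix. The kernel below is hypothetical (not from this diff), and it assumes the usual modern equivalent, `ATen/cuda/Atomic.cuh`, which hipify maps cleanly to `ATen/hip/Atomic.cuh`:

```cpp
// Illustrative only -- not from this diff. The legacy THC header trips up
// string-based hipification ("THC" -> "THH"):
//   #include <THC/THCAtomics.cuh>   // hipified to THH/THHAtomics.cuh (missing)
// The ATen equivalent hipifies cleanly (ATen/cuda/ -> ATen/hip/):
#include <ATen/cuda/Atomic.cuh>

__global__ void accumulate(float* sum, const float* in, int n) {
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    gpuAtomicAdd(sum, in[i]);  // provided by Atomic.cuh on both CUDA and ROCm
  }
}
```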

We might also want to consider graduating the `cpp_library_hip` Buck helper out of fbgemm, since it seems generally useful.

For some of the targets, we needed to build a Python C++ extension, which as far as I can tell we didn't yet have good hipification support for. I added a new Buck rule, very similar to our standard `cpp_library_hip` rule, that creates an extension instead (sketch below). It's a little copy-pasted, so let me know if there are cleaner ways to work around this requirement.
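
For context, here's a minimal sketch of the kind of Python C++ extension such a target produces. The module contents are hypothetical (not from this diff); the point is that the new rule hipifies the sources and links them into a Python extension rather than a plain C++ library:

```cpp
// Hypothetical example of a Python C++ extension; after hipification the same
// source builds against ROCm instead of CUDA.
#include <torch/extension.h>

// Toy op: scale a tensor by a scalar on whatever device it lives on.
at::Tensor scale(const at::Tensor& x, double s) {
  return x * s;
}

PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("scale", &scale, "Multiply a tensor by a scalar");
}
```

Built either way (CUDA or HIP), Python imports the resulting module the same way.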

Reviewed By: jianyuh

Differential Revision: D61080247
jwfromm authored and facebook-github-bot committed Aug 12, 2024
1 parent e09224b commit 3d253db
Showing 4 changed files with 158 additions and 160 deletions.
```diff
@@ -9,7 +9,7 @@
 */
 
 #include <vector>
-#include "cuda/ms_deform_im2col_cuda.cuh"
+#include "ms_deform_im2col_cuda.cuh"
 
 #include <ATen/ATen.h>
 #include <ATen/cuda/CUDAContext.h>
@@ -18,7 +18,7 @@
 
 
 at::Tensor ms_deform_attn_cuda_forward(
-    const at::Tensor &value, 
+    const at::Tensor &value,
     const at::Tensor &spatial_shapes,
     const at::Tensor &level_start_index,
     const at::Tensor &sampling_loc,
@@ -50,7 +50,7 @@ at::Tensor ms_deform_attn_cuda_forward(
     const int im2col_step_ = std::min(batch, im2col_step);
 
     AT_ASSERTM(batch % im2col_step_ == 0, "batch(%d) must divide im2col_step(%d)", batch, im2col_step_);
-    
+
     auto output = at::zeros({batch, num_query, num_heads, channels}, value.options());
 
     const int batch_n = im2col_step_;
@@ -81,7 +81,7 @@
 
 
 std::vector<at::Tensor> ms_deform_attn_cuda_backward(
-    const at::Tensor &value, 
+    const at::Tensor &value,
     const at::Tensor &spatial_shapes,
     const at::Tensor &level_start_index,
     const at::Tensor &sampling_loc,
@@ -127,7 +127,7 @@ std::vector<at::Tensor> ms_deform_attn_cuda_backward(
     auto per_sample_loc_size = num_query * num_heads * num_levels * num_point * 2;
     auto per_attn_weight_size = num_query * num_heads * num_levels * num_point;
     auto grad_output_n = grad_output.view({batch/im2col_step_, batch_n, num_query, num_heads, channels});
-    
+
     for (int n = 0; n < batch/im2col_step_; ++n)
     {
         auto grad_output_g = grad_output_n.select(0, n);
@@ -150,4 +150,4 @@ std::vector<at::Tensor> ms_deform_attn_cuda_backward(
     return {
         grad_value, grad_sampling_loc, grad_attn_weight
     };
-}
+}
```
