
Expose conv2d weights/biases as ops #14566

Open · wants to merge 1 commit into main

Conversation

LPanosTT (Contributor) commented Nov 1, 2024

Tickets

#13219, #12793

Problem description

Modelling convolutions in MLIR is difficult when conv2d must first be executed once in order to obtain the properly prepared inputs for subsequent iterations.

What's changed

Weight and bias preparation is moved to dedicated ops. The logic that prepares them on host already exists and is functional; the issue is that it is embedded within conv2d. Most of the code changes are copy/pasted.

Checklist

  • Post commit CI passes
  • Blackhole Post commit (if applicable)
  • Model regression CI testing passes (if applicable)
  • Device performance regression CI testing passes (if applicable)
  • New/Existing tests provide coverage for changes

github-actions bot left a comment

⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)

#include <sys/types.h>
#include <cstdint>

using namespace tt;

⚠️ google-build-using-namespace ⚠️
do not use namespace using-directives; use using-declarations instead

#include "ttnn/cpp/ttnn/operations/data_movement/reshape_view/reshape.hpp"
#include "ttnn/tensor/tensor.hpp"

using namespace tt;

⚠️ google-build-using-namespace ⚠️
do not use namespace using-directives; use using-declarations instead

@@ -104,8 +176,16 @@ def conv2d(
memory_config: ttnn.MemoryConfig = None, # memory config overrides by user
conv_op_cache={}, # basic conv object caching in python needed for intermediate refactoring. Not needed after full op refactoring in C++.
debug=False, # ignored
return_output_size=False,
return_prepared_device_weights=False,
Contributor

Why complicate the API like this?

If the values are computed then there's no additional cost to returning them.

)


def prepare_conv_bias(
Contributor

Shouldn't new ops use the new op framework?

If you use that, then bind_registered_operation should remove the need for this extra code.

@@ -30,6 +30,8 @@ def Conv1d(
conv_config: Conv1dConfig = None, # config overrides by user
conv_op_cache={}, # basic conv object caching in python needed for intermediate refactoring. Not needed after full op refactoring in C++.
debug=False,
return_output_length=False,
return_prepared_device_weights=False,
Contributor

Why complicate the API like this?

If the values are computed then there's no additional cost to returning them.

Same comment below for 2d.

github-actions bot left a comment

⚠️ Clang-Tidy found issue(s) with the introduced code (1/2)

conv_config.act_block_h_override = constants::TILE_HEIGHT;
}
}

template <typename T>
Result conv2d(

⚠️ clang-diagnostic-error ⚠️
unknown type name Result


Result Conv2dOperation::invoke(
uint8_t queue_id,
template std::tuple<ttnn::Tensor, uint32_t, uint32_t, ttnn::Tensor, std::optional<ttnn::Tensor>> conv2d<Device>(

⚠️ clang-diagnostic-error ⚠️
explicit template instantiation cannot have a definition; if this definition is meant to be an explicit specialization, add <> after the template keyword

Suggested change
template std::tuple<ttnn::Tensor, uint32_t, uint32_t, ttnn::Tensor, std::optional<ttnn::Tensor>> conv2d<Device>(
template<> std::tuple<ttnn::Tensor, uint32_t, uint32_t, ttnn::Tensor, std::optional<ttnn::Tensor>> conv2d<Device>(


Result Conv2dOperation::invoke(
uint8_t queue_id,
template std::tuple<ttnn::Tensor, uint32_t, uint32_t, ttnn::Tensor, std::optional<ttnn::Tensor>> conv2d<Device>(

⚠️ clang-diagnostic-error ⚠️
no function template matches function template specialization conv2d


template <typename T>
std::pair<ttnn::Tensor, std::optional<ttnn::Tensor>> prepare_conv_weights_biases_and_move_to_device(const ttnn::Tensor& weight_tensor, std::optional<const ttnn::Tensor>& bias_tensor, uint32_t input_channels_alignment, DataType weights_bias_dtype, uint32_t weight_block_h_ntiles, uint32_t weight_block_w_ntiles, const sliding_window::ParallelConfig& parallel_config, T * device, uint32_t groups, uint32_t act_block_h_ntiles, uint32_t input_width);

template <typename T>
Result conv2d(

⚠️ clang-diagnostic-error ⚠️
unknown type name Result

@@ -215,7 +45,6 @@
std::optional<const Conv2dConfig> conv_config_ = std::nullopt,
const std::optional<const MemoryConfig> memory_config = std::nullopt);


struct Conv2dOperation{
static Result invoke(

⚠️ clang-diagnostic-error ⚠️
unknown type name Result



if (num_cores_nhw < compute_grid_size.x && out_nhw_ntiles > compute_grid_size.x) {
num_cores_nhw = find_closest_largest_divisor_with_num_padding(out_nhw_ntiles, compute_grid_size.x);
}
grid = num_cores_to_corerange_set(num_cores_nhw, compute_grid_size, true);

⚠️ clang-diagnostic-error ⚠️
use of undeclared identifier num_cores_to_corerange_set; did you mean num_cores_to_corerangeset?

Suggested change
grid = num_cores_to_corerange_set(num_cores_nhw, compute_grid_size, true);
grid = num_cores_to_corerangeset(num_cores_nhw, compute_grid_size, true);

github-actions bot left a comment

⚠️ Clang-Tidy found issue(s) with the introduced code (2/2)

} else if (shard_layout == TensorMemoryLayout::WIDTH_SHARDED) {
num_cores_nhw = 1;
uint32_t num_cores_c = find_closest_common_largest_divisor(out_c_ntiles, std::ceil((float)input_channels / effective_tile_width), max_num_cores);
grid = num_cores_to_corerange_set(num_cores_c, compute_grid_size, true);

⚠️ clang-diagnostic-error ⚠️
use of undeclared identifier num_cores_to_corerange_set; did you mean num_cores_to_corerangeset?

Suggested change
grid = num_cores_to_corerange_set(num_cores_c, compute_grid_size, true);
grid = num_cores_to_corerangeset(num_cores_c, compute_grid_size, true);

Added new weight and bias preparation ops, and new conv op that can only take pre-prepared weights.
Working conv test with pre-prepared weights

Added return weight/output dims kwargs

Only auto-shard if shard_layout not specified

Pass input memory config to prepare functions

Organize utility functions into their own files