Fix send buffer misalignment [No 2] #1332

akoshelev · 2024-10-02T23:09:17Z

Overview

This is the second attempt to mitigate send buffer misalignment. Previous one (#1307) didn't handle all the edge cases and was abandoned in favour of this PR.

What I believe makes this change work is the new requirement for active work to be a power of two. With this constraint, it is much easier to align the read size with it. Given that total_capacity = active * record_size, we can represent read_size as a multiple of record_size too:
read_size = X * record_size. If X is a power of two and active_work is a power of two, then they will always be aligned with each other.

For example, if active work is 16, read size is 10 bytes and record size is 3 bytes, then:

total_capacity = 16*3
read_size = X*3 (close to 10)
X = 2 (power of two that satisfies the requirement)

when picking up the read size, we are rounding down to avoid buffer overflows. In the example above, setting X=3 would make it closer to the desired read size, but it is greater than 10, so we pick 2 instead.

What is included in this change?

Send buffer now enforces the active work to be a power of two
Subsequently, Gateway interface now puts the same restriction on GatewayConfig
Fix for unit tests that use ZKP - batch sizes now need to be a power of two as well (after Make active_work match records_per_batch #1316 active work and batch size must be aligned now)

This is the second attempt to mitigate send buffer misalignment. Previous one (private-attribution#1307) didn't handle all the edge cases and was abandoned in favour of this PR. What I believe makes this change work is the new requirement for active work to be a power of two. With this constraint, it is much easier to align the read size with it. Given that `total_capacity = active * record_size`, we can represent `read_size` as a multiple of `record_size` too: `read_size = X * record_size`. If X is a power of two and active_work is a power of two, then they will always be aligned with each other. For example, if active work is 16, read size is 10 bytes and record size is 3 bytes, then: ``` total_capacity = 16*3 read_size = X*3 (close to 10) X = 2 (power of two that satisfies the requirement) ``` when picking up the read size, we are rounding down to avoid buffer overflows. In the example above, setting X=3 would make it closer to the desired read size, but it is greater than 10, so we pick 2 instead.

While working on changing the gateway and parameters, I ran into several issues where the power of two constraint was not enforced and breakages were hard to find. A better model for me is to gate the active work at the config level, prohibiting invalid constructions at the caller side.

akoshelev · 2024-10-02T23:09:36Z

I want to run some Draft tests on this

akoshelev · 2024-10-02T23:30:40Z

Size	Link	Status
1000	https://draft-mpc.vercel.app/query/view/lax-intro2024-10-02T2329	Ok
10000	https://draft-mpc.vercel.app/query/view/lacy-job2024-10-02T2330	Ok
98721	https://draft-mpc.vercel.app/query/view/mute-brew2024-10-02T2339	Ok
10000	https://draft-mpc.vercel.app/query/view/tiny-lift2024-10-02T2346	Ok
100000	https://draft-mpc.vercel.app/query/view/macho-widow2024-10-02T2355	OK
200000	https://draft-mpc.vercel.app/query/view/laid-tally2024-10-03T0012	(timeout)

The latter failure is expected as we don't have a separate runtime for queries. This will be addressed later

codecov · 2024-10-03T00:15:48Z

Codecov Report

Attention: Patch coverage is 97.90941% with 6 lines in your changes missing coverage. Please review.

Project coverage is 93.59%. Comparing base (27de807) to head (34838fe).
Report is 16 commits behind head on main.

Files with missing lines	Patch %	Lines
ipa-core/src/utils/power_of_two.rs	94.23%	3 Missing ⚠️
ipa-core/src/helpers/gateway/mod.rs	98.63%	2 Missing ⚠️
ipa-core/src/app.rs	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1332      +/-   ##
==========================================
+ Coverage   93.55%   93.59%   +0.03%     
==========================================
  Files         208      209       +1     
  Lines       33937    34265     +328     
==========================================
+ Hits        31750    32069     +319     
- Misses       2187     2196       +9

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

akoshelev · 2024-10-03T20:44:55Z

ipa-core/src/helpers/gateway/mod.rs

@@ -84,6 +85,10 @@ pub struct GatewayConfig {
    /// payload may not be exactly this, but it will be the closest multiple of record size to this
    /// number. For instance, having 14 bytes records and batch size of 4096 will result in
    /// 4088 bytes being sent in a batch.
+    ///
+    /// The actual size for read chunks may be bigger or smaller, depending on the record size
+    /// sent through each channel. Read size will be aligned with [`Self::active_work`] value to


Add a comment explaining what to do if we want to align read size perfectly with the target read size

I suggest rewriting this entire comment. It feels out of date.

line 85 has a typo "side" instead of "size".

line 86 refers to "batch size" but this comment is above "read_size".

I think the example is wrong, because 4088 bytes is not a power of two multiple of 14. If I'm not wrong, the best you can do is now 14 * 2^8 < 4096 (it's 3,584).

You are absolutely right. Just did that

benjaminsavage

Left a few comments.

benjaminsavage · 2024-10-04T04:41:13Z

ipa-core/src/helpers/gateway/mod.rs

@@ -84,6 +85,10 @@ pub struct GatewayConfig {
    /// payload may not be exactly this, but it will be the closest multiple of record size to this
    /// number. For instance, having 14 bytes records and batch size of 4096 will result in
    /// 4088 bytes being sent in a batch.
+    ///
+    /// The actual size for read chunks may be bigger or smaller, depending on the record size
+    /// sent through each channel. Read size will be aligned with [`Self::active_work`] value to


I suggest rewriting this entire comment. It feels out of date.

line 85 has a typo "side" instead of "size".

line 86 refers to "batch size" but this comment is above "read_size".

I think the example is wrong, because 4088 bytes is not a power of two multiple of 14. If I'm not wrong, the best you can do is now 14 * 2^8 < 4096 (it's 3,584).

benjaminsavage · 2024-10-04T04:47:06Z

ipa-core/src/helpers/gateway/send.rs

+            // That is not what we want here: if read_size / record_size is a power
+            // of two, then subsequent division will get us further away from desired target.


I am so confused by this comment. Didn't you just say you do want the multiplier to be a power of two? But here you're saying it shouldn't be?

yea that was my poor attempt to explain the algorithm, but I think bitshift approach makes it very straightforward to understand what is going on , so I removed this comment entirely

benjaminsavage · 2024-10-04T04:47:31Z

ipa-core/src/helpers/gateway/send.rs

+            // That is not what we want here: if read_size / record_size is a power
+            // of two, then subsequent division will get us further away from desired target.
+            // For example: if read_size / record_size = 4, then prev_power_of_two = 2.
+            // In such cases, we want to stay where we are, so we add +1 for that.


What do you mean "stay where we are"?

it is all gone now as logic to compute the previous power of two is simplified

benjaminsavage · 2024-10-04T04:51:36Z

ipa-core/src/helpers/gateway/send.rs

+            let prev_power_of_two =
+                (gateway_config.read_size.get() / record_size + 1).next_power_of_two() / 2;
+            std::cmp::max(1, prev_power_of_two)
+        };


This feels like a really complicated way to get the previous power of two. Isn't it easier to just zero-out all but the most significant bit?

yea that would be easier. Thanks for the feedback!

When using more than one thread, this test was failing if futures were scheduled out of order, because `Notify` couldn't wake up futures scheduled after `notify_all` call. Using barriers solves the issue

Using bitshift turns out to be much easier to understand

There is no reason to do that.

benjaminsavage · 2024-10-07T05:00:18Z

ipa-core/src/helpers/gateway/send.rs

-        Self {
+        // define read size as a multiplier of record size. The multiplier must be
+        // a power of two to align perfectly with total capacity. We don't want to exceed
+        // the target read size, so we pick a power of two <= read_size.


Something still doesn't make sense here. The power of two is not the same thing as read_size. We might pick 2^8 or something, but 8 isn't comparable with "read_size".

benjaminsavage · 2024-10-07T05:08:09Z

ipa-core/src/helpers/gateway/send.rs

+        let read_size_multiplier =
+            // this computes the highest power of 2 less than or equal to read_size / record_size.
+            // Note, that if record_size is greater than read_size, we round it to 1
+            1 << (std::cmp::max(1, usize::BITS - (gateway_config.read_size.get() / record_size).leading_zeros()) - 1);


I believe this statement is correct, but it's really hard to read. Can you split it up across a few lines for clarity?

let target_multiplier = gateway_config.read_size.get() / record_size; let num_bits = usize::BITS - target_multiplier.leading_zeros(); 1 << (std::cmp::max(1, num_bits) - 1);

benjaminsavage · 2024-10-07T05:09:39Z

ipa-core/src/helpers/gateway/send.rs

+        assert!(this.total_capacity.get() >= record_size * gateway_config.active.get());
+        assert_eq!(0, this.total_capacity.get() % this.read_size.get());


Little comments here (with link to issues) would be helpful for readers to understand the necessity of these conditions.

Follow up on feedback from #1332

akoshelev added 3 commits October 2, 2024 15:44

Fix batch size to be a power of two in tests

46087f9

akoshelev requested a review from andyleiserson October 2, 2024 23:09

akoshelev changed the title ~~Fix send buffer misalignment~~ Fix send buffer misalignment [No 2] Oct 2, 2024

Fix quicksort batch size and align it with a power of two

831986c

Fix oneshot bench

27f65d0

akoshelev commented Oct 3, 2024

View reviewed changes

Merge from main

32bf1bb

benjaminsavage approved these changes Oct 4, 2024

View reviewed changes

akoshelev added 2 commits October 4, 2024 13:50

Fix shuttle tests

52fd077

Fix flaky send_recv_randomized test

279c306

When using more than one thread, this test was failing if futures were scheduled out of order, because `Notify` couldn't wake up futures scheduled after `notify_all` call. Using barriers solves the issue

akoshelev force-pushed the send-buffer-another-attempt branch from 5dd1de2 to 279c306 Compare October 4, 2024 20:54

akoshelev added 4 commits October 4, 2024 14:59

Change how we compute the previous power of two.

5e22824

Using bitshift turns out to be much easier to understand

Change comment on read_size

4aeb686

Don't run send config tests under Shuttle

ac31cbb

There is no reason to do that.

Don't run non zero power of two tests under Shuttle

34838fe

There is no reason to do that.

benjaminsavage reviewed Oct 7, 2024

View reviewed changes

benjaminsavage approved these changes Oct 7, 2024

View reviewed changes

benjaminsavage merged commit 57951b0 into private-attribution:main Oct 7, 2024
12 checks passed

akoshelev added a commit to akoshelev/raw-ipa that referenced this pull request Oct 7, 2024

Follow up on feedback from private-attribution#1332

bdc5eb5

benjaminsavage added a commit that referenced this pull request Oct 8, 2024

Merge pull request #1337 from akoshelev/send-buffer-followup

b97a908

Follow up on feedback from #1332

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix send buffer misalignment [No 2] #1332

Fix send buffer misalignment [No 2] #1332

akoshelev commented Oct 2, 2024

akoshelev commented Oct 2, 2024

akoshelev commented Oct 2, 2024 •

edited

Loading

codecov bot commented Oct 3, 2024 •

edited

Loading

akoshelev Oct 3, 2024

benjaminsavage Oct 4, 2024

akoshelev Oct 4, 2024

benjaminsavage left a comment

benjaminsavage Oct 4, 2024

benjaminsavage Oct 4, 2024

akoshelev Oct 4, 2024

benjaminsavage Oct 4, 2024

akoshelev Oct 4, 2024

benjaminsavage Oct 4, 2024

akoshelev Oct 4, 2024

benjaminsavage Oct 7, 2024

benjaminsavage Oct 7, 2024

benjaminsavage Oct 7, 2024

		// That is not what we want here: if read_size / record_size is a power
		// of two, then subsequent division will get us further away from desired target.

		assert!(this.total_capacity.get() >= record_size * gateway_config.active.get());
		assert_eq!(0, this.total_capacity.get() % this.read_size.get());

Fix send buffer misalignment [No 2] #1332

Fix send buffer misalignment [No 2] #1332

Conversation

akoshelev commented Oct 2, 2024

Overview

What is included in this change?

akoshelev commented Oct 2, 2024

akoshelev commented Oct 2, 2024 • edited Loading

codecov bot commented Oct 3, 2024 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

benjaminsavage left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

akoshelev commented Oct 2, 2024 •

edited

Loading

codecov bot commented Oct 3, 2024 •

edited

Loading