Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix batch_operate for multi-node cluster #136

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

opti
Copy link
Contributor

@opti opti commented Oct 24, 2024

What

That PR fixes an issue of missing records when the client.batch_operate is used with multiple BatchRead on a multi-node setup.

Why

With 2+ nodes, batch_operate with more than one BatchRead record will only update records with results from the first node. Or it updates records with wrong results (from other records).

To demonstrate, the following existing specs will fail when run on a multi-node cluster, but they're OK on a single-node run:

Failures:

  1) Aerospike::Client#batch_operate #BatchRead returns bins specified with operations
     Failure/Error: expect(records[1].result_code).to eql(0)

       expected: 0
            got: -19

       (compared using eql?)
     # ./spec/aerospike/batch_operate_spec.rb:76:in `block (4 levels) in <top (required)>'

  2) Aerospike::Client#batch_operate #BatchRead TTL #BatchRead
     Failure/Error: expect(br2.result_code).to eql(Aerospike::ResultCode::OK)

       expected: 0
            got: -19

       (compared using eql?)
     # ./spec/aerospike/batch_operate_spec.rb:132:in `block (4 levels) in <top (required)>'

Finished in 12.33 seconds (files took 0.21945 seconds to load)
13 examples, 2 failures

Failed examples:

rspec ./spec/aerospike/batch_operate_spec.rb:64 # Aerospike::Client#batch_operate #BatchRead returns bins specified with operations
rspec ./spec/aerospike/batch_operate_spec.rb:113 # Aerospike::Client#batch_operate #BatchRead TTL #BatchRead

Notes

However, I'm not entirely sure the fix is correct, since I don't know what was the original purpose of having all records and batch.records available in BatchOperateCommand. I can see both are used to write the data offset.

@records.each do |record|
key = record.key
@data_offset += key.digest.length + 4 # 4 byte batch offset

batch.records.each_with_index do |record, index|

At the same time, BatchOperateCommand is created per each batch node:

batch_nodes.each do |batch|
threads << Thread.new do
command = yield batch.node, batch
execute_command(command)
end
end

So, feedback is appreciated.

But as it stands right now, batch_operate cannot be used to read multiple records in a multi-node setup. It might be also a good idea to run tests in multi-node setup in the github workflow.

@opti opti force-pushed the batch-read-multi-nodes-fix branch from 2878dbf to 8797d05 Compare October 24, 2024 01:53
@opti
Copy link
Contributor Author

opti commented Nov 6, 2024

@khaf could you please take a look at this one when you have a moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant