Option for Receipt Chunking #450
Comments
Just to move some conversation out here from an internal discussion: instead of producing a stream with a seq number, using a CID link for outputs over some threshold lets the recipient tune how they stream in the output. Receipts should never be used for storage, as they can be GCed at any time (i.e. storage is not promised), and the outer shell of receipts should be kept as small as humanly possible so that they can be gossiped.
Going to close this on @expede's comment and open up one around receipt size specifically. The only thing to re-highlight is that storage is not promised in the blockstore or receipt context. Storage is essentially a choice, but pruning/GC will happen at a default clip.
Summary
As @chadkoh pointed out again, we may need to chunk receipts, as outputs can be very large and may not be transmittable over the network in one piece.
Effect(s) to the Rescue?
As @matheus23 keenly noted, do we need this if we push toward using the state effect (with our block store underneath, e.g. #189, CA (content-addressed) I/O) for outputs of a certain maximum length? The answer is we probably don't need chunking then, at least not right away.
☝🏽 Using the state effect opens up the possibility of runners providing the capability to act as a storage provider, i.e. keeping things in the provided block store for some length of time n for reuse, vs. the general rule of cleaning up after a workflow run (or eventually GCing on maxed-out failures). The other option is to use another effect, like an HTTP POST, or some upload to a trusted provider (stored content-addressed) to make the output available elsewhere for some amount of time.
On the most generalized level, we may want to do both, effect-driven vs receipt-only-driven, as the latter can actually be helpful for parallelization tasks. But, with @matheus23's point, this is more of an enhancement at the moment, while we begin work on #189.
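To make the threshold idea above concrete, here is a minimal Rust sketch of swapping a large output for a CID link into a block store. The `Output` enum, the `MAX_INLINE_BYTES` threshold, and the toy hash standing in for a real CID are all assumptions for illustration, not anything spec'ed:

```rust
use std::collections::HashMap;

// Hypothetical: above some byte threshold, a receipt carries a CID link
// into the block store instead of the inlined output bytes.
enum Output {
    Inline(Vec<u8>),
    CidLink(String), // content address into the block store
}

const MAX_INLINE_BYTES: usize = 1024; // assumed threshold, not spec'ed

fn wrap_output(bytes: Vec<u8>, store: &mut HashMap<String, Vec<u8>>) -> Output {
    if bytes.len() <= MAX_INLINE_BYTES {
        Output::Inline(bytes)
    } else {
        // Toy stand-in for a real CID: a hash-derived key into the block store.
        let hash = bytes
            .iter()
            .fold(0u64, |h, b| h.wrapping_mul(31).wrapping_add(*b as u64));
        let cid = format!("cid-{:x}", hash);
        store.insert(cid.clone(), bytes);
        Output::CidLink(cid)
    }
}
```

The receipt shell stays small either way: the recipient resolves the link from the block store (or a storage provider) only when it actually needs the bytes.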
Solution (if chunking)?
The initial idea I had was to incorporate monotonic sequence numbers inside a receipt (currently not spec'ed), along with a total chunk count. A non-chunked receipt always starts at 0. Chunked ones increment the sequence number.
Upon lookup of the instruction CID, if multiple receipts are read, then the output has to be stitched together (by sequence number) to be used as an input to another function. Essentially, HELLO TCP!
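A rough Rust sketch of what that stitching could look like. The `ReceiptChunk` shape and the `seq`/`total`/`out` field names are hypothetical, since none of this is spec'ed yet:

```rust
// Hypothetical chunked-receipt shape: `seq` and `total` are not spec'ed fields.
struct ReceiptChunk {
    instruction_cid: String, // same for every chunk of one instruction
    seq: u32,                // monotonic sequence number, 0-based
    total: u32,              // total number of chunks
    out: Vec<u8>,            // this chunk's slice of the output
}

/// Reassemble an output from chunked receipts, TCP-style:
/// sort by `seq`, verify completeness, then concatenate.
fn stitch(mut chunks: Vec<ReceiptChunk>) -> Option<Vec<u8>> {
    chunks.sort_by_key(|c| c.seq);
    let total = chunks.first()?.total as usize;
    if chunks.len() != total {
        return None; // missing or duplicate chunks
    }
    for (i, c) in chunks.iter().enumerate() {
        if c.seq as usize != i {
            return None; // gap or repeat in the sequence
        }
    }
    Some(chunks.into_iter().flat_map(|c| c.out).collect())
}
```

As with TCP reassembly, a consumer would refuse to use a partial output as an input to another function until every sequence number up to `total` has arrived.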
Components
- `out` byte size is greater than a transmit maximum (not generally configurable). We'll probably only do this on byte buffers vs. other types. If it's maxed out, generate multiple receipts covering the spliced points of the output, chunked into even sizes (as far as possible, for reuse).
- `ran` CIDs should be the same for all these receipts. The receipt CIDs will be different, respectively.
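A small Rust sketch of the even-size splitting described above; `split_even` and its `max_chunk` parameter are hypothetical helpers, not part of any spec:

```rust
/// Split an output into roughly even chunks, each at most `max_chunk`
/// bytes, so chunks can be transmitted and reused independently.
fn split_even(bytes: &[u8], max_chunk: usize) -> Vec<Vec<u8>> {
    if bytes.is_empty() {
        return vec![]; // nothing to chunk
    }
    // Minimum number of chunks needed, then spread bytes evenly across them.
    let n = (bytes.len() + max_chunk - 1) / max_chunk;
    let base = bytes.len() / n;
    let rem = bytes.len() % n;
    let mut chunks = Vec::with_capacity(n);
    let mut start = 0;
    for k in 0..n {
        // The first `rem` chunks carry one extra byte each.
        let size = base + if k < rem { 1 } else { 0 };
        chunks.push(bytes[start..start + size].to_vec());
        start += size;
    }
    chunks
}
```

Splitting evenly (rather than greedily filling chunks to `max_chunk`) keeps chunk sizes uniform, which helps when chunks are cached and reused independently.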