
Commit

fix: add some docs for actually listing the data
Alan Shaw committed Dec 13, 2023
1 parent 6e001af commit bbe9393
Showing 5 changed files with 94 additions and 103 deletions.
4 changes: 2 additions & 2 deletions src/pages/docs/concepts/content-addressing.md
@@ -4,9 +4,9 @@ web3.storage's decentralized file storage relies on _content addressing_ to find

## The basic problem

Consider what happens when you resolve a link like web3.storage/docs/concepts/content-addressing. First, your operating system queries a global shared key-value store, split into many domains — you may know this as the Domain Name System (DNS). The DNS returns an IP address that your network card can use to send HTTP requests over the network, where this site's naming conventions turn the key /concepts/content-addressing into a response payload.
Consider what happens when you resolve a link like `web3.storage/docs/concepts/content-addressing`. First, your operating system queries a global shared key-value store, split into many domains — you may know this as the Domain Name System (DNS). The DNS returns an IP address that your network card can use to send HTTP requests over the network, where this site's naming conventions turn the key `/docs/concepts/content-addressing` into a response payload.

The problem is, components of an address like web3.storage/docs/concepts/content-addressing are _mutable_, meaning they can change over time. In the context of the web, where _everything_ is mutable and dynamic, this is just the way it's always been. As a result, [link rot](https://en.wikipedia-on-ipfs.org/wiki/Link_rot) is just something we've all learned to live with.
The problem is, components of an address like `web3.storage/docs/concepts/content-addressing` are _mutable_, meaning they can change over time. In the context of the web, where _everything_ is mutable and dynamic, this is just the way it's always been. As a result, [link rot](https://en.wikipedia-on-ipfs.org/wiki/Link_rot) is just something we've all learned to live with.

## CIDs: Location-independent, globally unique keys

49 changes: 27 additions & 22 deletions src/pages/docs/concepts/ucans-and-web3storage.md
@@ -26,17 +26,18 @@ The delegation from a Space to your Agent that w3up-client needs can be passed e

### Delegation to other actors

Just like Spaces can delegate permissions to Agents you own, you can also delegate permissions to other actors' Agents. One common application of this could be you delegating permission to upload to your Space to your users. Here's a code snippet demonstrating this from the Upload section:
Just like Spaces can delegate permissions to Agents you own, you can also delegate permissions to other actors' Agents. One common application of this could be you delegating permission to upload to your Space to your users. Here's a code snippet demonstrating this:

**Backend**

```js
import { CarReader } from '@ipld/car'
import * as DID from '@ipld/dag-ucan/did'
import * as Delegation from '@ucanto/core/delegation'
import { importDAG } from '@ucanto/core/delegation'
import * as Signer from '@ucanto/principal/ed25519'
import * as Client from '@web3-storage/w3up-client'

async function backend(did: string) {
async function backend(did) {
// Load client with specific private key
const principal = Signer.parse(process.env.KEY)
const client = await Client.create({ principal })
@@ -49,10 +50,8 @@ async function backend(did: string) {
// Create a delegation for a specific DID
const audience = DID.parse(did)
const abilities = ['store/add', 'upload/add']
const expiration = Math.floor(Date.now() / 1000) + 60 * 60 * 24 // 24 hours from now
const delegation = await client.createDelegation(audience, abilities, {
expiration,
})
const expiration = Math.floor(Date.now() / 1000) + (60 * 60 * 24) // 24 hours from now
const delegation = await client.createDelegation(audience, abilities, { expiration })

// Serialize the delegation and send it to the client
const archive = await delegation.archive()
@@ -66,22 +65,34 @@ async function parseProof(data) {
for await (const block of reader.blocks()) {
blocks.push(block)
}
return importDAG(blocks)
return Delegation.importDAG(blocks)
}
```

When the `backend` function is called in the developer's backend:
- It's passed the DID of the user's Agent
- Backend client initializes with an Agent that has permission to the developer's Space
- It then generates a UCAN delegated to the user Agent DID passed in with only the `store/add` and `upload/add` abilities (to give the user ability to upload) and set to expire in 24 hours
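The 24-hour expiry above is a Unix timestamp in seconds, not the milliseconds that `Date.now()` returns. A minimal sketch of that arithmetic, where `expiresIn` and `SECONDS_PER_DAY` are hypothetical names used only for illustration and are not part of w3up-client:

```js
// Hypothetical helper: not part of w3up-client, shown only to illustrate the arithmetic.
const SECONDS_PER_DAY = 60 * 60 * 24

/**
 * Returns an expiration timestamp in seconds since the Unix epoch.
 * @param {number} seconds Lifetime of the delegation in seconds
 * @param {number} [nowMs] Current time in milliseconds (defaults to Date.now())
 */
function expiresIn(seconds, nowMs = Date.now()) {
  // Date.now() is in milliseconds, so convert to whole seconds first.
  return Math.floor(nowMs / 1000) + seconds
}
```

With this helper, the `expiration` passed to `client.createDelegation` in the backend snippet would be `expiresIn(SECONDS_PER_DAY)`.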

**Frontend**

```js
import * as Delegation from '@ucanto/core/delegation'
import * as Client from '@web3-storage/w3up-client'

async function frontend() {
// Create a new client
const client = await Client.create()

// Fetch the delegation from the backend
const apiUrl = `/api/w3up-delegation/${client.agent().did()}` // backend method is exposed at this API URL
const apiUrl = `/api/w3up-delegation/${client.agent().did()}`
const response = await fetch(apiUrl)
const data = await response.arrayBuffer()

// Deserialize the delegation
const delegation = await Delegation.extract(new Uint8Array(data))
if (!delegation.ok) {
throw new Error('Failed to extract delegation')
throw new Error('Failed to extract delegation', { cause: delegation.error })
}

// Add proof that this agent has been delegated capabilities on the space
@@ -92,16 +103,10 @@ async function frontend() {
}
```

You can see the following flow:

- When `backend` function is called in the developer's backend:
- It's passed the DID of the user's Agent
- Backend client initializes with an Agent that has permission to the developer's Space
- It then generates a UCAN delegated to the user Agent DID passed in with only the `store/add` and `upload/add` abilities (to give the user ability to upload) and set to expire in 24 hours
- When `frontend` function is called in the user's environment:
- An Agent DID is created
- The `backend` function hosted at an API endpoint is called, passing in the Agent DID
- The client is set up with a UCAN delegating upload capabilities to the Agent
- It's now ready to upload!
When the `frontend` function is called in the user's environment:
- An Agent DID is created
- The `backend` function hosted at an API endpoint is called, passing in the Agent DID
- The client is set up with a UCAN delegating upload capabilities to the Agent
- It's now ready to upload!

However, there are other interesting possibilities. For instance, you could create an app where your users make Spaces and delegate permission to your app to read their uploads. Read the [Architecture options](/docs/concepts/architecture-options/) section to explore more.
22 changes: 0 additions & 22 deletions src/pages/docs/how-to/list.md

This file was deleted.

48 changes: 48 additions & 0 deletions src/pages/docs/how-to/list.mdx
@@ -0,0 +1,48 @@
import { Callout } from 'nextra/components'

# How to list files uploaded to web3.storage

Once you've stored some files using web3.storage, you'll want to see a list of what you've uploaded. This how-to guide covers the two ways you can do this:

- Programmatically using the JS client or CLI
- Using the web3.storage console

## Using the JS client or CLI

You can list your uploads from your code using the web3.storage JavaScript client, or from the command line using the CLI. This section walks through fetching a complete listing of all the data you've uploaded using web3.storage.

For instructions on how to set up your client instance or CLI, check out the [Upload](/docs/how-to/upload/) section.

As with other developer object storage solutions, listings currently cannot be sorted or queried by timestamp. This keeps the service scalable.

- Client: `client.capability.upload.list({ cursor: '', size: 25 })`
- CLI: `w3 ls`

In the client the listing is paginated. The result contains a `cursor` that can be used to continue listing uploads. Pass the `cursor` in the result as an _option_ to the next call to receive the next page of results. The `size` option allows you to change the number of items that are returned per page.
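The pagination described above can be sketched as a small loop. This is an illustration under assumptions, not the library's own helper: `listAll` is a hypothetical name, and the page shape (a `results` array plus an optional `cursor`) is assumed from the description above. The list function is passed in as a parameter so that the block runs without any dependencies.

```js
/**
 * Hypothetical helper: collects every item from a cursor-paginated listing.
 * `listPage` is any function that accepts `{ cursor, size }` and resolves to
 * an object with a `results` array and an optional `cursor` (assumed shape).
 */
async function listAll(listPage, size = 25) {
  const items = []
  let cursor = ''
  for (;;) {
    const page = await listPage({ cursor, size })
    items.push(...page.results)
    // Stop when there is no cursor to continue from, or the page is empty.
    if (!page.cursor || page.results.length === 0) break
    cursor = page.cursor
  }
  return items
}
```

Assuming a configured client, a call might look like `await listAll((options) => client.capability.upload.list(options))`. The same loop should work for any listing that follows this cursor convention.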

In the CLI, you can use the `--shards` option to print, for each upload, the list of shards (CAR CIDs) that contain the uploaded data. You can learn about the relationship between uploads and shards in the [Upload vs. Store](/docs/concepts/upload-vs-store/) section.

<Callout type="info">
The `w3 ls` command automatically pages through the listing and prints the results.
</Callout>

### Listing shards

Each upload consists of one or more shards. You can get a list of all shard CIDs in a Space, or look up the shard CIDs for an individual upload.

- Client: `client.capability.store.list({ cursor: '', size: 25 })`
- CLI: `w3 can store ls --cursor "" --size 25`

The listings are paginated. The result contains a `cursor` that can be used to continue listing shards. Pass the `cursor` in the result as an _option_ to the next call to receive the next page of results. The `size` option allows you to change the number of items that are returned per page.

A list of shards for a given upload can be retrieved like this:

- Client: `client.capability.upload.get(contentCID)`

You can learn about the relationship between uploads and shards in the [Upload vs. Store](/docs/concepts/upload-vs-store/) section.

## Using the console web UI

You can see a list of everything you've uploaded to web3.storage in the [console](https://console.web3.storage) web app. If you don't need to work with this list programmatically, using the website may be a simpler choice.

This console provides a convenient overview of your stored data, including links to view your files in your browser via an [IPFS gateway](https://docs.ipfs.io/concepts/ipfs-gateway/) and information about how the data is being stored on the decentralized storage networks that web3.storage uses under the hood.
74 changes: 17 additions & 57 deletions src/pages/docs/how-to/remove.mdx
@@ -20,79 +20,39 @@ Note that there is a minimum 30 day retention period for uploaded data, and even

web3.storage tracks two different things for its users to support content addressing. These concepts were first introduced in the Upload section:

- Content CIDs: The CIDs used to reference and access uploads in the format generally useful to users (e.g., files, directories). These CIDs are usually prefixed by `bafy…` or `bafk…`.
- Shard CIDs: The CID of the serialized shards of data itself (CAR files) that are produced client-side, sent to web3.storage, and stored. These CIDs are prefixed by `bag…`.
- **Content CIDs**: The CIDs used to reference and access uploads in the format generally useful to users (e.g., files, directories). These CIDs are usually prefixed by `bafy…` or `bafk…`.
- **Shard CIDs**: The CID of the serialized shards of data itself (CAR files) that are produced client-side, sent to web3.storage, and stored. These CIDs are prefixed by `bag…`.

web3.storage tracks usage for payment (i.e., how much storage is utilized by a user) using the volume of data associated with shard CIDs. However, in general, most users will be interacting with content CIDs (this is how you fetch your data from the network), with shard CIDs more of an implementation detail (how data gets chunked, serialized into CAR files, and stored for uploads).

Fortunately, this shouldn't make things any more complicated - we go into more detail below, but in general, when you remove a content CID from your account, you'll want to remove the shard CIDs as well (e.g., in the client calling `client.remove(contentCID, { shards: true })`).

However, if you are a power user interacting with shard CIDs as well (e.g., using the client's `capability.store.*` or CLI's `w3 can store *` methods), then you need to be more cautious about removing shard CIDs from your account. (This is why the default for the client and CLI methods is for shards to be maintained after removing a content CID). You can read more about why you might want to interact with shard CIDs directly and the implications in the Upload vs. Store section.

## Using the client or CLI
## Using the JS client or CLI

If you followed the Upload section, you should already have your client or CLI set up with an Agent for your Space. From there, to remove a content CID from your account, you'll generally be using:

- Client: `client.remove(contentCID)`
- CLI: `w3 rm <contentCID>`

If you initially uploaded your content by using the recommended upload methods (e.g., used `Client.upload()` or `w3 up`) and didn't interact with CAR shards at all when uploading, we recommend removing the shard CIDs associated with the content CID from your account. Otherwise, you will still be paying for the data stored with web3.storage (as mentioned above). The easiest way to do that is to set the `shards` parameter as `True`:

- Client: `client.remove(contentCID, shards=True)`
- CLI: `w3 rm <contentCID> --shards` in the CLI

A full example of this is:

```javascript
import * as Client from '@web3-storage/w3up-client'
import * as Signer from '@ucanto/principal/ed25519' // Agents on Node should use Ed25519 keys

const principal = Signer.parse(process.env.KEY) // Agent private key
const client = await Client.create({ principal })

async function main () {
// Load client with specific private key
const principal = Signer.parse(process.env.KEY)
const client = await Client.create({ principal })

// Add proof that this agent has been delegated capabilities on the space
const proof = await parseProof(process.env.PROOF)
const space = await client.addSpace(proof)
await client.setCurrentSpace(space.did())

// remove content previously uploaded, including the underlying shards
await client.remove('bafybeidd2gyhagleh47qeg77xqndy2qy3yzn4vkxmk775bg2t5lpuy7pcu', { shards: true })
}

/** @param {string} data Base64 encoded CAR file */
async function parseProof (data) {
const blocks = []
const reader = await CarReader.fromBytes(Buffer.from(data, 'base64'))
for await (const block of reader.blocks()) {
blocks.push(block)
}
return importDAG(blocks)
}

```
If you initially uploaded your content by using the recommended upload methods (e.g., used `Client.upload()` or `w3 up`) and didn't interact with CAR shards at all when uploading, we recommend removing the shard CIDs associated with the content CID from your account. Otherwise, you will still be paying for the data stored with web3.storage (as mentioned above). The easiest way to do that is to set the `shards` option to `true`:

- Client: `client.remove(contentCID, { shards: true })`
- CLI: `w3 rm <contentCID> --shards`

## Removing content CIDs and shard CIDs separately

If you have managed your shard CIDs and upload CIDs separately (e.g., used `client.capability.store.add()` and `client.capability.upload.add()` in the client or `w3 can store add` and `w3 can upload add` in the CLI), you'll want to remove the upload CIDs and underlying shard CIDs separately as well. You can read more about why you might want to interact with shard CIDs directly and the implications in the Upload vs. Store section.
If you have managed your shard CIDs and upload CIDs separately (e.g., used `client.capability.store.add()` and `client.capability.upload.add()` in the client or `w3 can store add` and `w3 can upload add` in the CLI), you might want to remove the upload CIDs and underlying shard CIDs separately as well. You can read more about why you might want to interact with shard CIDs directly and the implications in the Upload vs. Store section.

To remove shard CIDs and upload CIDs separately, you'll generally do this by:

- Client:
- If you registered a content CID you want to remove using `client.capability.upload.add(contentCID)`
- (If you don't know which shard CIDs are associated with the content CID) Run `client.capability.upload.listShards(contentCID)`, which returns a list of shard CIDs
- Remove it using `client.capability.upload.remove(contentCID)`
- Remove the shard CIDs that you'd like to
- For each shard CID, ensure no other uploaded content CIDs share the same shard (otherwise, the other content CIDs will no longer be fetchable)
- Remove the shard CIDs one-by-one using `client.capability.store.remove(shardCID)`
- CLI:
- If you registered a content CID you want to remove using `w3 can upload add <contentCID>`
- (If you don't know which shard CIDs are associated with the content CID) Run `w3 can upload ls <contentCID> --shards`, which returns a list of shard CIDs
- Remove it using `w3 can upload rm <contentCID>`
- Remove the shard CIDs that you'd like to
- For each shard CID, ensure no other uploaded content CIDs share the same shard (otherwise, the other content CIDs will no longer be fetchable)
- Remove the shard CIDs one-by-one using `w3 can store rm <shardCID>`
1. Determine shards to remove (skip this step if you already know!):
- Client: `client.capability.upload.get(contentCID)`
- CLI: `w3 can upload ls <contentCID> --shards`
1. Remove the upload:
- Client: `client.capability.upload.remove(contentCID)`
- CLI: `w3 can upload rm <contentCID>`
1. Remove each of the shards (ensure first that no other content is using that shard!):
- Client: `client.capability.store.remove(shardCID)`
- CLI: `w3 can store rm <shardCID>`
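The check in the last step (that no other content is using a shard) can be sketched as a pure function over an upload listing. This is a minimal sketch under assumptions: `exclusiveShards` is a hypothetical name, and each listing item is assumed to carry a `root` (the content CID) and an optional `shards` array. CIDs are compared by their string form.

```js
/**
 * Hypothetical helper: returns the shards of `targetRoot` that no other
 * upload in `uploads` references, i.e. the ones that are safe to remove.
 * Each item is assumed to look like `{ root, shards }`.
 */
function exclusiveShards(uploads, targetRoot) {
  const target = uploads.find((upload) => String(upload.root) === String(targetRoot))
  if (!target) return []

  // Collect every shard that some other upload still depends on.
  const usedElsewhere = new Set()
  for (const upload of uploads) {
    if (upload === target) continue
    for (const shard of upload.shards ?? []) usedElsewhere.add(String(shard))
  }

  return (target.shards ?? [])
    .map(String)
    .filter((shard) => !usedElsewhere.has(shard))
}
```

The `uploads` argument would need to be the complete listing for the Space (every page, not just the first); otherwise a shard shared with an upload on a later page could be wrongly reported as safe to remove.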
