DOC-287 Mountable TS topics #725

Open
wants to merge 23 commits into
base: v-WIP/24.3
8 changes: 5 additions & 3 deletions antora.yml
@@ -1,6 +1,8 @@
name: ROOT
title: Self-Managed
-version: 24.2
+version: 24.3
+display_version: '24.3 Beta'
+prerelease: true
start_page: home:index.adoc
nav:
- modules/ROOT/nav.adoc
@@ -15,11 +17,11 @@ asciidoc:
# Fallback versions
# We try to fetch the latest from GitHub at build time
# --
-full-version: 24.2.2
+full-version: 24.3.1
latest-release-commit: '72ba3d3'
latest-operator-version: 'v2.2.0-24.2.2'
latest-redpanda-helm-chart-version: 5.8.3
-redpanda-beta-version: 24.2.2-rc5
+redpanda-beta-version: 24.3.1-rc1
# --
supported-kubernetes-version: 1.21
supported-helm-version: 3.10.0
2 changes: 1 addition & 1 deletion local-antora-playbook.yml
@@ -15,7 +15,7 @@ content:
- url: .
branches: HEAD
- url: https://github.com/redpanda-data/docs
-branches: [v/*, api, shared, site-search,'!v-end-of-life/*']
+branches: [main,v/*, api, shared, site-search,'!v-end-of-life/*']
- url: https://github.com/redpanda-data/cloud-docs
branches: main
- url: https://github.com/redpanda-data/redpanda-labs
1 change: 1 addition & 0 deletions modules/ROOT/nav.adoc
@@ -153,6 +153,7 @@
** xref:manage:tiered-storage-linux/index.adoc[Tiered Storage]
*** xref:manage:tiered-storage.adoc[]
*** xref:manage:fast-commission-decommission.adoc[]
+*** xref:manage:mountable-topics.adoc[]
*** xref:manage:remote-read-replicas.adoc[Remote Read Replicas]
*** xref:manage:topic-recovery.adoc[Topic Recovery]
*** xref:manage:whole-cluster-restore.adoc[Whole Cluster Restore]
7 changes: 7 additions & 0 deletions modules/manage/pages/mountable-topics.adoc
@@ -0,0 +1,7 @@
= Mountable Topics
:description: Safely attach and detach Tiered Storage topics to and from a Redpanda cluster.
:page-context-links: [{"name": "Linux", "to": "manage:mountable-topics.adoc" } ]
:page-categories: Management
:env-linux: true

include::manage:partial$mountable-topics.adoc[]
230 changes: 230 additions & 0 deletions modules/manage/partials/mountable-topics.adoc
@@ -0,0 +1,230 @@
For topics with Tiered Storage enabled, you can unmount a topic to safely detach it from a cluster and keep the topic data in the cluster's object storage bucket or container. You can mount the detached topic to either the same origin cluster, or a different one. This allows you to hibernate a topic and free up system resources taken up by the topic, or migrate a topic to a different cluster.

== Prerequisites

. xref:get-started:rpk-install.adoc[Install `rpk`], or ensure that you have access to the Admin API.
kbatuigas marked this conversation as resolved.

This is not really true: in cloud, the `rpk` command will not use the Admin API but rather the cloud public API (these APIs are still being added, but will be available soon).

Will there be a page in cloud docs as well? There it will be RPK or the Cloud API

Contributor Author

@mattschumpert Yes, we'll have a page in the cloud docs. We'll share a lot of the content between the two docs but in cloud it'll show cloud API commands.

. Enable xref:manage:tiered-storage.adoc[Tiered Storage] for specific topics, or for the entire cluster (all topics).

Contributor Author

Are there any limitations that users should know about? For example, number of topics you can include in a migration? Amount of time that a topic can "hibernate" in object storage until it is mounted (although maybe that's more to do with their object storage configuration)?

I'm not aware of any limitations like this. I've successfully run it with as many topics and partitions as I could create without problems.

Contributor Author

@bashtanov I saw in the rpk PR that a topic must have at least 3 partitions for it to be mounted from TS to a cluster, does that apply anywhere in this doc as well?

I never enforced or even heard about this restriction

Contributor

@gene-redpanda, Oct 14, 2024

I saw that in the RFC for migrations. I have no idea if it is actually true, as the testing at the time I wrote that was against a version of migrations that wasn't actually working.

== Unmount a topic from a cluster to object storage

When you unmount a topic, all incoming writes to the topic are blocked as Redpanda unmounts the topic from the cluster to object storage. Producers and consumers of the topic receive a warning `Failed to download manifest for topic` indicating that the topic is no longer available.

@bashtanov, Oct 18, 2024

Could you change it to a lowercase "f" in "failed", please? I would imagine users running grep, which is case-sensitive by default, to search for the line.

And change "indicating that the topic is no longer available" to "indicating that either the topic is unavailable or there are multiple topics under the specified name" or something like this. It will be clear from the rest of the message which one is the case.

Contributor Author

@bashtanov yes, thank you!

Contributor Author

@bashtanov actually, would "multiple topics under the specified name" ever apply in the case of unmount?

I mixed everything up. This error (`failed to download manifest for topic`) applies to mounting, when the topic is not available or cannot be unambiguously identified. It appears in the logs.

As for producing to a topic that is about to be unmounted, clients get `invalid_topic_exception` or `resource_is_being_migrated`. When fetching from a not-yet-ready topic, they also get `invalid_topic_exception`. These appear in the protocol replies.


An unmounted topic in object storage is detached from all clusters. The original cluster releases ownership of the topic.

NOTE: The unmounted topic is deleted in the source cluster, but can be mounted back again from object storage.

[tabs]
======
rpk::
+
--
In your cluster, run this command to unmount a topic to object storage:

```
rpk cluster storage unmount <namespace>/<topic-name>
```
--
Admin API::
+
--
To unmount topics from a cluster using the Admin API, make a POST request to the `/v1/topics/unmount` endpoint. Specify the names of the desired topics in the request body:

```
curl -X POST http://localhost:9644/v1/topics/unmount -d '{
  "topics": [
    {
      "topic": "<topic-1-name>"
    },
    {
      "topic": "<topic-2-name>"
    },
    {
      "topic": "<topic-3-name>"
    }
  ]
}'
```

You may optionally include the topic namespace (`ns`). Only `kafka` is supported.
--
======

You can use the ID returned by the command to <<monitor-progress,monitor the progress>> of the unmount operation using `rpk` or the Admin API.
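
When unmounting several topics at once, it can help to generate the request body instead of writing it by hand. The following is a minimal sketch, assuming the request shape shown in the Admin API example above. `build_unmount_payload` is a hypothetical helper name, not part of `rpk` or Redpanda, and it performs no JSON escaping, on the assumption that topic names contain only letters, digits, `.`, `_`, and `-`.

```shell
# Hypothetical helper: prints a JSON body for POST /v1/topics/unmount.
# Each argument is one topic name. No JSON escaping is performed.
build_unmount_payload() {
  up_sep=""
  printf '{"topics":['
  for up_name in "$@"; do
    # Emit a comma before every entry except the first.
    printf '%s{"topic":"%s"}' "$up_sep" "$up_name"
    up_sep=","
  done
  printf ']}\n'
}

# Possible usage (localhost:9644 is the address used in the examples above):
# curl -X POST http://localhost:9644/v1/topics/unmount \
#   -d "$(build_unmount_payload orders payments)"
```

The topic names `orders` and `payments` in the usage comment are placeholders.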

== Mount a topic to a cluster

[tabs]
======
rpk::
+
--
In your target cluster, run this command to mount a topic from object storage:

```
rpk cluster storage mount <source-topic-name>
```

You can also rename the topic as you mount it to the target cluster:

```
rpk cluster storage mount <namespace>/<source-topic-name> --to <namespace>/<new-topic-name>
```
--
Admin API::
+
--
To mount topics to a target cluster using the Admin API, make a POST request to the `/v1/topics/mount` endpoint. Specify the names of the topics in the request body:

```
curl -X POST http://localhost:9644/v1/topics/mount -d '{
  "topics": [
    {
      "source_topic": {"ns": "kafka", "topic": "<source-topic-1-name>"},
      "alias": {"ns": "kafka", "topic": "<new-topic-1-name>"}
    },
    {
      "source_topic": {"ns": "kafka", "topic": "<source-topic-2-name>"}
    },
    {
      "source_topic": {"ns": "kafka", "topic": "<source-topic-3-name>"},
      "alias": {"ns": "kafka", "topic": "<new-topic-3-name>"}
    }
  ]
}'
```

* `ns` is the topic namespace. This field is optional and only `kafka` is supported.
* To rename a topic in the target cluster, use the optional `alias` object in the request body. In the example, topics 1 and 3 are given new names in the target cluster, while topic 2 retains its original name.

--

======

You can use the ID returned by the command to <<monitor-progress,monitor the progress>> of the mount operation using `rpk` or the Admin API.

When the mount operation is complete, the target cluster handles produce and consume workloads for the topics.
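
The mount request body, with its optional `alias` object, can be generated in a similar way. This is a sketch under stated assumptions: `build_mount_payload` is a hypothetical helper, and the `<source>=<new-name>` argument convention is invented here for illustration. It relies on `=` not being a valid character in topic names and always sets `ns` to `kafka`, the only namespace described above.

```shell
# Hypothetical helper: prints a JSON body for POST /v1/topics/mount.
# Each argument is "<source-topic>" or "<source-topic>=<new-name>".
build_mount_payload() {
  mp_sep=""
  printf '{"topics":['
  for mp_arg in "$@"; do
    # Everything before the first "=" is the source topic name.
    mp_src=${mp_arg%%=*}
    printf '%s{"source_topic":{"ns":"kafka","topic":"%s"}' "$mp_sep" "$mp_src"
    case $mp_arg in
      # Add the alias object only when a new name was supplied.
      *=*) printf ',"alias":{"ns":"kafka","topic":"%s"}' "${mp_arg#*=}" ;;
    esac
    printf '}'
    mp_sep=","
  done
  printf ']}\n'
}

# Possible usage:
# curl -X POST http://localhost:9644/v1/topics/mount \
#   -d "$(build_mount_payload orders=orders-v2 payments)"
```

In the usage comment, `orders` would be mounted as `orders-v2`, while `payments` would keep its original name.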

== Monitor progress

[tabs]
======
rpk::
+
--
To list active mount and unmount operations, run the command:

```
rpk cluster storage list-mount
```
--

Admin API::
+
--
Issue a GET request to the `/v1/migrations` endpoint to view the status of topic mount and unmount operations:

```
curl http://localhost:9644/v1/migrations
```
--
======

You can also retrieve the status of a specific operation by running the command:


[tabs]
======
rpk::
+
--
```
rpk cluster storage status-mount <migration-id>
```
--
Admin API::
+
--
```
curl http://localhost:9644/v1/migrations/<migration-id>
```
--
======

The response returns the IDs and state of existing mount and unmount operations ("migrations"):

|===
| State | Unmount operation (outbound) | Mount operation (inbound)

| `planned`
2+| Redpanda validates the mount or unmount operation definition.

| `preparing`
| Redpanda flushes topic data, including topic manifests, to the destination bucket or container in object storage.
| Redpanda recreates the topics in a disabled state in the target cluster. The cluster allocates partitions but does not add log segments yet. Topic metadata is populated from the topic manifests found in object storage.

| `prepared`
| The operation is ready to execute. In this state, the cluster still accepts client reads and writes for the topics.
| Topics exist in the cluster but clients do not yet have access to consume or produce.

| `executing`
| The cluster rejects client reads and writes for the topics. Redpanda uploads any remaining topic data that has not yet been copied to object storage. Uncommitted transactions involving the topic are aborted.

@mmaslankaprv In the case of TS topics (where we are also just uploading data not yet uploaded), how do we differentiate this from the first step?

Contributor Author

@mattschumpert @mmaslankaprv Is this "remaining data" that got in between when the unmount was started, until this point?

@kbatuigas Yes, or rather between `preparing` and here. Originally, when this was designed for quick unmount+mount, the goal was to minimize topic downtime.

| The target cluster checks that the topic to be mounted has not already been mounted in any cluster.

| `executed`
| All unmounted topic data from the cluster is available in object storage.
Contributor

Is there meant to be a description of the mounted topic as well?

Contributor Author

@asimms41 There's not a lot to describe. How about "The target cluster has verified that the topic has not already been mounted."?

Contributor

Do we consider "data" singular or plural?

| The target cluster has verified that the topic has not already been mounted.

| `cut_over`
| Redpanda deletes topic metadata from the cluster, and marks the data in object storage as available for mount operations.
| The topic data in object storage is no longer available to mount to any clusters.

| `finished`
| The operation is complete.
| The operation is complete. The target cluster starts to handle produce and consume workloads.

| `canceling`
2+| Redpanda is in the process of canceling the mount or unmount operation.

| `cancelled`
2+| The mount or unmount operation is cancelled.

|===
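
When scripting around these states, one option is to poll until the operation reaches a state that looks terminal. The sketch below treats `finished` and `cancelled` as terminal, which is a reading of the table above rather than a documented guarantee. `is_terminal_state` and `wait_for_migration` are hypothetical helper names. The caller supplies the command that prints the current state, because the output format of `rpk cluster storage status-mount` and of the Admin API response is not shown here.

```shell
# Hypothetical helper: succeeds if the state is assumed to be terminal.
is_terminal_state() {
  case $1 in
    finished|cancelled) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: runs a caller-supplied status command every
# <interval> seconds until it prints a terminal state.
# Usage: wait_for_migration <interval> <status-command> [args...]
wait_for_migration() {
  wm_interval=$1
  shift
  while :; do
    # The status command must print exactly one state name.
    wm_state=$("$@") || return 1
    printf 'state: %s\n' "$wm_state"
    if is_terminal_state "$wm_state"; then
      return 0
    fi
    sleep "$wm_interval"
  done
}
```

For example, `wait_for_migration 5 get_state <migration-id>` would poll every five seconds, where `get_state` is a function you define to extract the state from whichever interface you use.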

== Cancel a mount or unmount operation

You can cancel a topic mount or unmount by running the command:

[tabs]
======
rpk::
+
--
```
rpk cluster storage cancel-mount <migration-id>
```
--

Admin API::
+
--
```
curl -X POST "http://localhost:9644/v1/migrations/<migration-id>?action=cancel"
```
--
======

`<migration-id>` is the unique identifier of the operation. Redpanda returns this ID when you start a mount or unmount. You can also retrieve the ID by listing <<monitor-progress,existing migrations>>.

kbatuigas marked this conversation as resolved.
You cannot cancel mount and unmount operations in the following <<monitor-progress,states>>:

- `planned` (but you may still xref:api:ROOT:admin-api.adoc#delete-/v1/migrations/-id-[delete] a planned mount or unmount)
- `cut_over`
- `finished`
- `canceling`
- `cancelled`
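
A script can check the current state against this list before attempting a cancel. The following sketch assumes that the remaining states from the table in <<monitor-progress,Monitor progress>> (`preparing`, `prepared`, `executing`, and `executed`) are the cancelable ones. That is an inference from the list above, not a statement from the API reference. `can_cancel` is a hypothetical helper name.

```shell
# Hypothetical helper: classifies a migration state.
# Returns 0 if a cancel is assumed to be allowed, 1 if the state is in
# the non-cancelable list above, and 2 if the state name is unknown.
can_cancel() {
  case $1 in
    planned|cut_over|finished|canceling|cancelled) return 1 ;;
    preparing|prepared|executing|executed) return 0 ;;
    *) return 2 ;;
  esac
}

# Possible usage, with the state obtained elsewhere:
# if can_cancel "$state"; then
#   rpk cluster storage cancel-mount "$migration_id"
# fi
```

Returning a distinct code for unknown state names keeps the script from silently treating a new or misspelled state as cancelable.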

== Additional considerations

Redpanda prevents you from mounting the same topic to multiple clusters at once. This ensures that multiple clusters don't write to the same location in object storage and corrupt the topic.

If you attempt to mount a topic whose name matches a topic that already exists in the target cluster, Redpanda fails the operation and emits a warning message in the logs.
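
To avoid a failed mount caused by a name collision, you could check the target cluster's topic names first. This is a minimal sketch: `name_in_list` is a hypothetical helper that reads topic names from standard input, one per line. How you produce that list (for example, from `rpk topic list`) depends on the output layout of your `rpk` version, which is an assumption left to the caller.

```shell
# Hypothetical helper: succeeds if the given name appears on stdin
# as an exact, whole-line match (-F fixed string, -x whole line, -q quiet).
name_in_list() {
  grep -Fxq -- "$1"
}

# Possible usage, where list_topic_names is a caller-defined function
# that prints one topic name per line:
# if list_topic_names | name_in_list orders; then
#   echo "orders already exists; consider mounting under a new name with --to"
# fi
```

Exact whole-line matching matters here, so that a topic such as `orders-v2` is not mistaken for `orders`.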