DOC-287 Mountable TS topics #725
base: v-WIP/24.3
✅ Deploy Preview for redpanda-docs-preview ready!
. Enable xref:manage:tiered-storage.adoc[Tiered Storage] for specific topics, or for the entire cluster (all topics). | ||
. xref:get-started:rpk-install.adoc[Install `rpk`], or ensure that you have access to the Admin API. | ||
Are there any limitations that users should know about? For example, number of topics you can include in a migration? Amount of time that a topic can "hibernate" in object storage until it is mounted (although maybe that's more to do with their object storage configuration)?
I'm not aware of any limitations like this. I've successfully run it with as many topics and partitions as I could create without problems.
@bashtanov I saw in the rpk PR that a topic must have at least 3 partitions for it to be mounted from TS to a cluster, does that apply anywhere in this doc as well?
I never enforced or even heard about this restriction
I saw that in the RFC for migrations, no idea if it is actually true as the testing at the time I wrote that was against a version of migrations that wasn't actually working.
- `cut_over` | ||
- `finished` | ||
|
||
== Monitor progress |
When or why might users encounter errors? Can and should they retry?
I can think of the following situations:
- attempting to mount a topic that does not exist in the cloud storage
- attempting to mount a topic that is already mounted to this or another cluster
- any failures, such as tiered storage availability problems or multiple redpanda nodes going down.
All operations are retried indefinitely, so it's really unlikely that cancelling and restarting a migration would help. If there is an underlying problem, fixing it should help without restarting.
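Because operations are retried indefinitely, a client would normally wait rather than cancel and restart. The following is a minimal Python sketch of such a wait loop, not Redpanda's API: `get_state` is a hypothetical caller-supplied function that returns the migration's current state string, or `None` when the migration can no longer be found. Only the `finished` state name comes from this PR; treating a missing migration as complete is an assumption based on the note that a finished operation is deleted.

```python
import time
from typing import Callable, Optional


def wait_for_migration(
    get_state: Callable[[], Optional[str]],
    poll_interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the migration finishes; return False on timeout.

    get_state is hypothetical: it should return the current state string,
    or None if the migration can no longer be found.
    """
    waited = 0.0
    while True:
        state = get_state()
        # Assumption: a finished migration is deleted, so "not found"
        # is treated the same as `finished`.
        if state is None or state == "finished":
            return True
        if waited >= timeout_s:
            # Do not cancel and restart here: the operation keeps being
            # retried, so fix the underlying problem and wait again.
            return False
        sleep(poll_interval_s)
        waited += poll_interval_s
```

The `sleep` parameter exists only so the loop can be exercised without real delays.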
ad98b60 to 99d3851
} | ||
``` | ||
|
||
You may optionally include the topic namespace (`ns`). Default value: `kafka` |
Also, it is the only one supported so far.
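As an illustration of the `ns` default, here is a small Python sketch that builds one `source_topic` entry in the shape quoted in this PR. The helper name `source_topic_entry` and the placeholder check are hypothetical; only the `ns` and `topic` keys and the `kafka` default come from the doc under review.

```python
from typing import Dict, Optional

# Per the review, `kafka` is the default and the only namespace supported so far.
DEFAULT_NAMESPACE = "kafka"


def source_topic_entry(topic: str, ns: Optional[str] = None) -> Dict[str, Dict[str, str]]:
    """Build one {"source_topic": {...}} entry, defaulting ns to `kafka`."""
    if not topic or topic.startswith("<"):
        # Guard against an unreplaced placeholder such as "<source-topic-1-name>".
        raise ValueError(f"topic name looks unset: {topic!r}")
    namespace = ns or DEFAULT_NAMESPACE
    if namespace != DEFAULT_NAMESPACE:
        raise ValueError(f"unsupported namespace: {namespace!r}")
    return {"source_topic": {"ns": namespace, "topic": topic}}
```

For example, `source_topic_entry("orders")` returns `{"source_topic": {"ns": "kafka", "topic": "orders"}}`.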
Looks great. Just left one question but happy to approve.
|
||
|=== | ||
|
||
It is not currently possible to unmount a topic whose name matches multiple topics in the origin cluster. |
@mattschumpert I removed the "Troubleshoot" heading and whittled it down to just these few scenarios. From chatting with @bashtanov it sounds like there is additional work to do to enable the user to specify which topic or "incarnation" they want if they try to mount a topic with multiple matches on the name. And also some work to clarify log messages when running into issues. If there are other scenarios that have to be described here please let me know and we can get that into the next iteration of this doc.
Should we ELI5 what exactly is "topic mount vs. topic unmount"? Or do we assume that our users are already familiar with the subject? Something like: |
|
||
== Unmount a topic from a cluster to object storage | ||
|
||
When you unmount a topic, all incoming writes to the topic are blocked as Redpanda unmounts the topic from the cluster to object storage. Producers and consumers of the topic receive an error message indicating that the topic is no longer available. The unmounted topic is deleted in the source cluster. |
"The unmounted topic is deleted in the source cluster."
When does that happen? After the command has run successfully? Is the command async? If yes, can I track its progress? Is it possible that the command halts mid-operation? If yes, what do I do to recover?
Also I believe we should add more emphasis to this part as it's super important. Suggest bold, but INFO admonition could also work
-- | ||
====== | ||
|
||
You cannot cancel mount and unmount operations in the following <<monitor-progress,states>>: |
How do I check those states?
Monitor should be before cancel, no? We talk about the states before introducing them
See line 142 below--isn't that what you are asking about here @Deflaimun ?
@Deflaimun I think in a previous commit the monitor section was actually before cancel. I'll change it back.
| State | Unmount operation (outbound) | Mount operation (inbound) | ||
|
||
| `planned` | ||
2+| Redpanda validates the operation definition. |
@@ -1,8 +1,8 @@ | |||
For topics with Tiered Storage enabled, you can mount and unmount topics to transfer the topic data between your cluster and object storage. This allows you to free up and reclaim unused partition space, or migrate a topic to a different cluster and hibernate or decommission the topic or even the entire cluster. | |||
For topics with Tiered Storage enabled, you can unmount a topic to detach segment data that is still on disk to object storage, and unmount a topic from object storage to attach the topic data to either the same origin cluster, or a different one. This allows you to hibernate a topic and free up and reclaim system resources taken up by the topic, or migrate a topic to a different cluster. |
This diff made the page more tech accurate but also harder to understand. Consider simplifying. Also see #725 (comment) . Is unmount similar to just uploading the topic+metadata to Object Storage?
Seems odd that it is called Mountable, but most of the content here is about unmounting.
free up/reclaim system resources are synonymous. Keep one or the other, but not both
@Feediver1 That's my mistake, I'll change it.
@Deflaimun (@mattschumpert to correct me if this is wrong) It's detaching the topic. Most of the topic data would have already been uploaded to Tiered Storage (based on how quickly users have configured segment data to be moved to the cloud). Unmounting takes the segment data that is still on disk, moves it to the cloud, and stops reads and writes, so that it's all ready to "attach" again to a cluster.
@@ -0,0 +1,211 @@ | |||
For topics with Tiered Storage enabled, you can unmount a topic to detach segment data that is still on disk to object storage, and unmount a topic from object storage to attach the topic data to either the same origin cluster, or a different one. This allows you to hibernate a topic and free up and reclaim system resources taken up by the topic, or migrate a topic to a different cluster. |
For topics with Tiered Storage enabled, you can unmount a topic to detach segment data that is still on disk to object storage, and unmount a topic from object storage to attach the topic data to either the same origin cluster, or a different one. This allows you to hibernate a topic and free up and reclaim system resources taken up by the topic, or migrate a topic to a different cluster. | |
For topics with Tiered Storage enabled, you can unmount a topic to detach segment data that is still on disk to object storage, and unmount a topic from object storage to attach the topic data to either the same origin cluster, or a different one. This allows you to hibernate a topic and free up or reclaim system resources taken up by the topic, or migrate a topic to a different cluster. |
|
||
Redpanda also transfers topic definitions when mounting or unmounting a topic. | ||
Redpanda also transfers topic manifests when mounting or unmounting a topic, so the topic can quickly accept reads and writes again and you can resume cluster workloads with ease. |
Wait. If the topic is deleted from the source cluster after unmounting, how does it accept reads and writes again?
Maybe this needs to be expanded.
Redpanda also transfers topic manifests when mounting or unmounting a topic, so the topic can quickly accept reads and writes again and you can resume cluster workloads with ease. | |
Redpanda also transfers topic manifests when mounting or unmounting a topic, making possible to quickly resume operations after re-mounting. |
(or something to that effect)
| The topic data in object storage is no longer available to mount to any clusters. | ||
|
||
| `finished` | ||
| The operation is complete and then deleted. |
| The operation is complete and then deleted. | |
| The operation is complete and deleted. |
What is deleted? The operation itself or the topic?
If the topic, consider this suggestion.
| The operation is complete and then deleted. | |
| The operation is complete and topic is deleted from the source cluster. |
@Deflaimun The underlying migration is deleted. I think the topic itself would have already been deleted in the prior state. @bashtanov can you confirm?
The operation being deleted seems weird to me. If deleted then I can't query about it anymore, thus the state will never be accessible. What if I want to audit the operation? Can the operation be offloaded to a log or something? @bashtanov
|
||
|=== | ||
|
||
It is not currently possible to unmount a topic whose name matches multiple topics in the origin cluster. |
It is not currently possible to unmount a topic whose name matches multiple topics in the origin cluster. | |
It is not possible to unmount a topic whose name matches multiple topics in the origin cluster. |
Co-authored-by: Joyce Fee <[email protected]>
@@ -11,7 +9,11 @@ An unmounted topic in object storage is detached from all clusters. The original | |||
|
|||
== Unmount a topic from a cluster to object storage | |||
|
|||
When you unmount a topic, all incoming writes to the topic are blocked as Redpanda unmounts the topic from the cluster to object storage. Producers and consumers of the topic receive an error message indicating that the topic is no longer available. The unmounted topic is deleted in the source cluster. | |||
When you unmount a topic, all incoming writes to the topic are blocked as Redpanda unmounts the topic from the cluster to object storage. Producers and consumers of the topic receive an error message indicating that the topic is no longer available. |
What's the expected error message? How do I differentiate the error between an unmounted topic and one that never existed?
@bashtanov just to check if you have those expected messages available.
Yes, it is `failed to download manifest for topic`, and it is logged with warning severity.
@@ -0,0 +1,213 @@ | |||
For topics with Tiered Storage enabled, you can unmount a topic to detach segment data that is still on disk to object storage, and mount that topic to either the same origin cluster, or a different one. This allows you to hibernate a topic and free up system resources taken up by the topic, or migrate a topic to a different cluster. |
'detach segment data' is implementation detail.
Something like:
'you can unmount a topic to safely detach it from a cluster while keeping the topic's data in the cluster's cloud storage bucket/container'
Still some unaddressed comments
@@ -0,0 +1,213 @@ | |||
For topics with Tiered Storage enabled, you can unmount a topic to detach segment data that is still on disk to object storage, and mount that topic to either the same origin cluster, or a different one. This allows you to hibernate a topic and free up system resources taken up by the topic, or migrate a topic to a different cluster. | |||
|
|||
Redpanda also transfers topic manifests when mounting or unmounting a topic, making it possible to quickly resume operations after mounting to the destination cluster. |
I don't know that this is a useful statement to a user @kbatuigas . Nobody knows what topic manifests are and how they affect the user experience. No one ever interacts with a manifest directly. To them they are just mounting and unmounting a topic. Am I missing something @nvartolomei ?
Also, "transferring" is misleading. For TS, we already have manifests in the bucket. This is a 'detach/reattach' operation.
I think we can just remove this statement.
|
||
== Additional considerations | ||
|
||
It is not possible to unmount a topic whose name matches multiple topics in the origin cluster. |
I don't understand the point of this. There is no possibility for this ever to happen whatsoever. By definition the origin cluster cannot have duplicate topic names in the first place so there is no potential for this to be a problem. cc @nvartolomei
@mattschumpert That's my mistake, Matt. I meant to describe a scenario that I believe will be handled by this change so I'll go ahead and remove this line.
"source_topic": {"ns": "kafka", "topic": "<source-topic-2-name>"} | ||
}, | ||
{ | ||
"source_topic": {"ns": "kafka", "topic": "source-topic-3-name"}, |
Angle brackets missing?
Only minor suggestions.
@@ -7,7 +7,7 @@ For topics with Tiered Storage enabled, you can unmount a topic to safely detach | |||
|
|||
== Unmount a topic from a cluster to object storage | |||
|
|||
When you unmount a topic, all incoming writes to the topic are blocked as Redpanda unmounts the topic from the cluster to object storage. Producers and consumers of the topic receive an error message indicating that the topic is no longer available. | |||
When you unmount a topic, all incoming writes to the topic are blocked as Redpanda unmounts the topic from the cluster to object storage. Producers and consumers of the topic receive a warning `Failed to download manifest for topic` indicating that the topic is no longer available. |
Could you change to lower case `f` in `failed` please? I would imagine them using `grep`, which is case-sensitive by default, to search for the line.
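To make the case-sensitivity point concrete, here is a minimal Python sketch that finds the message in log lines regardless of capitalization. The message text is the one quoted in this review; the function name and the sample log line format used to try it out are hypothetical.

```python
import re
from typing import Iterable, List

# Message text quoted in this review, matched case-insensitively so that
# both "Failed ..." and "failed ..." are found.
MANIFEST_ERROR = re.compile(r"failed to download manifest for topic", re.IGNORECASE)


def find_manifest_failures(lines: Iterable[str]) -> List[str]:
    """Return the log lines that contain the manifest download failure."""
    return [line.rstrip("\n") for line in lines if MANIFEST_ERROR.search(line)]
```

A case-sensitive search for `Failed` would miss a message logged with a lowercase `f`, which is the reason for documenting the exact casing.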
And change "indicating that the topic is no longer available" to "indicating that either the topic is unavailable or there are multiple topics under the specified name" or something like this. It will be clear from the rest of the message which one is the case.
@bashtanov yes, thank you!
@bashtanov actually, would "multiple topics under the specified name" ever apply in the case of unmount?
I messed up everything. This error, `failed to download manifest for topic`, is for mounting when the topic is not available or cannot be unambiguously defined. It will be in logs.
As for producing into a topic that is about to be unmounted, it is `invalid_topic_exception` or `resource_is_being_migrated` they will be getting. When fetching from a not-yet-ready topic it'll be `invalid_topic_exception` as well. These will be in the protocol replies.
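A client-side check for the two protocol errors named above could look like the following Python sketch. The error names are taken from this comment; how a particular Kafka client surfaces them (exception class, numeric code, or message text) is an assumption, so the helper only compares names. Note that `invalid_topic_exception` can have other causes, so a match is a hint and not proof that a mount or unmount is in progress.

```python
# Error names quoted from the review comment. How a given Kafka client
# exposes them is an assumption; this sketch compares plain names only.
MIGRATION_RELATED_ERRORS = frozenset(
    {"invalid_topic_exception", "resource_is_being_migrated"}
)


def may_be_migration_error(error_name: str) -> bool:
    """Return True if the error name matches one seen during mount or unmount."""
    return error_name.strip().lower() in MIGRATION_RELATED_ERRORS
```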
Description
Resolves https://github.com/redpanda-data/documentation-private/issues/2504
Related: Migrations API reference
Review deadline: 9 Oct.
Page previews
24.3 beta: Mountable Topics