[v24.1.x] rptest: upgrade kgo and fix its usage #23813
Open
vbotbuildovich wants to merge 8 commits into redpanda-data:v24.1.x from vbotbuildovich:backport-pr-23793-v24.1.x-559
(cherry picked from commit 6bc816e)
Avoid spamming the log if a persistent failure is detected. (cherry picked from commit 2203d2a)
This prevents incorrect assertions from being performed on the status thread. Stopping a service is a hard stop and does not guarantee that the latest status of the service was ingested. Instead, the user should call wait() first and optionally stop() afterwards. Calling stop() first and then wait() hid a problem in DeleteRecordsTest: KGO crashed, but the test never noticed because stop() had already killed the service and the status thread. (cherry picked from commit 29f94d6)
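The ordering problem described in this commit can be illustrated with a minimal sketch. `FakeVerifierService`, `stop_then_wait`, and `wait_then_stop` are hypothetical names invented for this illustration and are not the rptest or kgo-verifier API; the sketch only assumes that stop() is a hard stop and that wait() returns early once the service is down.

```python
class FakeVerifierService:
    """Hypothetical stand-in for a verifier service wrapper (not the rptest class)."""

    def __init__(self, crashes):
        self._crashes = crashes
        self._running = True
        self.final_status = None  # last status ingested from the service

    def wait(self):
        """Let the workload finish, ingest the final status, and check it."""
        if not self._running:
            # Assumed behavior: the service is already down, so there is
            # nothing left to observe and wait() returns immediately.
            return
        self.final_status = "crashed" if self._crashes else "ok"
        self._running = False
        if self.final_status == "crashed":
            raise RuntimeError("verifier crashed")

    def stop(self):
        """Hard stop: kill the service without ingesting a final status."""
        self._running = False


def stop_then_wait(svc):
    # Problematic order: the crash is never observed.
    svc.stop()
    svc.wait()
    return svc.final_status


def wait_then_stop(svc):
    # Recommended order: wait() surfaces the crash before stop() runs.
    svc.wait()
    svc.stop()
    return svc.final_status
```

With a crashing service, `stop_then_wait` returns without any error and leaves `final_status` unset, while `wait_then_stop` raises, which is the difference the commit message describes.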
Calling wait() after stop() does not make sense because wait() is then short-circuited. In a previous commit I made this sequence of calls illegal to prevent mistakes where the caller has wrong beliefs about what wait() does. This test never seems intended to wait for the producer or consumer to finish their work: we produce an effectively infinite (very large) number of messages and the intent is just to generate load, so stop() is likely what was intended. (cherry picked from commit c476f20)
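One way to make the wait()-after-stop() sequence illegal is a simple guard, sketched below. `GuardedService` and its attributes are hypothetical and only illustrate the idea; the actual check in rptest may be implemented differently.

```python
class GuardedService:
    """Hypothetical service wrapper; names are illustrative, not the rptest API."""

    def __init__(self):
        self._stopped = False
        self.waited = False

    def stop(self):
        # Hard stop: after this point no further status can be collected.
        self._stopped = True

    def wait(self):
        if self._stopped:
            # Fail loudly instead of silently returning, so the caller
            # learns that this wait() could not have verified anything.
            raise RuntimeError(
                "wait() called after stop(); call wait() first, or only stop()")
        self.waited = True
```

A load-generating test that never expects the workload to finish would call only `stop()`, which remains legal under this guard.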
This prevents incorrect assertions from being performed on the status thread. Stopping a service is a hard stop and does not guarantee that the latest status of the service was ingested. Instead, the user should call wait() first and optionally stop() afterwards. This prevents one particular problem where KGO crashed but the test never noticed: the test called stop() first, and the subsequent wait() did nothing because the service was already expected to be down. (cherry picked from commit 477ccb5)
Details in PR redpanda-data/kgo-verifier#55 (cherry picked from commit f6696db)
In the immediately preceding commits I fixed the order in which the consumer is shut down in this test, which had hidden a crash in the kgo-verifier consumer. I also fixed the crash in kgo. This revealed a problem within DeleteRecordsTest, specifically the assert on out_of_scope_invalid_reads. In this test, before launching the kgo producer, we first call produce_until_segments, which uses Kafka tools to produce data. These batches do not conform to what kgo-verifier expects, so it is perfectly valid to have out_of_scope_invalid_reads. (cherry picked from commit 45bb8f3)
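The reasoning above could be expressed as a relaxed check along the following lines. This is only a sketch: `ValidatorStatus`, its `invalid_reads` field, and the `foreign_batches_present` flag are assumptions made for illustration; only `out_of_scope_invalid_reads` is named in the commit message, and the actual change may simply drop the assertion.

```python
from dataclasses import dataclass


@dataclass
class ValidatorStatus:
    # Hypothetical status record; only out_of_scope_invalid_reads is
    # taken from the commit message, the other field is assumed.
    invalid_reads: int = 0
    out_of_scope_invalid_reads: int = 0


def check_consumer_status(status, foreign_batches_present):
    """Assert on read validity, tolerating batches not written by the verifier."""
    # Reads of data the verifier itself produced must always be valid.
    assert status.invalid_reads == 0
    if not foreign_batches_present:
        # Only when every batch came from the verifier is an
        # out-of-scope invalid read treated as an error.
        assert status.out_of_scope_invalid_reads == 0
```

In a test that pre-populates the topic with another tool, the caller would pass `foreign_batches_present=True`, so a non-zero `out_of_scope_invalid_reads` does not fail the test.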
"optimistically" because it is not guaranteed that the kgo-verifier service has done any work yet. In most cases users should call wait() and let the service finish consuming all the data. (cherry picked from commit 34eac28)
non flaky failures in https://buildkite.com/redpanda/redpanda/builds/56663#01929753-f4e7-4616-87fa-7893837d528b:
non flaky failures in https://buildkite.com/redpanda/redpanda/builds/56663#01929752-0328-48ca-9d9a-c0ddd435fc9e:
Retry command for Build#56663: please wait until all jobs are finished before running the slash command
Backport of PR #23793