[#4984] improvement(core, doris): Add the random distribution strategy #4985
base: main
EVEN,

/** Distributes data randomly across partitions or table. */
RANDOM;
What are the differences between EVEN and RANDOM? AFAIK, RANDOM is a kind of implementation of EVEN.
Both aim to balance the distribution of data to optimize performance. "Random" puts more emphasis on the randomness of the data, while "Even" focuses on maintaining the uniformity of the distribution. They are slightly different.
@FANNG1 do you have any comments on this issue?
IMO, HASH, RANGE, and RANDOM are implementations of how we do the distribution, while even is more like a distribution result. Both HASH and RANDOM are implementations of EVEN.
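To make the mechanism-versus-result distinction in this comment concrete, here is a minimal Java sketch. It is not Gravitino or Doris code: the class `DistributionSketch` and all of its method names are hypothetical, and `spread` (largest bucket minus smallest bucket) is just one assumed way to measure how even a result is.

```java
import java.util.Random;

/** Hypothetical illustration only; none of these names come from Gravitino or Doris. */
class DistributionSketch {

  /** Mechanism: choose the bucket from the key's hash. */
  static int[] byHash(int[] keys, int buckets) {
    int[] counts = new int[buckets];
    for (int key : keys) {
      // floorMod keeps the index non-negative even for negative hash codes.
      counts[Math.floorMod(Integer.hashCode(key), buckets)]++;
    }
    return counts;
  }

  /** Mechanism: choose a bucket uniformly at random for every row. */
  static int[] byRandom(int rows, int buckets, long seed) {
    Random rng = new Random(seed);
    int[] counts = new int[buckets];
    for (int i = 0; i < rows; i++) {
      counts[rng.nextInt(buckets)]++;
    }
    return counts;
  }

  /** Mechanism: hand rows to buckets in turn (round-robin). */
  static int[] byRoundRobin(int rows, int buckets) {
    int[] counts = new int[buckets];
    for (int i = 0; i < rows; i++) {
      counts[i % buckets]++;
    }
    return counts;
  }

  /** Result: difference between the fullest and the emptiest bucket (0 is perfectly even). */
  static int spread(int[] counts) {
    int min = Integer.MAX_VALUE;
    int max = Integer.MIN_VALUE;
    for (int c : counts) {
      min = Math.min(min, c);
      max = Math.max(max, c);
    }
    return max - min;
  }
}
```

Under these assumptions, round-robin always yields a spread of at most 1, random placement is only even on average, and hashing can be badly uneven when keys are skewed (for example, `byHash(new int[] {5, 5, 5, 5}, 4)` puts every row in one bucket). That is the sense in which evenness describes an outcome rather than a placement rule.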
So it's recommended to remove EVEN? But I remember that @yuqi1129 has done research and there is a certain kind of table that uses EVEN as a distribution name.
@yuqi1129, do you remember which kind of table uses even distribution? Could it be replaced by round-robin or random?
@jerryshao any thoughts on this point? #4991 depends on this one.
You guys have better background on this; you can have an offline discussion and work out a solution.
We have not been able to reach an agreement so far, so I postponed it to 0.7.0 as it's not a bugfix.
@yuqi1129 this should be merged before 0.7.0.
Got it.
Postpone this to 0.8.0 as it's not a big need.
What changes were proposed in this pull request?
Add the random strategy instead of even for Distribution.

Why are the changes needed?
Doris supports random distribution instead of even.

Fix: #4984

Does this PR introduce any user-facing change?
N/A.
How was this patch tested?
IT
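
For readers unfamiliar with the Doris side of this change: Doris table DDL accepts a `DISTRIBUTED BY HASH(...) BUCKETS n` clause and a `DISTRIBUTED BY RANDOM BUCKETS n` clause. The sketch below shows how a strategy value might be rendered into that clause. It is an assumption-laden illustration, not the code in this PR: the class `DorisDistributionClause`, its nested `Strategy` enum, and the `render` method are all hypothetical names, and column names are emitted unquoted for brevity.

```java
import java.util.List;

/** Hypothetical helper; not part of Gravitino's Doris catalog. */
class DorisDistributionClause {

  /** Only the two strategies discussed for Doris in this thread. */
  enum Strategy { HASH, RANDOM }

  /** Renders the DISTRIBUTED BY clause of a Doris CREATE TABLE statement. */
  static String render(Strategy strategy, List<String> columns, int buckets) {
    if (buckets <= 0) {
      throw new IllegalArgumentException("bucket count must be positive");
    }
    switch (strategy) {
      case HASH:
        // Hash distribution needs at least one column to hash on.
        if (columns.isEmpty()) {
          throw new IllegalArgumentException("HASH distribution requires columns");
        }
        return "DISTRIBUTED BY HASH(" + String.join(", ", columns) + ") BUCKETS " + buckets;
      case RANDOM:
        // Random distribution takes no columns.
        if (!columns.isEmpty()) {
          throw new IllegalArgumentException("RANDOM distribution takes no columns");
        }
        return "DISTRIBUTED BY RANDOM BUCKETS " + buckets;
      default:
        throw new IllegalArgumentException("unsupported strategy: " + strategy);
    }
  }
}
```

With this sketch, `render(Strategy.RANDOM, Collections.emptyList(), 10)` produces `DISTRIBUTED BY RANDOM BUCKETS 10`. There is no corresponding branch for an even strategy, which is consistent with the PR's stated motivation.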