-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
General performance improvements #14
Conversation
Improves performance for getByAlias and sibling methods
# Conflicts: # lib/src/main/java/net/fellbaum/jemoji/EmojiManager.java
final int[] textCodePointsArray = stringToCodePoints(text); | ||
final long textCodePointsLength = textCodePointsArray.length; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a hot spot, there's a way to get the code point from the string, which would save us from getting the entire array (except in worst case of course), but i couldn't figure out a way to get the correct index
Might be interesting though: https://stackoverflow.com/a/1527891 and String#offsetByCodePoints
I really appreciate your work on this. Do you mind creating separate PR's for every change you do? |
I'm all in for upgrading, but the corporate world is how it is and I want to offer a replacement for the many unmaintained libraries. The ideal solution is to get the library into a solid state, and then if possible swap the EmojiManager (more recent java version) with the old java 8 version if they are running on a lower version that does not support the current version. The file would not get any updates but it still exists if someone is still running on 8. And there are still quite some applications that are running on 8 |
This reverts commit 2612965.
Replacing with smaller PRs |
Getting an emoji by its alias is now a lot faster, mainly due to avoiding an O(N) loop over all the emojis in the worst case (i.e., when the alias is invalid)
Granted this is only a 5 second warmup/iteration, on 5 warmups/iterations, on one fork:
Old code:
New code:
The alias maps are lazily created, as I believe most people would either not use it, or stick to one alias group.
Duplicate refactoring
Most of the code finding unicode emojis is duplicated in each method, each with small behavior differences, I moved the parsing logic away in a single method, calling an interface on certain points
Future improvements
Some parts of the library use
List#contains
when usingSet#contains
would be faster in most cases, I saw that Java 8 does not provide an easy, nor performant, way to transform an array into aSet
.Since Java 8 is already EOL since March 2022, I would strongly consider updating JEmoji to Java 17 (which also aligns with Javacord)
This would enable usages of
Set.of
(replacingArrays.asList
) andSet.copyOf
(replacing constructs such asCollections.unmodifiableList(new ArrayList<>(aliases))
)This would also change
List<Alias>
intoSet<Alias>
inEmoji