
First stage in text speed improvements #5190

Merged
merged 2 commits into develop from fix/textspeed1 on Oct 16, 2024

Conversation

andydotxyz
Member

It's such a big challenge that I am working on this in stages.

First: move the text cache index to the text details, not the object reference.

Using the Terminal app as a benchmark (TextGrid gets the biggest boost from this change) while running "top" fullscreen:

  • around 20% less memory
  • up to 67% lower CPU usage
  • App resize feels faster too :)

Next improvements will build on this so other text widgets see bigger improvements.
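
The gist of the change, as a minimal sketch: FontCacheEntry and textures match the names in the diff reviewed below, but the fields shown here are illustrative assumptions, not necessarily the exact ones in the PR.

package cache

import "sync"

// FontCacheEntry keys a rendered-text texture by what was drawn rather
// than by which object drew it, so two widgets showing the same string
// at the same size and style share one texture instead of two.
// (Illustrative fields; the real entry may differ.)
type FontCacheEntry struct {
	Text  string
	Size  float32
	Style string // font face and style folded into a comparable key
}

// textures maps cache keys (FontCacheEntry for text, or an object
// reference for other content) to their GPU textures.
var textures sync.Map

This is plausibly why TextGrid benefits most: a full-screen grid repeats a small set of characters, so keying by text details collapses thousands of per-cell cache entries into a handful of shared ones.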

Checklist:

  • Tests included. <- as long as existing tests all work this speedup is good :)
  • Lint and formatter run with no errors.
  • Tests all pass.

@coveralls

coveralls commented Oct 12, 2024

Coverage Status

coverage: 59.892% (-0.04%) from 59.934% when pulling fe31121 on fix/textspeed1 into 1325f0d on develop.


@Jacalz Jacalz left a comment


The code looks good, just one inline comment. Also, is there any chance that you could add some benchmarks to more easily verify the performance improvements? Yes, I do feel that it is faster, but performance improvements really need solid benchmarks to back them up: not only to make it easier to see that things got faster in this case, but also to see the gains from upcoming changes, and to make sure that we don't regress going forward.

@@ -45,6 +65,10 @@ func RangeExpiredTexturesFor(canvas fyne.Canvas, f func(fyne.CanvasObject)) {
 // gl context to ensure textures are deleted from gl.
 func RangeTexturesFor(canvas fyne.Canvas, f func(fyne.CanvasObject)) {
 	textures.Range(func(key, value any) bool {
+		if _, ok := key.(FontCacheEntry); ok {
+			return true // TODO what?
@Jacalz
Member

Should this TODO be addressed before merge?

@andydotxyz
Member Author

oops!

@andydotxyz
Member Author

Updated - a doc change covered it, I think.
If the API is needed for other purposes at some point then we may need to track all of the objects referencing a certain string, but that's not needed yet, and because this will evolve I'm keeping it minimal where possible.

@andydotxyz
Member Author

> The code looks good, just one inline comment. Also, is there any chance that you could add some benchmarks to more easily verify the performance improvements? Yes, I do feel that it is faster, but performance improvements really need solid benchmarks to back them up: not only to make it easier to see that things got faster in this case, but also to see the gains from upcoming changes, and to make sure that we don't regress going forward.

I don't know how to write benchmarks that go through the OpenGL texture cache usage... Won't we need a window to appear and run various redraw scenarios before closing again and reporting back?


@dweymouth dweymouth left a comment


I am happy to approve this without a benchmark test, as I can see how this is a needed first step for the further text cache improvements (caching per glyph vs per string).

@Jacalz
Member

Jacalz commented Oct 13, 2024

> I don't know how to write benchmarks that go through the OpenGL texture cache usage... Won't we need a window to appear and run various redraw scenarios before closing again and reporting back?

I don't necessarily see a problem with opening a window and reporting back. We only run benchmarks locally (CI fluctuates too much due to running in containers) and it would be very helpful for all the reasons stated above.

I'm more or less of the opinion that optimisations without benchmarks aren't justified. From what I've seen of upstream Go compiler development, they basically won't accept alleged performance improvements without a benchmark (aside from trivial changes that are as much code cleanup and refactoring as optimisation).

Taking the time to write some proper, reliable benchmarking infrastructure really should be step one in my opinion, and it will pay off in the long run. The system monitor just isn't reliable enough.

@Jacalz
Member

Jacalz commented Oct 13, 2024

To be clear: I'm not necessarily saying that I will block this for missing a benchmark, but I think one should be added if possible.

As a next step, we really should take the time to implement better benchmarking infrastructure for this area if it isn't possible at the moment, and then change our review policy to require benchmarks for any significant changes.

@andydotxyz
Member Author

I agree that benchmarking should be improved to make this sort of thing easier, but what is the point of benchmarks that are off by default? If we want them to verify that the speed improvements stay in place then they should at least be run on some sort of nightly CI, or they may never be run again.

In this instance I had hoped that, being so clearly 2x faster but in an area where benchmarking has not been established yet, we might slip by and move on to the next step. I can look into how the CPU usage during rendering can be captured, but given that we draw at a specific frame rate this is not the sort of "tight loop" that makes for easy benchmarking. We will have to compare computer load and usage under some sort of simulated runtime, I would think...?

@Jacalz
Member

Jacalz commented Oct 16, 2024

> I agree that benchmarking should be improved to make this sort of thing easier, but what is the point of benchmarks that are off by default?

Manual benchmarks are most useful when you are working on improving performance in an area like this. They can be used, for example, when opening a PR like this to compare your changes against the previous state. When you have implemented a further improvement, you can run the benchmarks again to verify that it is indeed faster and see by how much. The people reviewing can also run them locally if they want and check whether they see the same gains. With benchmark data about allocations, you can also quite easily tweak the code until you get fewer allocations. It basically makes the process of improving code performance faster, as some of the measuring is done automatically for you.
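
For illustration, this is roughly what such a benchmark looks like with Go's standard testing package. It is a sketch only: renderText is a hypothetical stand-in for whatever text path is being measured, not a real Fyne API.

package cache_test

import (
	"strings"
	"testing"
)

// renderText is a placeholder for the real text layout/cache path under
// test; it exists only so the benchmark below is self-contained.
func renderText(s string) int {
	return len(strings.ToUpper(s))
}

// sink keeps the compiler from optimising the measured call away.
var sink int

// Run with: go test -bench=BenchmarkRenderText -benchmem
// The -benchmem flag reports allocations per operation, which is the
// number to watch when tweaking code to allocate less, as described above.
func BenchmarkRenderText(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		sink += renderText("the quick brown fox")
	}
}

Two runs (before and after a change) can then be compared with the benchstat tool from golang.org/x/perf.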

However, I do agree that it would be useful to have some sort of automatic benchmarking online, but we might need better CI infrastructure set up if we want consistency. What we had previously caused headaches, as it failed randomly from being too slow on CI. The best option is probably some sort of dashboard that tracks performance per commit over time so you can see general trends. I think the Go compiler and some other projects have something similar.

> In this instance I had hoped that, being so clearly 2x faster but in an area where benchmarking has not been established yet, we might slip by and move on to the next step.

While perhaps not super relevant here, given the large improvement shown, my biggest problem with missing benchmarks is that it can be very hard to look at the code and have it be obvious that it actually improves performance. We may be blinded by the fact that we think the changes should improve performance, when what we actually feel is placebo. In this case, I really don't understand the code well enough to see why it improves performance, but that might partly just be because I haven't worked with the cache code that much.

Going by system-monitor data (which fluctuates a lot) and by feel, there certainly is a big improvement here, and I don't deny that. I just feel like we have historically neglected proper benchmarking throughout the toolkit for a long time now. I strongly believe that this and many other performance improvements needed in the toolkit would have been much easier to implement with better tooling.


@Jacalz Jacalz left a comment


Approved, but I really would prefer benchmarks to be added in future PRs if possible. We need to work on better benchmarking infrastructure in the toolkit going forward, but that should obviously not block this PR.

@andydotxyz
Member Author

Thanks, agreed. I will try to sit down and understand how to improve benchmarking soon.

andydotxyz merged commit 4875e35 into develop on Oct 16, 2024
24 checks passed
andydotxyz deleted the fix/textspeed1 branch on October 16, 2024 at 10:37