
First stage in text speed improvements #5190

Merged
merged 2 commits into develop from fix/textspeed1 on Oct 16, 2024

Conversation

andydotxyz
Member

It's such a big challenge that I am working on this in stages.

First: move the text cache index to the text details, not the object reference.

Using the Terminal app as a benchmark (TextGrid gets the biggest boost from this change) while running "top" fullscreen:

  • around 20% less memory
  • up to 67% lower CPU usage
  • App resize feels faster too :)

Next improvements will build on this so other text widgets see bigger improvements.
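
The gist of the change, as a minimal sketch: FontCacheEntry and textures match the names in the diff reviewed below, but the fields shown here are illustrative assumptions, not necessarily the exact ones in the PR.

package cache

import "sync"

// FontCacheEntry keys a rendered-text texture by what was drawn rather
// than by which object drew it, so two widgets showing the same string
// at the same size and style share one texture instead of two.
// (Illustrative fields; the real entry may differ.)
type FontCacheEntry struct {
	Text  string
	Size  float32
	Style string // font face and style folded into a comparable key
}

// textures maps cache keys (FontCacheEntry for text, or an object
// reference for other content) to their GPU textures.
var textures sync.Map

This is plausibly why TextGrid benefits most: a full-screen grid repeats a small set of characters, so keying by text details collapses thousands of per-cell cache entries into a handful of shared ones.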

Checklist:

  • Tests included. <- as long as existing tests all work this speedup is good :)
  • Lint and formatter run with no errors.
  • Tests all pass.

@coveralls

coveralls commented Oct 12, 2024

Coverage Status

coverage: 59.892% (-0.04%) from 59.934% when pulling fe31121 on fix/textspeed1 into 1325f0d on develop.


@Jacalz Jacalz left a comment


The code looks good, just one inline comment. Also, is there any chance that you could add some benchmarks to more easily verify the performance improvements? Yes, I do feel that it is faster, but performance improvements really need solid benchmarks to back them up: not only to make it easier to see that things got faster in this case, but also to see the gains from upcoming changes, and to make sure that we don't regress going forward.

@@ -45,6 +65,10 @@ func RangeExpiredTexturesFor(canvas fyne.Canvas, f func(fyne.CanvasObject)) {
 // gl context to ensure textures are deleted from gl.
 func RangeTexturesFor(canvas fyne.Canvas, f func(fyne.CanvasObject)) {
 	textures.Range(func(key, value any) bool {
+		if _, ok := key.(FontCacheEntry); ok {
+			return true // TODO what?
@Jacalz
Member

Should this TODO be addressed before merge?

@andydotxyz
Member Author

oops!

@andydotxyz
Member Author

Updated - a doc change covered it, I think.
If the API is needed for other purposes at some point then we may need to track all of the objects referencing a certain string, but that's not needed yet, and because this will evolve I'm keeping it minimal where possible.

@andydotxyz
Member Author

> The code looks good, just one inline comment. Also, is there any chance that you could add some benchmarks to more easily verify the performance improvements? Yes, I do feel that it is faster, but performance improvements really need solid benchmarks to back them up: not only to make it easier to see that things got faster in this case, but also to see the gains from upcoming changes, and to make sure that we don't regress going forward.

I don't know how to write benchmarks that go through the OpenGL texture cache usage... Won't we need a window to appear and run various redraw scenarios before closing again and reporting back?


@dweymouth dweymouth left a comment


I am happy to approve this without a benchmark test, as I can see how this is a needed first step for the further text cache improvements (caching per glyph vs per string).

@Jacalz
Member

Jacalz commented Oct 13, 2024

> I don't know how to write benchmarks that go through the OpenGL texture cache usage... Won't we need a window to appear and run various redraw scenarios before closing again and reporting back?

I don't necessarily see a problem with opening a window and reporting back. We only run benchmarks locally (CI fluctuates too much due to running in containers) and it would be very helpful for all the reasons stated above.

I'm more or less of the opinion that optimisations without benchmarks aren't justified. From what I've seen of upstream Go compiler development, they basically won't accept alleged performance improvements without a benchmark (aside from trivial changes that are as much code cleanup and refactoring as optimisation).

Taking the time to write some proper, reliable benchmarking infrastructure really should be step one in my opinion, and it will pay off in the long run. The system monitor just isn't reliable enough.

@Jacalz
Member

Jacalz commented Oct 13, 2024

To be clear: I'm not necessarily saying that I will block this for missing a benchmark, but I think one should be added if possible.

As a next step, we really should take the time to implement better benchmarking infrastructure for this area if it isn't possible at the moment, and then change our review policy to require benchmarks for any significant changes.

@andydotxyz
Member Author

I agree that benchmarking should be improved to make this sort of thing easier, but what is the point of benchmarks that are off by default? If we want them to verify that the speed improvements stay in place then they should at least be run on some sort of nightly CI, or they may never be run again.

In this instance I had hoped that, being so clearly 2x faster but in an area where benchmarking has not been established yet, we might slip by and move on to the next step. I can look into how the CPU usage during rendering can be captured, but given that we draw at a specific frame rate this is not the sort of "tight loop" that makes for easy benchmarking. We will have to compare computer load and usage under some sort of simulated runtime, I would think...?

@Jacalz
Member

Jacalz commented Oct 16, 2024

> I agree that benchmarking should be improved to make this sort of thing easier, but what is the point of benchmarks that are off by default?

Manual benchmarks are most useful when you are working on improving performance in an area like this. They can be used, for example, when opening a PR like this to compare your changes against the previous state. When you have implemented a further improvement, you can run the benchmarks again to verify that it is indeed faster and see by how much. The people reviewing can also run them locally if they want and check whether they see the same gains. With benchmark data about allocations, you can also quite easily tweak the code until you get fewer allocations. It basically makes the process of improving code performance faster, as some of the measuring is done automatically for you.
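
For illustration, this is roughly what such a benchmark looks like with Go's standard testing package. It is a sketch only: renderText is a hypothetical stand-in for whatever text path is being measured, not a real Fyne API.

package cache_test

import (
	"strings"
	"testing"
)

// renderText is a placeholder for the real text layout/cache path under
// test; it exists only so the benchmark below is self-contained.
func renderText(s string) int {
	return len(strings.ToUpper(s))
}

// sink keeps the compiler from optimising the measured call away.
var sink int

// Run with: go test -bench=BenchmarkRenderText -benchmem
// The -benchmem flag reports allocations per operation, which is the
// number to watch when tweaking code to allocate less, as described above.
func BenchmarkRenderText(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		sink += renderText("the quick brown fox")
	}
}

Two runs (before and after a change) can then be compared with the benchstat tool from golang.org/x/perf.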

However, I do agree that it would be useful to have some sort of automatic benchmarking online, but we might need better CI infrastructure set up if we want consistency. What we had previously caused headaches, as it failed randomly from being too slow on CI. The best option is probably some sort of dashboard that tracks performance per commit over time so you can see general trends. I think the Go compiler and some other projects have something similar.

> In this instance I had hoped that, being so clearly 2x faster but in an area where benchmarking has not been established yet, we might slip by and move on to the next step.

While perhaps not super relevant here, given the large improvement shown, my biggest problem with missing benchmarks is that it can be very hard to look at the code and have it be obvious that it actually improves performance. We may be blinded by the fact that we think the changes should improve performance, when what we actually feel is placebo. In this case, I really don't understand the code well enough to see why it improves performance, but that might partly just be because I haven't worked with the cache code that much.

Going by system-monitor data (which fluctuates a lot) and by feel, there certainly is a big improvement here, and I don't deny that. I just feel like we have historically neglected proper benchmarking throughout the toolkit for a long time now. I strongly believe that this and many other performance improvements needed in the toolkit would have been much easier to implement with better tooling.


@Jacalz Jacalz left a comment


Approved, but I really would prefer benchmarks to be added in future PRs if possible. We need to work on better benchmarking infrastructure in the toolkit going forward, but that should obviously not block this PR.

@andydotxyz
Member Author

Thanks, agreed. I will try to sit down and understand how to improve benchmarking soon.

andydotxyz merged commit 4875e35 into develop on Oct 16, 2024
24 checks passed
andydotxyz deleted the fix/textspeed1 branch on October 16, 2024 at 10:37