
Memoize the normalize function #63

Merged · 1 commit · Dec 16, 2018

Conversation

@MasonProtter (Contributor) commented Dec 16, 2018

This gives some nice speed improvements for (sub)terms which are normalized many times.

julia> using Rewrite

julia> t  = @term((1 - 2/2)/5);

julia> t2 = @term((1 - 2/2)/5 + 1);

julia> rs = rules();

julia> normalize((@term 1 + 1), rs);

julia> @time normalize(t, rs)
  1.820396 seconds (6.22 M allocations: 317.694 MiB, 3.41% gc time)
@term(0)

julia> @time normalize(t, rs)
  0.000007 seconds (6 allocations: 224 bytes)
@term(0)

julia> @time normalize(t2, rs)
  0.006557 seconds (86.77 k allocations: 8.048 MiB)
@term(1)

julia> @time normalize(t2, rs)
  0.000007 seconds (6 allocations: 224 bytes)
@term(1)

Let me know if I'm doing this PR wrong. I'm targeting your branch fix/speed instead of master since I think this falls under PR #60.

@HarrisonGrodin (Owner)

Hmm... this definitely doesn't solve every problem (only the case of exact-repeat terms), but it seems like a good improvement.

For reference, speeds without this PR on my machine:

julia> @time normalize(t, rs)
  2.794503 seconds (5.83 M allocations: 298.442 MiB, 3.13% gc time)
@term(0)

julia> @time normalize(t, rs)
  0.007747 seconds (121.90 k allocations: 11.674 MiB)
@term(0)

julia> @time normalize(t2, rs)
  0.015232 seconds (158.36 k allocations: 15.167 MiB, 37.15% gc time)
@term(1)

julia> @time normalize(t2, rs)
  0.018949 seconds (158.36 k allocations: 15.167 MiB, 35.53% gc time)
@term(1)

@HarrisonGrodin HarrisonGrodin merged commit 0432d14 into HarrisonGrodin:fix/speed Dec 16, 2018
@MasonProtter (Contributor, Author) commented Dec 16, 2018

The real advantage is that repeated sub-terms get memoized. Notice that the time for t2 is about 10x faster with the memoization than without. This is because t is a subexpression of t2, so all the normalizations on that subexpression are essentially free.

This would play a big role if you have some big term that you want to normalize, where only part of it changes with each iterated normalization.

But yes, this is only part of the speed problem. Dynamic dispatch is the big thing to fix (I think).
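(Editorial note: the idea being discussed can be sketched roughly as below. This is an illustrative sketch, not the actual PR diff; the cache name and the wrapper function are hypothetical, and the real change memoizes `normalize` itself rather than wrapping it.)

```julia
# Hypothetical sketch of memoizing normalization, keyed on the (term, rules)
# pair. Repeated (sub)terms are looked up in the cache instead of being
# re-normalized from scratch.
const NORMALIZE_CACHE = Dict{Any,Any}()

function normalize_memo(t, rs)
    get!(NORMALIZE_CACHE, (t, rs)) do
        normalize(t, rs)  # fall back to the real normalization on a cache miss
    end
end
```

Because normalization recurses into subterms, memoizing the recursive entry point means a cached subterm (like t inside t2 above) is returned immediately on later calls, which is where the speedup comes from.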

@MasonProtter MasonProtter deleted the fix/speed branch December 17, 2018 22:36
@HarrisonGrodin (Owner)

@MasonProtter Looking back at this, I'm having second thoughts.

Although the goals of this PR and #60 are performance-oriented, it seems like it might make sense to finalize #60 with only minor/specialized optimizations and consider memoization in more detail under a separate PR.

Similarly, it seems that we may benefit from being more clever with memoization strategies (e.g. memoizing each rule separately, storing whether or not a rule applies to a broad class of patterns, etc.).
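(Editorial note: the per-rule strategy mentioned above could look roughly like the following. This is a hypothetical sketch; `apply(rule, t)` stands in for whatever single-rule application function the library actually uses.)

```julia
# Hypothetical sketch of per-rule memoization: each rule gets its own cache,
# so results can be invalidated or analyzed independently per rule.
const RULE_CACHES = Dict{Any,Dict{Any,Any}}()

function apply_rule_memo(rule, t)
    cache = get!(RULE_CACHES, rule) do
        Dict{Any,Any}()  # fresh cache the first time this rule is seen
    end
    get!(cache, t) do
        apply(rule, t)  # assumed single-rule application function
    end
end
```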

For these reasons, would you mind if I revert this PR and open a separate issue to explore memoization in more detail?

@MasonProtter (Contributor, Author)

Yes, I was thinking about this as well and I agree.

HarrisonGrodin added a commit that referenced this pull request Jan 3, 2019
@HarrisonGrodin (Owner)

Reverted, with #68 for discussion. :)
