Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce overhead of function call and deprecate rarely used utilities #1024

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

ricardoV94
Copy link
Member

@ricardoV94 ricardoV94 commented Oct 9, 2024

We have way too much function overhead call. This PR tries to remove some of it.

Closes #1026

I used this simple one Op function as a test:

import numpy as np
import pytensor
import pytensor.tensor as pt
from pytensor.tensor.random.type import random_generator_type

rng = np.random.default_rng(123)
def numpy_normal(rng):
    # We have an explicit expand dims in PyTensor
    return rng.normal(np.array([0]), np.array([1]), size=(100,)) 

print("Direct numpy call:")
%timeit numpy_normal(rng)

rng_pt = random_generator_type()
x = pt.random.normal(rng=rng_pt, size=(100,))
fn = pytensor.function([pytensor.In(rng_pt, mutable=True)], x)

print("Fn call without trust input:")
%timeit fn(rng)

fn.trust_input = True
print("Fn call with trust input: ")
%timeit fn(rng)

fn.input_storage[0].storage[0] = rng
print("VM call bypassing Fn:")
%timeit fn.vm()

Note the largest speedup comes from removing the extra checks in RandomVariable.perform. But you can see some minor speedup in the fn eval beyond what's observed by calling the vm directly.

Before the PR:

Direct numpy call:
14.1 μs ± 43.9 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

Fn call without trust input:
41 μs ± 377 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

Fn call with trust input: 
39.1 μs ± 889 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

VM call bypassing Fn:
30.3 μs ± 885 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

After the PR:

Direct numpy call:
14.5 μs ± 435 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

Fn call without trust input:
27.3 μs ± 788 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

Fn call with trust input: 
26.4 μs ± 458 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

VM call bypassing Fn:
20.4 μs ± 334 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

Some of the changes would be more visible in a function with more inputs (but still relatively fast).

It also deprecates a bunch of obscure options that should allow us to remove a few more conditionals in the future.

Related to #552


📚 Documentation preview 📚: https://pytensor--1024.org.readthedocs.build/en/1024/

@ricardoV94 ricardoV94 force-pushed the perf_enh branch 3 times, most recently from 4e0e6d5 to 25147b8 Compare October 10, 2024 08:17
@ricardoV94 ricardoV94 changed the title Improve overhead of function call Improve overhead of function call and deprecate rarely used utilities Oct 10, 2024
@@ -128,9 +128,6 @@ def fiter_variable(self, other):
" a symbolic placeholder."
)

def may_share_memory(a, b):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing this avoids having to check for aliasing in the first place, during the function call

# and can't be aliased.
if not (
isinstance(inp, In)
and inp.borrow
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's mutable it must be borrow (this is asserted in In.__init__) so no need to check for both as before. Also the attribute always exists.

Comment on lines 413 to 417
if any(
inp.variable.type.is_super(other_inp.variable.type)
or other_inp.variable.type.is_super(inp.variable.type)
for other_inp in group
):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more careful than the check before, which assumed a vector(shape=None) and vector(shape=(1,)) could not be aliased as their types were different, but they could. New test condition checks for this.

@ricardoV94 ricardoV94 marked this pull request as ready for review October 10, 2024 10:40
Copy link

codecov bot commented Oct 10, 2024

Codecov Report

Attention: Patch coverage is 91.93548% with 5 lines in your changes missing coverage. Please review.

Project coverage is 81.89%. Comparing base (a377c22) to head (720ecbc).

Files with missing lines Patch % Lines
pytensor/compile/function/types.py 90.90% 3 Missing and 2 partials ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1024      +/-   ##
==========================================
- Coverage   81.90%   81.89%   -0.01%     
==========================================
  Files         182      182              
  Lines       47879    47840      -39     
  Branches     8617     8608       -9     
==========================================
- Hits        39214    39180      -34     
+ Misses       6492     6487       -5     
  Partials     2173     2173              
Files with missing lines Coverage Δ
pytensor/gradient.py 77.57% <ø> (+0.07%) ⬆️
pytensor/graph/null_type.py 64.70% <ø> (+1.54%) ⬆️
pytensor/graph/op.py 88.08% <100.00%> (+0.06%) ⬆️
pytensor/graph/type.py 93.65% <100.00%> (-0.20%) ⬇️
pytensor/scalar/basic.py 80.52% <ø> (+0.01%) ⬆️
pytensor/tensor/random/op.py 93.51% <100.00%> (-0.21%) ⬇️
pytensor/tensor/type_other.py 74.41% <ø> (+0.26%) ⬆️
pytensor/compile/function/types.py 79.47% <90.90%> (-0.47%) ⬇️

@ricardoV94 ricardoV94 changed the title Improve overhead of function call and deprecate rarely used utilities Reduce overhead of function call and deprecate rarely used utilities Oct 10, 2024
tests/compile/function/test_types.py Outdated Show resolved Hide resolved
pytensor/compile/function/types.py Outdated Show resolved Hide resolved
pytensor/compile/function/types.py Outdated Show resolved Hide resolved
@ricardoV94 ricardoV94 force-pushed the perf_enh branch 2 times, most recently from 7a2d4aa to 2faae23 Compare October 17, 2024 12:38
@ricardoV94 ricardoV94 force-pushed the perf_enh branch 2 times, most recently from 952c554 to 23b4fe9 Compare October 22, 2024 12:35
@ricardoV94
Copy link
Member Author

Marking as a draft because I want to add a note on the docstrings of pytensor.In about mutable and alias

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reconsider checking for input alias during function calls
2 participants