Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Fix copy semantics in __array__ #60046

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

seberg
Copy link
Contributor

@seberg seberg commented Oct 15, 2024

This should fix the semantics of __array__. While rejecting copy=False is OK even if unnecessary, copy=True should never have been ignored and is dangerous.

Closes gh-57739, gh-59932

  • Still needs new tests.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

I have to figure out the tests. I think a test in form of:

arr1 = np.asarray(obj, copy=True)
arr2 = np.asarray(obj, copy=True)
assert not np.may_share_memory(arr1, arr2)
# Check that without copy always works:
assert_array_equal(arr1, np.asarray(obj))

if np_ver < 2:
    return  # copy=False semantics not supported

try:
   arr1 = np.asarray(obj, copy=False)
except ValueError:
   return  # An error is acceptable for `copy=False`

# If no error is given, multiple returns must be views:
arr2 = np.asarray(obj, copy=False)
assert np.may_share_memory(arr1, arr2)

will work nicely. But, right now I am not sure if there is a convenient pattern/parametrization to steal to cover all of the __array__ implementations here.

This fixes the semantics of ``__array__``.  While rejecting ``copy=False``
is pretty harmless, ``copy=True`` should never have been ignored and is
dangerous.
ret = take_nd(self.categories._values, self._codes)
if dtype and np.dtype(dtype) != self.categories.dtype:
return np.asarray(ret, dtype)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not understand why this is needed. If dtypes match, NumPy should make it a no-op? If dtype is None, it is the same as not passing.

@ajfriend
Copy link

Thanks for addressing this! Could this be released as a bug fix for v2.x?

@seberg
Copy link
Contributor Author

seberg commented Oct 21, 2024

FWIW, if anyone knows how to add decent tests for this, pushing to the PR is appreciated. Otherwise, I may try to figure out where to add at least tests for a few of these paths.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

COMPAT: Utilize copy keyword in __array__
2 participants