Apply ignore logic before symlink resolution #3846
Merged
hauntsaninja force-pushed the better-symlink branch from 0ebe16c to 8e829be August 19, 2023 03:51
hauntsaninja force-pushed the better-symlink branch from 02644eb to 2a8804c August 19, 2023 03:54
JelleZijlstra approved these changes Sep 7, 2023
This was referenced Oct 24, 2023
hauntsaninja added a commit to hauntsaninja/black that referenced this pull request Feb 11, 2024

This relates to psf#4015, psf#4161 and the behaviour of os.getcwd().

Black is a big user of pathlib and as such loves doing `.resolve()`, since for a long time it was the only good way of getting an absolute path in pathlib. However, this has two problems:

The first, minor problem is performance, e.g. in psf#3751 I (safely) got rid of a bunch of `.resolve()` calls, which made Black 40% faster on cached runs.

The second, more important problem is that always resolving symlinks results in unintuitive exclusion behaviour. For instance, a gitignored symlink should never alter formatting of your actual code. This was reported by users a few times.

In psf#3846, I improved the exclusion rule logic for symlinks in `gen_python_files` and everything was good.

But `gen_python_files` isn't enough, there's also `get_sources`, which handles user-specified paths directly (instead of files Black discovers). So in psf#4015, I made a very similar change to psf#3846 for `get_sources`, and this is where some problems began.

The core issue was the line:

```
root_relative_path = path.absolute().relative_to(root).as_posix()
```

The first issue is that despite root being computed from user inputs, we call `.resolve()` while computing it (likely unnecessarily). This means that `path` may not actually be relative to `root`. So I started off this PR trying to fix that, when I ran into the second issue, which is that `os.getcwd()` (as called by `os.path.abspath` or `Path.absolute` or `Path.cwd`) also often resolves symlinks!

```
>>> import os
>>> os.environ.get("PWD")
'/Users/shantanu/dev/black/symlink/bug'
>>> os.getcwd()
'/Users/shantanu/dev/black/actual/bug'
```

This also meant that the breakage often would not show up when the inputs were relative paths.

This doesn't affect `gen_python_files` / psf#3846 because things are always absolute and known to be relative to `root`.

Anyway, it looks like psf#4161 fixed the crash by just swallowing the error and ignoring the file. Instead, we should just try to compute the actual relative path. I think this PR should be quite safe, but we could also consider reverting some of the previous changes; the associated issues weren't too popular.

At the same time, I think there's still behaviour that can be improved and I kind of want to make larger changes, but maybe I'll save that for if we do something like psf#3952.

Hopefully fixes psf#4205, fixes psf#4209, actual fix for psf#4077
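To make the `relative_to` failure described above concrete, here is a minimal sketch that reproduces the mismatch in a temporary directory. It is not Black's code: the directory names (`actual`, `symlink`), the file `mod.py`, and the helper `relative_to_root_demo` are made up for illustration, and it assumes a platform where unprivileged symlink creation works (Linux or macOS).

```python
import os
import tempfile
from pathlib import Path


def relative_to_root_demo() -> dict:
    """Show how a resolved root and an unresolved path can disagree."""
    with tempfile.TemporaryDirectory() as tmp:
        # Resolve the temp dir itself so only our own symlink is in play
        # (e.g. /tmp is itself a symlink on macOS).
        base = Path(tmp).resolve()
        actual = base / "actual"
        actual.mkdir()
        (actual / "mod.py").write_text("x = 1\n")

        # Hypothetical layout: "symlink" points at "actual".
        link = base / "symlink"
        os.symlink(actual, link, target_is_directory=True)

        root = link.resolve()   # root computed with .resolve(): .../actual
        path = link / "mod.py"  # user-specified path, still via the symlink

        try:
            # Same shape as the line quoted in the commit message.
            path.absolute().relative_to(root)
            raised = False
        except ValueError:
            # .absolute() does not follow symlinks, so the prefixes differ.
            raised = True

        return {
            "raised": raised,
            # Resolving the path too makes both sides agree again.
            "resolved_relative": path.resolve().relative_to(root).as_posix(),
        }


result = relative_to_root_demo()
```

Resolving both sides is only one way to make the two paths comparable; it is shown here to illustrate the mismatch, not as the fix this commit applies.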
hauntsaninja added a commit that referenced this pull request Feb 12, 2024

This relates to #4015, #4161 and the behaviour of os.getcwd().

Black is a big user of pathlib and as such loves doing `.resolve()`, since for a long time it was the only good way of getting an absolute path in pathlib. However, this has two problems:

The first, minor problem is performance, e.g. in #3751 I (safely) got rid of a bunch of `.resolve()` calls, which made Black 40% faster on cached runs.

The second, more important problem is that always resolving symlinks results in unintuitive exclusion behaviour. For instance, a gitignored symlink should never alter formatting of your actual code. This kind of thing was reported by users a few times.

In #3846, I improved the exclusion rule logic for symlinks in `gen_python_files` and everything was good.

But `gen_python_files` isn't enough, there's also `get_sources`, which handles user-specified paths directly (instead of files Black discovers). So in #4015, I made a very similar change to #3846 for `get_sources`, and this is where some problems began.

The core issue was the line:

```
root_relative_path = path.absolute().relative_to(root).as_posix()
```

The first issue is that despite root being computed from user inputs, we call `.resolve()` while computing it (likely unnecessarily). This means that `path` may not actually be relative to `root`. So I started off this PR trying to fix that, when I ran into the second issue, which is that `os.getcwd()` (as called by `os.path.abspath` or `Path.absolute` or `Path.cwd`) also often resolves symlinks!

```
>>> import os
>>> os.environ.get("PWD")
'/Users/shantanu/dev/black/symlink/bug'
>>> os.getcwd()
'/Users/shantanu/dev/black/actual/bug'
```

This also meant that the breakage often would not show up when the inputs were relative paths.

This doesn't affect `gen_python_files` / #3846 because things are always absolute and known to be relative to `root`.

Anyway, it looks like #4161 fixed the crash by just swallowing the error and ignoring the file. Instead, we should just try to compute the actual relative path. I think this PR should be quite safe, but we could also consider reverting some of the previous changes; the associated issues weren't too popular.

At the same time, I think there's still behaviour that can be improved and I kind of want to make larger changes, but maybe I'll save that for if we do something like #3952.

Hopefully fixes #4205, fixes #4209, actual fix for #4077
This means, for instance, that a gitignored symlink cannot affect your formatting. Fixes #3527, fixes #3826
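As a rough illustration of what "ignore logic before symlink resolution" can look like, the sketch below walks a directory and tests each entry against an ignore predicate using the path as it appears on disk, without calling `.resolve()` first. This is not Black's `gen_python_files`: the function `iter_py_files`, the `is_ignored` callback, and the `src` / `ignored_link` layout are all hypothetical, and the demo assumes a platform where symlinks can be created without extra privileges.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Iterator


def iter_py_files(
    directory: Path, root: Path, is_ignored: Callable[[str], bool]
) -> Iterator[Path]:
    """Yield .py files under directory, skipping ignored entries.

    The ignore check uses the unresolved, root-relative path, so a symlink
    is matched under its own name rather than its target's.
    """
    for child in sorted(directory.iterdir()):
        rel = child.relative_to(root).as_posix()  # no .resolve() here
        if is_ignored(rel):
            continue
        if child.is_dir():  # follows symlinks; fine for this small demo
            yield from iter_py_files(child, root, is_ignored)
        elif child.suffix == ".py":
            yield child


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    (root / "src" / "app.py").write_text("x = 1\n")
    # Hypothetical gitignored symlink pointing at real source code.
    os.symlink(root / "src", root / "ignored_link", target_is_directory=True)

    def is_ignored(rel: str) -> bool:
        # Stand-in for a real .gitignore matcher.
        return rel == "ignored_link" or rel.startswith("ignored_link/")

    found = [
        p.relative_to(root).as_posix()
        for p in iter_py_files(root, root, is_ignored)
    ]
```

In this layout the ignored symlink is skipped under its own name, and `src/app.py` is still collected through its real location.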