Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Images and interwiki #222

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Conversation

bt2901
Copy link

@bt2901 bt2901 commented Oct 15, 2019

I'm aware of #194 and your reservations about it.

This code was needed for my project anyway, so I figured it would do no harm to PR it. Can we salvage something? I think at least tag.py makes sense to be merged upstream.

What's, in your opinion, the best way to do it? Some sort of context object or config?

@earwig
Copy link
Owner

earwig commented Oct 16, 2019

The tag.py change is quite a hack and makes me uncomfortable. What we might need here is some awareness that padding is required when stripping tags in certain cases. (As an example, you have the same problem if you try to strip "foo<br>bar", but we don't want to break "foo<i>bar</i>".) There is definitely a bug to fix here, but I need to think more carefully about how to solve it generally.

Does this just require a way to determine block vs. inline tags? Easier said than done: consider "foo<span>bar</span>" vs. "foo<span style="display: block;">bar</span>". Okay, so that's definitely an unlikely edge case. Still, I need to think about it. Something is arguably better than nothing—we can't be perfect unless we actually render the page.

Some sort of context object or config?

Yes, I think we'll want to build something like this. It can be set up automatically by the interfaces provided by pywikibot etc. so the user only needs to manage it themselves if they are using the parser directly. My main hesitation here is that once we build it, we need to commit to supporting it, and it will introduce a new maintenance burden that could be frustrating if we aren't careful. I would love if this could be pulled dynamically from the live wiki configuration somehow so we don't need to hardcode anything in the parser, and then locally cache it as appropriate. Either way, this feature would be entirely optional and would need a way to plug into interfaces from other libraries for querying the API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants