Cannot transverse from clean_top_node to clean_doc or doc #497
Comment by monstrfolk: Please see codelucas/newspaper#863 for a fix for this issue.
AndyTheFactory added the labels bug (Something isn't working) and PR-verify (Has a PR, must be checked) on Oct 30, 2023
Ensured that cleaned_doc and cleaned_top_node are on the same DOM. Added an Article.text_clean property that returns the cleaned text of an article based on the clean_top_node.
This was referenced Nov 5, 2023
Issue by monstrfolk
Sun Dec 13 01:34:57 2020
Originally opened as codelucas/newspaper#862
Perhaps I misunderstand the relationship between clean_top_node and clean_doc or doc, but I cannot traverse from clean_top_node to clean_doc or doc.
For example, the following will not work:
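from newspaper import Article

a = Article('https://somesite.com/some_article')
a.download()
a.parse()
# Does not work: clean_top_node cannot be located from clean_doc's tree
print(a.clean_doc.getroottree().getpath(a.clean_top_node))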
I expect to be able to print the path from clean_doc/doc to clean_top_node.
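One plausible reading of the behavior above, consistent with the later note about putting both objects "on the same DOM", is that clean_top_node belongs to a different tree than clean_doc. The sketch below illustrates that general effect only; it is not newspaper's code. It uses the standard library's xml.etree.ElementTree instead of lxml to stay dependency-free, and `path_to`, the sample markup, and the use of `copy.deepcopy` as a stand-in for a separately cleaned document are all assumptions made for illustration.

```python
import copy
import xml.etree.ElementTree as ET


def path_to(root, target):
    """Return a simple /tag/tag path from root to target,
    or None if target is not an element of root's tree."""
    if root is target:
        return "/" + root.tag
    for child in root:
        sub = path_to(child, target)
        if sub is not None:
            return "/" + root.tag + sub
    return None


# Hypothetical document; the deep copy stands in for a cleaned copy of it.
doc = ET.fromstring("<html><body><div><p>text</p></div></body></html>")
clean_doc = copy.deepcopy(doc)

node_in_doc = doc.find("./body/div")         # node taken from the original tree
node_in_copy = clean_doc.find("./body/div")  # equivalent node in the copy

# A node can be located from the root of the tree it belongs to ...
same_tree_path = path_to(doc, node_in_doc)
# ... but not from the root of a different tree, even an identical copy.
cross_tree_path = path_to(clean_doc, node_in_doc)

print(same_tree_path)
print(cross_tree_path)
```

If this is indeed the cause, a path can only be computed when the node and the root come from the same tree, which is what the referenced fix aims to guarantee.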