GROBID Inconsistent Reference Detection in Custom PDFs: Format Guidelines Needed #1154
Comments
By "detect references", do you mean detecting reference callouts (e.g. For the first case, there is generally not much training data in GROBID (Fulltext model), but it may be easier if you show me some examples of your generated documents.
GwptVMUJQT.pdf
I would like to know the formatting rules I need to follow when creating a new article PDF so that GROBID can accurately detect citations.
There are no "rules" for formatting a document so that GROBID recognises the references. It is more a matter of making the document look like a scientific article. Most importantly, the references in your document don't match the text, so it is normal that GROBID does not extract them correctly. I adjusted your document, and with some more consistency it now looks much better ;-) That said, the body does indeed look like an abstract.
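Since the main problem above is that the callouts in the text and the reference list do not match, a quick consistency check before generating the PDF may help. The following is only a minimal sketch, not part of GROBID: it assumes numeric callouts such as `[1]`, `[1, 2]` or `[4-6]` and a bibliography whose entries start with `[n]` or `n.`. The function name `find_callout_mismatches` is hypothetical.

```python
import re


def find_callout_mismatches(body: str, references: list[str]) -> dict:
    """Compare numeric callouts in the body with a numbered reference list.

    Assumption: callouts look like [3], [1, 2] or [4-6], and each reference
    entry starts with "[n]" or "n.". Other citation styles are not handled.
    """
    cited = set()
    # Collect every bracketed group made only of digits, commas, spaces, dashes.
    for group in re.findall(r"\[([\d,\s\-–]+)\]", body):
        for part in re.split(r"\s*,\s*", group.strip()):
            span = re.fullmatch(r"(\d+)\s*[-–]\s*(\d+)", part)
            if span:
                # Expand a range such as 4-6 into 4, 5, 6.
                cited.update(range(int(span.group(1)), int(span.group(2)) + 1))
            elif part.isdigit():
                cited.add(int(part))

    listed = set()
    for entry in references:
        label = re.match(r"\s*\[?(\d+)[\].]", entry)
        if label:
            listed.add(int(label.group(1)))

    return {
        "cited_but_missing": sorted(cited - listed),
        "listed_but_uncited": sorted(listed - cited),
    }


if __name__ == "__main__":
    body = "Earlier work [1, 2] and later work [4-5] disagree."
    references = ["[1] A. Author.", "[2] B. Author.", "[3] C. Author.", "[4] D. Author."]
    print(find_callout_mismatches(body, references))
```

If both lists come back empty, the callouts and the bibliography are at least numerically consistent. This does not guarantee that GROBID will extract them, but it removes one obvious source of mismatches.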
What is the correct format for a PDF file that GROBID can detect references in? I create PDFs myself, and sometimes they work and sometimes they don’t. I’m not sure about the formatting rules. Can you please let me know?