"primary_location:source.source.type" == "type"? #291
Comments
Thanks for the details (and cross-checking with the OpenAlex team)! I'm having some trouble following the narrative though - is the stuff about [...] Could you give us a small reprex isolating the problem? Then we can use that as the basis for debugging.
I am currently conducting a citation analysis based on “works” data obtained from OpenAlex. This dataset is my primary source for data analysis and data mining aimed at understanding publications (mainly articles) and their citations.
I primarily use the “host_organization” field to analyze our publishers and selected publishers along with their associated journals. This analysis helps us identify which journals are frequently cited and determine how many times individual journals or publishers have been cited over the years.
During the process, I kept only the rows with a non-empty "issn_l" value, for the years 2019 to 2023.
# Filter rows where issn_l is neither NA nor an empty string
articles_cited <- works_cited_final[!is.na(works_cited_final$issn_l) & works_cited_final$issn_l != "", ]
After this filter, ~20% of the remaining articles have NA in "referenced_works". For example, https://openalex.org/works/W2980882411
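To make that share easier to reproduce, here is a minimal sketch on toy data. It assumes "referenced_works" is a list column in which a missing reference list is stored as a single NA; the toy data frame and the helper `has_no_refs` are hypothetical and only stand in for the real `works_cited_final`.

```r
# Toy stand-in for works_cited_final; all values are invented for illustration.
toy <- data.frame(
  id = c("W1", "W2", "W3", "W4", "W5"),
  issn_l = c("1234-5678", NA, "", "2345-6789", "3456-7890"),
  stringsAsFactors = FALSE
)
# Assumption: referenced_works is a list column, holding a single NA when empty.
toy$referenced_works <- list(c("W10", "W11"), NA, "W12", NA, "W13")

# Same filter as in the post: keep rows with a non-missing, non-empty issn_l.
kept <- toy[!is.na(toy$issn_l) & toy$issn_l != "", ]

# Hypothetical helper: TRUE when a row has no referenced works recorded.
has_no_refs <- function(x) {
  vapply(x, function(r) length(r) == 0 || all(is.na(r)), logical(1))
}

# Share of kept rows without any referenced works.
share_missing <- mean(has_no_refs(kept$referenced_works))
share_missing
```

Running the same two steps on the real data frame would give the proportion reported here, provided the column has the shape assumed in this sketch.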
I met with OpenAlex staff, and they said that some book chapters have an ISSN too. They recommended adding a filter on "primary_location:source.source.type" = "journal".
The OpenAlex technical documentation states: "Journal articles will have a primary_location.source.type of journal".
I checked the data I pulled, which has 38 columns. One column is named "type". Does it correspond to "primary_location:source.source.type"? If not, how can I get that field?
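For context, in the OpenAlex work object the top-level `type` describes the work itself (for example `article` or `book-chapter`), while `primary_location.source.type` describes the venue (for example `journal`). The sketch below illustrates the difference with hand-written nested lists that mimic the shape of raw API records. The records and the helper `source_type` are hypothetical, and whether the pulled data frame exposes the source type as its own column is not something this sketch confirms.

```r
# Hand-written records mimicking the nested shape of OpenAlex work objects.
# All values are invented for illustration.
records <- list(
  list(id = "W1", type = "article",
       primary_location = list(source = list(type = "journal"))),
  list(id = "W2", type = "book-chapter",
       primary_location = list(source = list(type = "book series"))),
  list(id = "W3", type = "article",
       primary_location = list(source = NULL))
)

# Hypothetical helper: pull the source type, or NA when no source is attached.
source_type <- function(rec) {
  st <- rec$primary_location$source$type
  if (is.null(st)) NA_character_ else st
}

# Flatten to one row per work, keeping both kinds of "type" side by side.
flat <- data.frame(
  id = vapply(records, function(r) r$id, character(1)),
  type = vapply(records, function(r) r$type, character(1)),
  source_type = vapply(records, source_type, character(1)),
  stringsAsFactors = FALSE
)

# Keep only works whose primary location is a journal.
journal_works <- flat[!is.na(flat$source_type) & flat$source_type == "journal", ]
journal_works$id
```

In this toy example the work-level `type` and the source-level type are separate fields, so a column named "type" alone would not be enough to apply the recommended filter if it holds the work-level value.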
Thank you very much,