This repository has been archived by the owner on Jun 14, 2020. It is now read-only.

Extracting data from structured information extracted from Wikipedia #31

Open
5hirish opened this issue Oct 14, 2018 · 5 comments
Labels
help wanted Stuck here, any help would be appreciated needs discussion Some discussion required

Comments

@5hirish
Owner

5hirish commented Oct 14, 2018

Your Environment

$ python -m qas.sys_info
# To get the system info, paste the output of the above command here
  • Question you were trying to ask:
@5hirish 5hirish added help wanted Stuck here, any help would be appreciated needs discussion Some discussion required labels Oct 14, 2018
@5hirish 5hirish self-assigned this Oct 14, 2018
@meghanabhange

Has this issue been closed? I believe I can contribute effectively to it.

@5hirish
Owner Author

5hirish commented Oct 8, 2019

@meghanabhange No, this issue is not closed yet. I just no longer have time to maintain this project, but I can help you get started if you are interested.

@meghanabhange

Okay great.
So, to get started, was there a specific measurable result you had in mind that could be achieved by resolving this issue?

@5hirish
Owner Author

5hirish commented Oct 10, 2019

Any given Wikipedia page has both structured and unstructured information. Consider this example question: 'How many career points does Sebastian Vettel have?' The answer is stored in structured (tabular) form, not in the text of Sebastian's Wikipedia page. Wiki: Sebastian Vettel. As far as I can remember, this project can extract data from tables (horizontal/vertical) in a key-value format and store it in Elasticsearch, but it doesn't know how to query that data.
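To make the missing piece concrete, here is a minimal sketch of what a query against the stored key-value data could look like. It only builds an Elasticsearch query DSL body as a plain dictionary. The field names `page_title` and `key` and the helper `build_structured_query` are hypothetical, since the thread does not describe the actual index mapping.

```python
import json


def build_structured_query(page_title, attribute):
    """Build a query DSL body that looks up one key-value row.

    Assumption: each table row is indexed as a document with a
    `page_title` field and a `key` field. These names are made up
    for illustration and are not taken from the project.
    """
    return {
        "query": {
            "bool": {
                # Both conditions must match: the right page and the right key.
                "must": [
                    {"match": {"page_title": page_title}},
                    {"match": {"key": attribute}},
                ]
            }
        },
        # Only the best-scoring row is needed to answer the question.
        "size": 1,
    }


body = build_structured_query("Sebastian Vettel", "career points")
print(json.dumps(body, indent=2))
```

The resulting dictionary could be sent as the request body to Elasticsearch's `_search` endpoint. Mapping a natural-language question to the `page_title` and `attribute` arguments is the harder, still unsolved part.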

@5hirish
Owner Author

5hirish commented Oct 10, 2019

If you are up for the challenge, I suggest you create a separate module to query the structured info, as the unstructured one also needs a lot of performance fixes and improvements. I can help you with Elasticsearch or if you get blocked anywhere in the codebase.
