DrFAQ

DrFAQ is a plug-and-play question-answering chatbot that can be applied to any organisation's text corpora.
It is built on an NLP question-answering architecture that uses spaCy, Hugging Face's BERT language model, Elasticsearch, and the Telegram Bot API, and it is hosted on Heroku.

News

4 Mar 2021 - Transfer learning of language models, alongside an evaluation study, is in progress.
13 Dec 2019 - Implementation of the 4-step question-answering methodology completed.

Objective

Given an organisation's corpus of documents, generate a chatbot to enable natural question-answering capabilities.

Methodology

When a question is asked, the following steps are tried in order, each one acting as a fallback for the step before it:

1. FAQ Question Matching using spaCy's Similarity - /match
   - Given a list of Frequently Asked Questions (FAQs), the chatbot compares the question asked against each FAQ and, if one is similar enough, returns that FAQ's answer.
2. NLP Question Answering using Hugging Face's BERT - /nlp
   - If the question asked is not similar to any existing FAQ, perform question answering on the knowledge base and return the answer if it is sufficiently confident.
3. Answer Search using Elasticsearch - /search
   - If no sufficiently confident answer is found, perform a search on the document corpus and return the search results.
4. Human Intervention
   - If the search results are still not relevant, prompt a human to add the question-answer pair to the existing FAQ list, or direct the user to speak to a human.
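
The four steps above form a fallback cascade, which can be sketched as follows. This is only an illustrative outline, not DrFAQ's actual code: the function names, both thresholds, and the return format are assumptions, and `difflib` stands in for spaCy's similarity so the sketch runs with the standard library alone. The `qa_model` and `search` arguments are hypothetical callables standing in for the BERT question-answering step and the Elasticsearch query. An empty result list is treated as "not relevant", which simplifies the relevance check described in step 4.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical thresholds; the values DrFAQ actually uses are not stated here.
MATCH_THRESHOLD = 0.8
QA_THRESHOLD = 0.5


def match_faq(question: str, faqs: Dict[str, str]) -> Tuple[Optional[str], float]:
    """Step 1 stand-in: return the answer and score of the most similar FAQ."""
    best_answer, best_score = None, 0.0
    for faq_question, faq_answer in faqs.items():
        # difflib string similarity replaces spaCy's vector similarity here.
        score = SequenceMatcher(None, question.lower(), faq_question.lower()).ratio()
        if score > best_score:
            best_answer, best_score = faq_answer, score
    return best_answer, best_score


def answer(
    question: str,
    faqs: Dict[str, str],
    qa_model: Callable[[str], Tuple[str, float]],
    search: Callable[[str], List[str]],
) -> Tuple[str, object]:
    """Try each step in order and return (step_name, result)."""
    # Step 1: FAQ question matching.
    faq_answer, score = match_faq(question, faqs)
    if faq_answer is not None and score >= MATCH_THRESHOLD:
        return ("match", faq_answer)

    # Step 2: question answering; qa_model returns (answer, confidence).
    qa_answer, confidence = qa_model(question)
    if qa_answer and confidence >= QA_THRESHOLD:
        return ("nlp", qa_answer)

    # Step 3: document search; an empty list counts as "nothing relevant".
    hits = search(question)
    if hits:
        return ("search", hits)

    # Step 4: hand over to a human.
    return ("human", "No relevant result. Add this question to the FAQ list or speak to a human.")
```

For example, with `faqs = {"What are your opening hours?": "9am to 5pm."}`, the question "what are your opening hours" is resolved at the first step, while an unrelated question falls through to whichever of `qa_model` or `search` first produces a usable result.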

Research

A benchmark study of transfer learning with language models suggests that:
- If a large and clean QA dataset is available, RoBERTa is the best-performing language model.
- If only a small, noisy, generated QA dataset is available, MobileBERT is the best-performing language model.
- If the QA dataset contains many 'Who' questions, RoBERTa should be considered.

Future Work

Release DrFAQ as a pip package.
Make an interactive demo available.
Integrate abstractive question-answering into the methodology.
Leverage databases and cloud services.

References

explosion/spaCy - Industrial-strength Natural Language Processing (NLP) with Python and Cython
huggingface/transformers - Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch
elastic/elasticsearch-py - Official Python low-level client for Elasticsearch
python-telegram-bot/python-telegram-bot - Python Wrapper for Telegram Bots
google-research/bert - TensorFlow code and pre-trained models for BERT
BERT - Pre-training of Deep Bidirectional Transformers for Language Understanding
