aniketwdubey/chatpdf

ChatPDF

ChatPDF is a document retrieval application that uses Retrieval-Augmented Generation (RAG) to let users query uploaded PDF documents. Users ask questions about a document's content, and a Large Language Model (LLM) answers using the passages retrieved from that document.

Features

  • PDF Upload: Users can upload PDF files for processing.
  • AI Interaction: Ask questions about the content of the uploaded PDFs.
  • Machine Learning Integration: Uses LangChain and Hugging Face Transformers models for document processing and question answering.

Technologies Used

  • Backend: FastAPI
  • Frontend: Streamlit
  • Machine Learning: LangChain, Hugging Face Transformers
  • Vector Store: FAISS for efficient similarity search
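
The retrieval step behind RAG can be illustrated with a small, dependency-free sketch. This is not the project's implementation: it assumes a toy bag-of-words "embedding" and a brute-force cosine ranking as stand-ins for the Hugging Face embedding model and the FAISS index, and every function name and parameter below is hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (sizes are illustrative)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, stop, step)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase token counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (brute force)."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble the text that would be passed to the LLM."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


sample_chunks = [
    "FAISS is a library for similarity search over dense vectors.",
    "Streamlit renders the chat interface in the browser.",
    "FastAPI exposes the upload and ask endpoints.",
]
print(build_prompt("What does FAISS search?",
                   retrieve("What does FAISS search?", sample_chunks, k=1)))
```

A real vector store replaces the sorted scan with an index lookup, which is what keeps similarity search efficient as the number of chunks grows.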

Installation

  1. Clone the repository:

    git clone https://github.com/aniketwdubey/chatpdf.git
    cd chatpdf
  2. Create a virtual environment and activate it:

    python -m venv .venv
    source .venv/bin/activate  # On Windows use .venv\Scripts\activate
  3. Install the required packages:

    pip install -r requirements.txt

Usage

  1. Start the FastAPI server (uvicorn serves it at http://127.0.0.1:8000 by default):

    uvicorn app.main:app --reload
  2. In a second terminal, start the Streamlit app:

    streamlit run app/streamlit_app.py
  3. Navigate to http://localhost:8501 in your web browser to access the application.

API Endpoints

  • GET /: Returns a welcome message.

  • POST /upload_pdf/: Uploads a PDF file for processing.

    • Request: Multipart form data with the PDF file.
    • Response: Success message upon successful upload and processing.
  • POST /ask/: Asks a question about the uploaded PDF.

    • Request: JSON body with the question.
    • Response: The answer to the question based on the PDF content.
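
The endpoints above could be called from a script roughly as follows. This is a minimal sketch using only the Python standard library, and it rests on assumptions the README does not confirm: the multipart field name `file`, the JSON key `question`, and the base URL `http://localhost:8000` (uvicorn's default port). The helper names are hypothetical; check `app/main.py` for the actual field names before relying on it.

```python
import json
import urllib.request
import uuid

BASE_URL = "http://localhost:8000"  # assumption: uvicorn default host and port


def build_upload_request(pdf_bytes: bytes, filename: str,
                         field_name: str = "file") -> urllib.request.Request:
    """Build a multipart/form-data POST for /upload_pdf/.

    field_name is an assumption; the README does not name the form field.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (f'Content-Disposition: form-data; name="{field_name}"; '
         f'filename="{filename}"\r\n').encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        f"{BASE_URL}/upload_pdf/",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def build_ask_request(question: str,
                      key: str = "question") -> urllib.request.Request:
    """Build a JSON POST for /ask/ (the key name is an assumption)."""
    payload = json.dumps({key: question}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/ask/",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# With the FastAPI server running, a session might look like:
#   with open("paper.pdf", "rb") as fh:
#       print(send(build_upload_request(fh.read(), "paper.pdf")))
#   print(send(build_ask_request("What is the main conclusion?")))
```

Building the requests separately from sending them makes it easy to inspect the payloads without a running server.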

Testing

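  1. To test the application manually, start the FastAPI server as described under Usage, then launch the Streamlit app and exercise it in the browser: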

    streamlit run app/streamlit_app.py
