Welcome to Amazon Insight! This project allows users to gain deeper insights into their favorite Amazon products by scraping reviews in real time and leveraging advanced language models to answer questions about the product.
Demo video: `Amazon.Insight.Demo.mp4`
Amazon Insight is a multi-service application designed to provide a seamless user experience for extracting and analyzing Amazon product reviews. The application is composed of four main services:
- Spring Boot Service: Hosts the user interface and handles user interactions.
- Flask Service: Manages the scraping of Amazon product reviews and auxiliary tasks.
- Ollama Service: Hosts the LLM (currently llama3:8b) for generating responses based on the scraped reviews.
- PostgreSQL Database: Stores the scraped reviews and other relevant data.
All services are containerized using Docker, ensuring consistent and reproducible environments across development and production.
- Real-Time Review Scraping: Users can drop an Amazon product URL, and the application will scrape its reviews in real time.
- Interactive Q&A: After scraping, users can ask questions about the product, and the application will generate responses using the LLM.
- Customizable LLM Configuration: The LLM can be swapped or tuned through Ollama's configuration.
The project is structured as follows:
The Spring Boot service hosts the user interface and handles user interactions. Key components include:
- Controllers: Manage HTTP requests and responses.
  - `ChatController`
  - `ReviewController`
- Services: Handle business logic and communication with other services.
  - `ScraperService`
  - `OllamaClient`
The Flask service manages the scraping of Amazon product reviews and auxiliary tasks. Key components include:
- Routes: Define API endpoints for scraping and other tasks.
  - `scraper.py`
- Services: Implement the scraping logic and other functionalities.
  - `scraper.py`
- Schemas: Define data validation and serialization schemas.
  - `amazon_product.py`
- Models: Define the database models for storing scraped data.
  - `amazon.py`
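As a rough illustration of how the routes/services split fits together, here is a minimal sketch of a scraping endpoint. It is not the project's actual code: the endpoint path, the request payload, and the `scrape_reviews` helper are assumptions, the CSS selector is only illustrative, and real Amazon scraping needs sturdier parsing and anti-bot handling.

```python
# Hypothetical sketch -- in the real project the route would live in routes/scraper.py
# and the scraping logic in services/scraper.py.
import requests
from bs4 import BeautifulSoup
from flask import Flask, jsonify, request

app = Flask(__name__)


def scrape_reviews(url: str) -> list[str]:
    """Fetch the product page and extract review text (selector is illustrative)."""
    page = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=10)
    page.raise_for_status()
    soup = BeautifulSoup(page.text, "html.parser")
    return [node.get_text(strip=True) for node in soup.select('[data-hook="review-body"]')]


@app.post("/api/scrape")
def scrape():
    payload = request.get_json(silent=True) or {}
    url = payload.get("url")
    if not url:
        return jsonify({"error": "Missing 'url' in request body"}), 400
    return jsonify({"url": url, "reviews": scrape_reviews(url)})
```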
The Ollama service hosts the LLM used to generate responses based on the scraped reviews. The current model is llama3:8b, but this can be customized through Ollama's configuration.
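Ollama exposes an HTTP API on its default port 11434, and answering a question amounts to sending the scraped reviews and the question as a prompt to its `/api/generate` endpoint. The snippet below is a minimal sketch of that call, not the project's actual client code: the host name, prompt format, and sample reviews are assumptions.

```python
# Minimal sketch of querying Ollama's HTTP API (default port 11434).
# "localhost" assumes the port is published to the host; inside the Compose
# network the Ollama service's host name would be used instead.
import requests

reviews = [
    "Battery easily lasts two days.",
    "The strap broke after a week.",
]
question = "Is the battery life good?"

prompt = (
    "Answer the question using only these customer reviews:\n"
    + "\n".join(f"- {review}" for review in reviews)
    + f"\n\nQuestion: {question}"
)

response = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3:8b", "prompt": prompt, "stream": False},
    timeout=120,
)
response.raise_for_status()
print(response.json()["response"])
```

On the client side, switching models only requires swapping the `model` value for the name of another model available in the Ollama container.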
The PostgreSQL database stores the scraped reviews and other relevant data. Initialization scripts and Docker health checks ensure reliable database connectivity and performance.
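As a hedged sketch of the kind of table this involves (the actual schema in `models/amazon.py` may differ), a review model could look roughly like this:

```python
# Hypothetical SQLAlchemy model for scraped reviews -- names and types are assumptions.
from datetime import datetime, timezone

from sqlalchemy import Column, DateTime, Integer, String, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class AmazonReview(Base):
    __tablename__ = "amazon_reviews"

    id = Column(Integer, primary_key=True)
    product_url = Column(String(2048), nullable=False)  # URL the user submitted
    rating = Column(Integer, nullable=True)              # star rating, if captured
    body = Column(Text, nullable=False)                   # review text fed to the LLM
    scraped_at = Column(DateTime, default=lambda: datetime.now(timezone.utc))
```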
- Docker
- Docker Compose
- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/amazon-insight.git
  cd amazon-insight
  ```

- Create a `.env` file with the necessary environment variables (see the example below).

- Build and start the services using Docker Compose:

  ```bash
  docker-compose up --build
  ```
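The exact variables depend on the project's `docker-compose.yml`, but a typical `.env` for a stack like this looks something like the following. Treat it as an illustration rather than the required set; `POSTGRES_USER`, `POSTGRES_PASSWORD`, and `POSTGRES_DB` are the standard variables read by the official PostgreSQL image.

```env
# Illustrative values only -- match the variable names to those referenced in docker-compose.yml
POSTGRES_USER=amazon_insight
POSTGRES_PASSWORD=change-me
POSTGRES_DB=amazon_insight
```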
- Open your browser and navigate to `http://localhost:8080`.
- Drop an Amazon product URL to start scraping reviews in real time.
- Once the reviews are scraped, you will be redirected to a landing page where you can ask questions about the product.
- The application will generate responses using the LLM, based on the scraped reviews.
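If the page never loads or answers do not come back, a quick sanity check is to confirm the externally exposed services are reachable. The script below is only a sketch: it assumes the Compose file publishes the web UI on `localhost:8080` (as above) and Ollama's default port `11434` to the host, so adjust the URLs to your setup.

```python
# Quick reachability check for the externally exposed services.
# Port mappings are assumptions -- adjust to your docker-compose.yml.
import requests

SERVICES = {
    "Web UI (Spring Boot)": "http://localhost:8080",
    "Ollama API": "http://localhost:11434",
}

for name, url in SERVICES.items():
    try:
        status = requests.get(url, timeout=5).status_code
        print(f"{name}: HTTP {status}")
    except requests.RequestException as exc:
        print(f"{name}: unreachable ({exc})")
```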
The LLM can be customized through Ollama's configuration. To change the model, update the configuration in the `ollama_models` directory and restart the services.
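To see which models are actually installed in the running Ollama container, its API has a tags endpoint (again assuming port 11434 is published to the host):

```python
# List the models currently available to the Ollama service.
import requests

tags = requests.get("http://localhost:11434/api/tags", timeout=10)
tags.raise_for_status()
for model in tags.json().get("models", []):
    print(model["name"])
```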
We welcome contributions! Please read our contributing guidelines for more information.
This project is licensed under the GPL 3.0 License. See the `LICENSE` file for details.
Thank you for using Amazon Insight! We hope this project helps you gain valuable insights into your favorite Amazon products. If you have any questions or feedback, please feel free to open an issue or contact us.