This Project facilitates the conversion of Excel files to SQLite databases and provides functionality to execute SQL queries on the generated database. Additionally, it includes converting natural language text to SQL queries using state-of-the-art models. We have implemented two primary models, namely CodeLlama and SQLCoder, to perform this task.
excel_db.py: Python script for converting Excel data to a database. inventory.xlsx: Sample Excel file containing inventory data (input file). inventory.db: SQLite database file created by the script (output file). nsql.py:This Python script generates SQL queries based on user prompts and executes them on a SQLite database. It utilizes a pre-trained language model to convert natural language instructions into SQL queries.
-
CodeLlama
- CodeLlama-34b-Instruct-hf (with 34 billion parameter)
- CodeLlama-13b-Instruct-hf (with 13 billion parameter)
-
SQLCoder
- SQLCoder-34b-Alpha (with 34 billion parameter)
- SQLCoder2 (with 15 billion parameter)
- We could load 4-bit, 8-bit, and 16-bit quantization in SQLCoder2 model
- Whereas in CodeLlama model we could load 13 billion and 34 billion parameters.
However, while SQLCoder2 model can be quantized to 4-bit, 8-bit and 16 bit only 4-bit and 8-bit quantization operations are currently supported.
We seamlessly integrate SQLite 3, a lightweight relational database, offering efficient data storage and retrieval. Leveraging its simplicity and reliability, we ensure robust database operations for our system
-
Database 1
- Tables:
- inventory
- Tables:
-
Database 2
- Tables:
- actions
- alerts
- conditions
- operations
- Tables:
-
Database 3
- Tables:
- acknowledges
- alerts
- actions
- events
- media_type
- users
- Tables:
Environment Setup Create Virtual Environment: Set up a virtual environment using Python's built-in venv module.
python -m venv myenv
Activate Virtual Environment: Activate the virtual environment to isolate dependencies.
On Windows:
myenv\Scripts\activate
On macOS and Linux:
source myenv/bin/activate
Install Required Packages: Install the necessary packages using pip.
pip install -r requirements.txt
Prepare your data in an Excel file named inventory.xlsx (or any other desired name).
Execute the Python script excel_db.py. The script will read the Excel file, convert it to CSV format, and create an SQLite database file (inventory.db) with corresponding tables.
python excel_db.py
Check Database After execution, you will find the generated SQLite database file (inventory.db) containing the converted data. Execute SQL Queries To execute SQL queries on the generated database, run the script nsql.py.
python nsql.py
Follow the prompts to input your SQL query and interact with the database.
Note
- Ensure that the required Excel file (inventory.xlsx) is present in the directory before running the script. Additionally, review the generated SQLite database (inventory.db) to verify the converted data.
- Here we have used SQLCODER2 4-bit quantized version , which occupies GPU RAM of around 9GB . Consider other models and quantizations as per your GPU storage .
- We have considered one table in one sheet for faster inference , for multiple tables add details in further sheets of excel.
sqlite: (https://druid.apache.org/docs/latest/design/)
hugginface: (https://huggingface.co/docs/transformers/index)