TOSRoberta is an advanced Terms of Service (ToS) analyzer powered by a fine-tuned RoBERTa-large model. It classifies clauses in ToS documents based on their fairness level, helping users quickly identify potentially unfair terms.
- 📊 Analyzes ToS documents and classifies clauses into three categories:
- ✅ Clearly Fair
⚠️ Potentially Unfair- ❌ Clearly Unfair
- 📁 Supports both PDF and text file uploads
- 💻 User-friendly web interface built with Streamlit
- 🧠 Powered by a fine-tuned RoBERTa-large model (CodeHima/Tos-Roberta)
Our Tos-Roberta model demonstrates strong performance on the task of ToS clause classification:
- Validation Accuracy: 89.64%
- Test Accuracy: 85.84%
Detailed performance metrics per epoch:
Epoch | Training Loss | Validation Loss | Accuracy | F1 Score | Precision | Recall |
---|---|---|---|---|---|---|
1 | 0.443500 | 0.398950 | 0.874699 | 0.858838 | 0.862516 | 0.874699 |
2 | 0.416400 | 0.438409 | 0.853012 | 0.847317 | 0.849916 | 0.853012 |
3 | 0.227700 | 0.505879 | 0.896386 | 0.893325 | 0.891521 | 0.896386 |
4 | 0.052600 | 0.667532 | 0.891566 | 0.893167 | 0.895115 | 0.891566 |
5 | 0.124200 | 0.747090 | 0.884337 | 0.887412 | 0.891807 | 0.884337 |
tos-analyzer/
│
├── app.py
├── requirements.txt
├── utils/
│ ├── __init__.py
│ ├── text_processing.py
│ └── model_utils.py
└── README.md
-
Clone the repository:
git clone https://github.com/HimanshuMohanty-Git24/TOSRoberta.git cd TOSRoberta
-
Install the required dependencies:
pip install -r requirements.txt
-
Run the Streamlit app:
streamlit run app.py
We used Weights & Biases for monitoring the training process. Here's a glimpse of our training metrics:
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- Hugging Face for the Transformers library
- Streamlit for the easy-to-use web app framework
- Weights & Biases for experiment tracking
Himanshu Mohanty - CodingHima - [email protected]
Project Link: https://github.com/HimanshuMohanty-Git24/TOSRoberta
⭐️ If you find this project useful, please consider giving it a star!