Datasets

For author classification

https://archive.ics.uci.edu/ml/datasets/Victorian+Era+Authorship+Attribution
The full dataset has 53678 rows and 50 different authors.
We reduced the task to 2990 rows and 4 different authors.

For IMDb sentiment classification

https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews
The full dataset has 49582 rows and 2 labels: positive and negative.
We reduced the task to 3000 rows.

Independent models

Two separately trained models, one per task

Multitask model

One model with two heads: author classification and IMDb sentiment classification
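
The two-head layout can be sketched as follows. This is a minimal illustration, not the model used here: the README does not state the framework, architecture, or loss, so the NumPy forward pass, the layer sizes, the names (`params`, `forward`, `cross_entropy`), and the unweighted sum of the two task losses are all assumptions. Only the output sizes (4 authors, 2 sentiment labels) come from the dataset description.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 32   # hypothetical input size (e.g. a text embedding)
N_HIDDEN = 16     # hypothetical width of the shared layer
N_AUTHORS = 4     # from the README: 4 authors
N_SENTIMENTS = 2  # from the README: positive and negative

# One shared encoder plus one linear head per task.
params = {
    "W_shared": rng.normal(0.0, 0.1, (N_FEATURES, N_HIDDEN)),
    "W_author": rng.normal(0.0, 0.1, (N_HIDDEN, N_AUTHORS)),
    "W_sentiment": rng.normal(0.0, 0.1, (N_HIDDEN, N_SENTIMENTS)),
}


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def forward(x, task):
    """Run the shared encoder, then the head that belongs to `task`."""
    h = np.tanh(x @ params["W_shared"])
    head = params["W_author"] if task == "author" else params["W_sentiment"]
    return softmax(h @ head)


def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true labels."""
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))


# Random stand-in batches; real inputs would be encoded text.
x_author = rng.normal(size=(8, N_FEATURES))
y_author = rng.integers(0, N_AUTHORS, size=8)
x_imdb = rng.normal(size=(8, N_FEATURES))
y_imdb = rng.integers(0, N_SENTIMENTS, size=8)

# Assumed multitask objective: plain sum of the two task losses.
loss = (cross_entropy(forward(x_author, "author"), y_author)
        + cross_entropy(forward(x_imdb, "sentiment"), y_imdb))
print(f"combined loss: {loss:.3f}")
```

In this sketch both tasks update `W_shared` while each head sees only its own dataset. The independent setup would instead keep a separate encoder per task.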

Results

Authors

    48   0   0   0
     1  20   0   0
     1   0   4   1
     2   8   0  64
Accuracy: 91%

IMDb

    70   7
     4  69
Accuracy: 93%
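
The reported accuracies can be checked against the grids of counts, assuming these are confusion matrices with correct predictions on the diagonal. The helper name `accuracy` is illustrative, and the README does not say which axis holds the true labels (accuracy does not depend on that).

```python
def accuracy(matrix):
    """Share of samples on the diagonal of a square confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Counts copied from the Results section.
authors = [
    [48, 0, 0, 0],
    [1, 20, 0, 0],
    [1, 0, 4, 1],
    [2, 8, 0, 64],
]
imdb = [
    [70, 7],
    [4, 69],
]

print(f"Authors: {accuracy(authors):.1%}")  # 136 correct out of 149
print(f"IMDb: {accuracy(imdb):.1%}")        # 139 correct out of 150
```

This gives 91.3% for authors and 92.7% for IMDb, which round to the 91% and 93% reported above.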