Skip to content

Data sets used by the project of my master's degree in Computer Science ("Study and Research in Anti-Spam Systems").

License

Notifications You must be signed in to change notification settings

marcelovca90-unifei/anti-spam-weka-data

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

anti-spam-weka-data

Data sets used by the project of my master's degree in Computer Science ("Study and Research in Anti-Spam Systems").

Prerequisites:

Git + P7ZIP + JDK 8 / OpenJDK 8 + Apache Maven

1. Cloning and building the main project (anti-spam-weka-cli):

git clone https://github.com/marcelovca90/anti-spam-weka-cli ~/anti-spam-weka-cli/
cd ~/anti-spam-weka-cli/AntiSpamWekaCLI/ && mvn clean install

2. Cloning and extracting the data sets (anti-spam-weka-data):

git clone https://github.com/marcelovca90/anti-spam-weka-data ~/anti-spam-weka-data/
cd ~/anti-spam-weka-data/ && 7z x LING_SPAM.7z
cd ~/anti-spam-weka-data/ && 7z x SPAM_ASSASSIN.7z
cd ~/anti-spam-weka-data/ && cat TREC.7z.part-* > TREC.7z && 7z x TREC.7z -pPASSWORD
cd ~/anti-spam-weka-data/ && cat UNIFEI_CRUDE.7z.part-* > UNIFEI_CRUDE.7z && 7z x UNIFEI_CRUDE.7z -pPASSWORD
cd ~/anti-spam-weka-data/ && cat UNIFEI_DELTA_0.7z.part-* > UNIFEI_DELTA_0.7z && 7z x UNIFEI_DELTA_0.7z -pPASSWORD
cd ~/anti-spam-weka-data/ && cat UNIFEI_DELTA_3.7z.part-* > UNIFEI_DELTA_3.7z && 7z x UNIFEI_DELTA_3.7z -pPASSWORD
cd ~/anti-spam-weka-data/ && cat UNIFEI_DELTA_7.7z.part-* > UNIFEI_DELTA_7.7z && 7z x UNIFEI_DELTA_7.7z -pPASSWORD

3. [LINUX] Installing the high-performance linear algebra library (netlib-java):

sudo apt-get install libatlas3-base libopenblas-base
sudo update-alternatives --config libblas.so
sudo update-alternatives --config libblas.so.3
sudo update-alternatives --config liblapack.so
sudo update-alternatives --config liblapack.so.3

4. Configuring and performing training and testing

cd ~/anti-spam-weka-cli/AntiSpamWekaCLI/ && vi run.properties
java -Xms<INITIAL_HEAP_SIZE> -Xmx<MAXIMUM_HEAP_SIZE> -Xss<STACK_SIZE> -jar ./target/AntiSpamWekaCLI-0.0.1-SNAPSHOT-jar-with-dependencies.jar

About

Data sets used by the project of my master's degree in Computer Science ("Study and Research in Anti-Spam Systems").

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published