GitHub - IDEA-NTHU-Taiwan/data_mining_lab_fall_2: Data Mining Lab Session 2 (Fall 2017)

Hello Everyone,

Here is the list of packages needed for our second Data Mining Lab Session.

Software:

Python 3 (coding will be done strictly using Python 3)
Anaconda Environment (recommended but not mandatory) (https://www.continuum.io/downloads)
Jupyter (http://jupyter.org/)
Google's pre-trained word2vec vectors (download the file; warning: it is very large) (https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=sharing)
Gensim (https://radimrehurek.com/gensim/)
scikit-learn (http://scikit-learn.org/stable/) (get the latest version)
Pandas (http://pandas.pydata.org/)
Matplotlib (https://matplotlib.org/)
NLTK (for stopwords) (http://www.nltk.org/)
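
Once the word2vec file is downloaded, one way to load it is through Gensim's `KeyedVectors.load_word2vec_format`. The following is a minimal sketch rather than the lab's own code: the helper name `load_google_vectors` and the `limit` value in the usage note are hypothetical, and the file is assumed to be in the binary word2vec format.

```python
from pathlib import Path


def load_google_vectors(path, limit=None):
    """Load pre-trained word2vec vectors from a binary file.

    `limit` caps how many vectors are read (most frequent words first),
    which reduces memory use. Hypothetical helper, not part of the lab.
    """
    path = Path(path)
    if not path.is_file():
        # Fail early with a clear message before importing Gensim.
        raise FileNotFoundError(f"word2vec file not found: {path}")

    # Imported lazily so the path check above works without Gensim installed.
    from gensim.models import KeyedVectors

    return KeyedVectors.load_word2vec_format(str(path), binary=True, limit=limit)
```

A call might look like `vectors = load_google_vectors("GoogleNews-vectors-negative300.bin.gz", limit=200000)`, where the file name is an assumption about what the download is called; `vectors.most_similar("happy")` would then return the nearest words.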

Computing Resources:

Operating System: Preferably Linux or macOS (things may break on Windows, but you can try it out)
RAM: 4GB
Disk Space: 8GB (mostly to store word embeddings)
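
To confirm the disk requirement before downloading the embeddings, a quick check with the standard library could look like this. It is only a sketch: the function names are hypothetical, and the 8 GB default simply mirrors the figure listed above.

```python
import shutil


def free_disk_gb(path="."):
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 1024 ** 3


def has_enough_disk(path=".", required_gb=8.0):
    """Return True if at least `required_gb` GiB are free at `path`."""
    return free_disk_gb(path) >= required_gb


if __name__ == "__main__":
    print(f"Free space: {free_disk_gb():.1f} GiB")
    print("Enough room for the lab:", has_enough_disk())
```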

Test:

Once you have installed all the necessary packages, you can check that everything works by running the following Python code in a Jupyter notebook (the final %matplotlib line is a notebook-only command):

import logging
logging.root.handlers = []  # Jupyter messes up logging so needs a reset
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
from smart_open import smart_open
import pandas as pd
import numpy as np
from numpy import random
import gensim
import nltk
from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in scikit-learn 0.20
from sklearn import linear_model
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix
import matplotlib.pyplot as plt
from gensim.models import Word2Vec
from sklearn.neighbors import KNeighborsClassifier
from nltk.corpus import stopwords
%matplotlib inline
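
If one of the imports above fails, it can help to see every missing package at once instead of fixing them one error at a time. Here is a small sketch using only the standard library; the `REQUIRED` list and the function name `missing_packages` are assumptions based on the imports in the test code, not something the lab ships.

```python
import importlib.util

# Top-level import names used by the test code (scikit-learn imports as "sklearn").
REQUIRED = ["gensim", "sklearn", "pandas", "numpy", "matplotlib", "nltk", "smart_open"]


def missing_packages(names=REQUIRED):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

This only checks that the packages can be located; it does not import them or verify their versions.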

If you have any further questions, please feel free to contact me at [email protected]

Have Fun,

Elvis Saravia (Data Mining TA)
