MODIFIED Deep Audio Classification

This is a fork of "Finding the genre of a song with Deep Learning" (https://medium.com/@juliendespois/finding-the-genre-of-a-song-with-deep-learning-da8f59a61194#.yhemoyql0), aimed at experimenting with more informative spectrogram-like image formats: keeping 16-bit magnitude instead of 8-bit, and also keeping the phase.
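To give a rough idea of what "16-bit magnitude plus phase" can look like, here is a minimal sketch using only numpy. It is not the code used in this repository: the function name `encode_magnitude_phase`, the decibel scaling and the `floor_db` parameter are all assumptions made for illustration.

```python
import numpy as np


def encode_magnitude_phase(stft, floor_db=-80.0):
    """Quantise a complex STFT into two uint16 channels.

    Hypothetical helper: the name, the dB scaling and floor_db are
    illustrative assumptions, not taken from this repository.
    """
    magnitude = np.abs(stft)
    # Decibels relative to the loudest bin; quiet bins are clipped at floor_db.
    peak = max(magnitude.max(), 1e-10)
    db = 20.0 * np.log10(np.maximum(magnitude, 1e-10) / peak)
    db = np.clip(db, floor_db, 0.0)
    # Map [floor_db, 0] dB onto the full 16-bit range [0, 65535].
    mag16 = np.round((db - floor_db) / -floor_db * 65535).astype(np.uint16)
    # Map the phase from [-pi, pi] onto the full 16-bit range [0, 65535].
    phase = np.angle(stft)
    phase16 = np.round((phase + np.pi) / (2 * np.pi) * 65535).astype(np.uint16)
    return mag16, phase16
```

With 16 bits the magnitude channel has 65536 levels instead of 256, and the second channel keeps the phase that a magnitude-only spectrogram throws away.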

Moreover, the vertical size of the slices is now 256 instead of 128, so the slices are rectangular instead of square.
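The slicing into 256-high slices can be sketched as follows. This is an illustration only, not the code in sliceSpectrogram.py: the function name is hypothetical, and the slice width of 128 is an assumption (the width is taken to be unchanged from the original project).

```python
import numpy as np

SLICE_HEIGHT = 256  # frequency bins per slice, as stated above
SLICE_WIDTH = 128   # time frames per slice (assumed, not stated in this README)


def slice_spectrogram(spectrogram):
    """Cut a 2D spectrogram (frequency x time) into 256x128 slices.

    Hypothetical helper: trailing frames that do not fill a whole
    slice are dropped.
    """
    height, width = spectrogram.shape
    if height < SLICE_HEIGHT:
        raise ValueError("spectrogram has fewer than %d rows" % SLICE_HEIGHT)
    n_slices = width // SLICE_WIDTH
    return [
        spectrogram[:SLICE_HEIGHT, i * SLICE_WIDTH:(i + 1) * SLICE_WIDTH]
        for i in range(n_slices)
    ]
```

For example, a spectrogram with 300 time frames yields two full slices; the remaining 44 frames are discarded.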

This code is a heavily experimental work in progress and may be broken at times.

Required install:

eyed3
sox --with-lame
librosa
numpy
Pillow (PIL)
tensorflow
tflearn

To create the song slices (might be long):

python main.py slice

To train the classifier (long too):

python main.py train

To test the classifier (fast):

python main.py test

Most editable parameters are in the config.py file; the model can be changed in the model.py file.
I haven't implemented the pipeline to label new songs with the model, but that can be done easily with the provided functions, plus eyed3 for the mp3 manipulation. Here's the full pipeline you would need to use.
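As a rough illustration of the final labelling step, one possible approach is to classify every slice of a song and keep the most frequent genre. The sketch below assumes that approach; it is not implemented in this repository, and both `label_song` and the `predict_slice` callable (standing in for the trained model) are hypothetical names.

```python
from collections import Counter


def label_song(slices, predict_slice):
    """Return (genre, vote_share) for a song by majority vote over its slices.

    Hypothetical helper: predict_slice is any callable that maps one
    slice to a genre name, e.g. a thin wrapper around the trained model.
    """
    votes = Counter(predict_slice(s) for s in slices)
    if not votes:
        raise ValueError("cannot label a song with no slices")
    genre, count = votes.most_common(1)[0]
    # Share of slices that agreed with the winning genre.
    return genre, count / sum(votes.values())
```

The vote share gives a crude confidence value; the winning genre could then be written to the mp3 tags with eyed3.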
