Jarvis1000x/Variational_Autoencoder_for_Sound_Generation
Variational Autoencoder for Sound Generation

This repository implements a neural network architecture for generating sound with a Variational Autoencoder (VAE). The encoder and decoder are built from convolutional neural networks. Input audio is first converted into spectrograms, which are fed to the network; the network in turn outputs a spectrogram, which is converted back into audio for the final output.
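The preprocessing step described above can be sketched with plain numpy: frame the waveform, window each frame, take the magnitude of the FFT, and move to a log scale before normalising into [0, 1] for the network. This is a minimal illustration, not the repository's exact pipeline (the repo may rely on a library such as librosa, and the function names here are hypothetical):

```python
import numpy as np

def log_spectrogram(signal, frame_size=512, hop_length=256):
    """Frame the signal, apply a Hann window, take the magnitude of the
    real FFT, then log-compress -- a minimal stand-in for STFT-based
    spectrogram extraction."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop_length
    frames = np.stack([
        signal[i * hop_length : i * hop_length + frame_size] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Transpose so rows are frequency bins, like a spectrogram image.
    return np.log1p(magnitude).T

def min_max_normalise(spec):
    """Scale the spectrogram into [0, 1] before feeding it to the VAE."""
    return (spec - spec.min()) / (spec.max() - spec.min())

# Example: one second of a 440 Hz tone at a 22050 Hz sample rate.
sr = 22050
t = np.arange(sr) / sr
spec = min_max_normalise(log_spectrogram(np.sin(2 * np.pi * 440 * t)))
print(spec.shape)  # (257, 85): freq bins x frames
```

Going back from a generated magnitude spectrogram to a waveform requires phase reconstruction (e.g. the Griffin-Lim algorithm), since the magnitude alone discards phase information.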

Dataset

This Variational Autoencoder was trained on the Free Spoken Digit Dataset (FSDD) to generate sounds of spoken digits. It can also be trained on other audio datasets, such as Google's NSynth and MAESTRO, to produce musical notes.
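Whatever dataset the VAE is trained on, the objective is the standard VAE loss: a reconstruction term on the spectrogram plus a KL term pulling the latent posterior toward a unit Gaussian. A minimal numpy sketch, assuming a diagonal-Gaussian posterior and MSE reconstruction (the `kl_weight` knob is hypothetical; the repo's exact weighting may differ):

```python
import numpy as np

def vae_loss(x, x_hat, mu, log_var, kl_weight=1.0):
    """Standard VAE objective on a batch.

    x, x_hat : target and reconstructed spectrograms
    mu, log_var : per-sample latent mean and log-variance
    """
    # Mean squared error between target and reconstructed spectrograms.
    reconstruction = np.mean((x - x_hat) ** 2)
    # Closed-form KL divergence between N(mu, sigma^2) and N(0, I),
    # summed over latent dimensions, averaged over the batch.
    kl = -0.5 * np.mean(np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=1))
    return reconstruction + kl_weight * kl

# With a perfect reconstruction and a posterior equal to the prior,
# both terms vanish.
x = np.ones((4, 8))
mu = np.zeros((4, 2))
log_var = np.zeros((4, 2))
print(vae_loss(x, x, mu, log_var))  # 0.0
```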
