Jarvis1000x/Variational_Autoencoder_for_Sound_Generation
Variational Autoencoder for Sound Generation

This repository implements a neural network architecture for generating sound with a Variational Autoencoder (VAE). The encoder and decoder are built from convolutional neural networks. Input audio is first converted into spectrograms, which are fed to the network; the network in turn outputs a spectrogram, which is converted back into audio for the final output.
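The preprocessing step described above can be sketched with plain numpy: frame the waveform, window each frame, take the magnitude of the FFT, and move to a log scale before normalising into [0, 1] for the network. This is a minimal illustration, not the repository's exact pipeline (the repo may rely on a library such as librosa, and the function names here are hypothetical):

```python
import numpy as np

def log_spectrogram(signal, frame_size=512, hop_length=256):
    """Frame the signal, apply a Hann window, take the magnitude of the
    real FFT, then log-compress -- a minimal stand-in for STFT-based
    spectrogram extraction."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop_length
    frames = np.stack([
        signal[i * hop_length : i * hop_length + frame_size] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Transpose so rows are frequency bins, like a spectrogram image.
    return np.log1p(magnitude).T

def min_max_normalise(spec):
    """Scale the spectrogram into [0, 1] before feeding it to the VAE."""
    return (spec - spec.min()) / (spec.max() - spec.min())

# Example: one second of a 440 Hz tone at a 22050 Hz sample rate.
sr = 22050
t = np.arange(sr) / sr
spec = min_max_normalise(log_spectrogram(np.sin(2 * np.pi * 440 * t)))
print(spec.shape)  # (257, 85): freq bins x frames
```

Going back from a generated magnitude spectrogram to a waveform requires phase reconstruction (e.g. the Griffin-Lim algorithm), since the magnitude alone discards phase information.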

Dataset

This Variational Autoencoder was trained on the Free Spoken Digit Dataset (FSDD) to generate sounds of spoken digits. It can also be trained on other audio datasets, such as Google's NSynth and MAESTRO, to produce musical notes.
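Whatever dataset the VAE is trained on, the objective is the standard VAE loss: a reconstruction term on the spectrogram plus a KL term pulling the latent posterior toward a unit Gaussian. A minimal numpy sketch, assuming a diagonal-Gaussian posterior and MSE reconstruction (the `kl_weight` knob is hypothetical; the repo's exact weighting may differ):

```python
import numpy as np

def vae_loss(x, x_hat, mu, log_var, kl_weight=1.0):
    """Standard VAE objective on a batch.

    x, x_hat : target and reconstructed spectrograms
    mu, log_var : per-sample latent mean and log-variance
    """
    # Mean squared error between target and reconstructed spectrograms.
    reconstruction = np.mean((x - x_hat) ** 2)
    # Closed-form KL divergence between N(mu, sigma^2) and N(0, I),
    # summed over latent dimensions, averaged over the batch.
    kl = -0.5 * np.mean(np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=1))
    return reconstruction + kl_weight * kl

# With a perfect reconstruction and a posterior equal to the prior,
# both terms vanish.
x = np.ones((4, 8))
mu = np.zeros((4, 2))
log_var = np.zeros((4, 2))
print(vae_loss(x, x, mu, log_var))  # 0.0
```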
