This project aims to provide robust support for perceptrons and feedforward neural networks (FNNs) in Ruby. It starts with the implementation of the perceptron, a basic building block of neural networks, and extends to more complex feedforward architectures. The project is part of an experiment to implement these concepts across multiple programming languages; this repository covers the Ruby implementation. Future extensions will build on this foundation to explore more advanced neural network architectures such as Recurrent Neural Networks (RNNs) and beyond.
## Table of Contents

- Overview
- Feedforward Neural Networks (FNN)
- Backpropagation
- Recurrent Neural Networks (RNN)
- Installation
- Usage
- Classes and Methods
- Checklist
- Example
- Notes
- Contributing
- License
## Overview

This project is an introduction to neural networks in Ruby, focusing on perceptrons and feedforward neural networks. It explores foundational concepts such as the perceptron model and builds toward more complex architectures, as part of a broader experiment to understand and compare neural network implementations across programming languages.
## Feedforward Neural Networks (FNN)

Feedforward Neural Networks (FNNs) are the simplest type of artificial neural network. In an FNN, information moves in one direction: from the input layer, through the hidden layers, and finally to the output layer. There are no cycles or loops in the network. Each layer consists of neurons that apply a weighted sum to their inputs followed by an activation function to produce an output (a minimal sketch follows the list below).

- Input Layer: The initial layer, where data is fed into the network.
- Hidden Layers: Intermediate layers that transform the input into a form the output layer can use.
- Output Layer: The final layer, which produces the network's output.
- Activation Function: A function applied to each neuron's weighted sum to introduce non-linearity, enabling the network to learn more complex patterns. Common activation functions include Sigmoid, Tanh, and ReLU.
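To make this concrete, here is a minimal, self-contained sketch of a single neuron's forward step. The names `neuron_output`, `inputs`, `weights`, and `bias` are illustrative only, not part of this project's API; `tanh` is used because it matches the transfer function this project uses.

```ruby
# Minimal sketch of one neuron's forward step:
# output = activation(weighted sum of inputs + bias).
def neuron_output(inputs, weights, bias)
  sum = bias
  inputs.each_with_index { |x, i| sum += x * weights[i] }
  Math.tanh(sum) # tanh squashes the sum into (-1, 1)
end

inputs  = [0.5, -1.0, 0.25]  # outputs from the previous layer
weights = [0.8, 0.2, -0.5]   # one weight per incoming connection
puts neuron_output(inputs, weights, 0.1)
```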
## Backpropagation

Backpropagation is the algorithm used to train feedforward neural networks. It computes the gradient of the loss function with respect to each weight via the chain rule, working backward from the output layer to the input layer. This allows the network to adjust its weights to minimize the error between the predicted output and the actual target (see the sketch after this list).

- Loss Function: Measures how well the network's predictions match the target values. Common loss functions include Mean Squared Error (MSE) and Cross-Entropy.
- Gradient Descent: An optimization algorithm that minimizes the loss function by adjusting the weights in the network. Variants include Stochastic Gradient Descent (SGD), Momentum, and Adam.
- Learning Rate: A hyperparameter that controls the step size of each gradient-descent update.
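This project's actual update rule lives in the `updateInputWeights` method documented below; as a generic illustration rather than the repository's code, here is one gradient-descent step for a single weight of a linear neuron under an MSE loss, where `eta` is the learning rate.

```ruby
# Repeated gradient-descent steps for a single linear neuron with MSE loss:
# loss = 0.5 * (target - output)**2, with output = w * x.
# d(loss)/dw = -(target - output) * x, so the update moves w the other way.
eta    = 0.1   # learning rate (step size)
w      = 0.5   # current weight
x      = 1.5   # input value
target = 1.0   # desired output

10.times do
  output = w * x
  error  = target - output
  w += eta * error * x   # gradient-descent update
  puts format('output=%.4f error=%.4f w=%.4f', output, error, w)
end
```

Running this shows the error shrinking toward zero as `w` converges on the value that maps `x` to `target`.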
## Recurrent Neural Networks (RNN)

Recurrent Neural Networks (RNNs) are a class of neural networks in which connections between neurons form a directed cycle. This architecture allows the network to maintain a memory of previous inputs, making it suitable for tasks involving sequential data, such as time series prediction, natural language processing, and speech recognition. (A minimal sketch of the recurrence follows the list below.)

- Recurrent Connections: Unlike FNNs, RNNs have connections that loop back on themselves, enabling the network to maintain a state, or memory, of previous inputs.
- Hidden State: A vector that captures information from previous time steps in the sequence.
- Vanishing and Exploding Gradients: Challenges in training RNNs with backpropagation through time (BPTT), where gradients can become very small (vanishing) or very large (exploding), making training difficult.
- Long Short-Term Memory (LSTM): A type of RNN designed to overcome the vanishing gradient problem, capable of learning long-term dependencies.
- Gated Recurrent Unit (GRU): A simpler alternative to the LSTM with similar performance and fewer parameters.
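To make the recurrence concrete, here is a scalar sketch of a single RNN step. The weights `w_x` and `w_h` are illustrative and not part of this project; a real implementation would use vectors and weight matrices.

```ruby
# Scalar sketch of an RNN step: h_t = tanh(w_x * x_t + w_h * h_prev).
w_x = 0.7   # weight on the current input
w_h = 0.4   # weight on the previous hidden state (the "memory")
h   = 0.0   # initial hidden state

sequence = [1.0, 0.5, -0.3, 0.8]
sequence.each_with_index do |x, t|
  h = Math.tanh(w_x * x + w_h * h)   # the state carries context forward
  puts "t=#{t} h=#{h.round(4)}"
end
```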
## Installation

To run this code, you need Ruby installed on your system. You can download Ruby from the official website.

## Usage

1. Clone or download this repository.
2. Navigate to the directory containing the code.
3. Modify the `topology`, `trainArray`, and `testArray` variables in the script as needed.
4. Run the script using Ruby:

```bash
ruby your_script_name.rb
```

Replace `your_script_name.rb` with the name of the file containing the code.
## Classes and Methods

### Layer

Represents a single layer in the neural network, consisting of multiple neurons (a sketch follows the method list).

- `initialize(topology, layerNum)`: Creates a layer with the specified topology and layer number.
- `add(neuron)`: Adds a neuron to the layer.
- `getNeurons()`: Returns the neurons in the layer.
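For orientation, here is a hedged sketch of the shape this class likely takes, built only from the methods documented above; the repository's actual implementation may store additional state.

```ruby
# Illustrative sketch of a Layer: a thin container for Neuron objects.
class Layer
  def initialize(topology, layerNum)
    @topology = topology   # neuron counts per layer, e.g. [3, 3, 3]
    @layerNum = layerNum   # this layer's position within the network
    @neurons  = []
  end

  def add(neuron)
    @neurons << neuron
  end

  def getNeurons()
    @neurons
  end
end
```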
### Neuron

Represents a single neuron, including its connections to the next layer (a sketch of the transfer function follows the method list).

- `initialize(numOutputs, myIndex, weight_value)`: Initializes the neuron with the specified number of outputs, its index, and an initial weight value.
- `feedForward(prevLayer)`: Feeds forward the input from the previous layer, calculating the output value.
- `calcHiddenGradients(nextLayer)`: Calculates the gradient for a hidden neuron based on the next layer.
- `calcOutputGradients(targetVal)`: Calculates the gradient for an output neuron based on the target value.
- `updateInputWeights(prevLayer)`: Updates the input weights from the previous layer based on the calculated gradients.
- `getOutputVal()`: Returns the output value of the neuron.
- `setOutputVal(n)`: Sets the output value of the neuron.
- `getConnections()`: Returns the connections from this neuron to the next layer.
- `getWeights()`: Returns the weights of the connections.
- `randomWeight()`: Generates a random weight for a connection.
- `transferFunction(x)`: Applies the transfer function (tanh) to the input `x`.
- `transferFunctionDerivative(x)`: Applies the derivative of the transfer function to `x`.
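Since the transfer function is documented as tanh, a minimal sketch of it and its derivative might look like the following. This uses the 1 − tanh²(x) identity; note that some implementations instead approximate the derivative as `1 - x * x` when `x` is already a tanh output, and which form this repository uses is an assumption of this sketch.

```ruby
# tanh transfer function and its derivative, as used by backpropagation.
def transferFunction(x)
  Math.tanh(x)
end

def transferFunctionDerivative(x)
  1.0 - Math.tanh(x)**2   # d/dx tanh(x) = 1 - tanh(x)^2
end

puts transferFunction(0.5)            # ~0.4621
puts transferFunctionDerivative(0.5)  # ~0.7864
```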
### Connection

Represents the connection between two neurons, holding the weight and delta weight (a sketch follows the method list).

- `initialize(value)`: Initializes the connection with a weight value.
- `getDW()`: Returns the delta weight.
- `setDW(val)`: Sets the delta weight.
- `getWeight()`: Returns the weight.
- `setWeight(value)`: Sets the weight.
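A hedged sketch of this class, assembled from the accessors documented above:

```ruby
# Illustrative sketch of a Connection: a weight plus the last change to it.
class Connection
  def initialize(value)
    @weight = value   # current connection weight
    @dw     = 0.0     # delta weight: the last change applied to @weight
  end

  def getDW
    @dw
  end

  def setDW(val)
    @dw = val
  end

  def getWeight
    @weight
  end

  def setWeight(value)
    @weight = value
  end
end
```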
### Network

Represents the entire neural network, consisting of multiple layers (a sketch of `feedForward` follows the method list).

- `initialize(topology)`: Initializes the network with the specified topology.
- `feedForward(inputVals)`: Feeds input values through the network to produce an output.
- `backPropagate(targetVals)`: Adjusts the weights of the network to minimize the error between the output and target values.
- `getResults(resultVals)`: Gets the output results from the network.
- `getLayers()`: Returns all the layers of the network.
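As a sketch of how `feedForward` typically flows through such a layered structure, built only from the `Layer` and `Neuron` methods documented above (the actual implementation may differ in details):

```ruby
class Network
  # Sketch of feedForward: load inputs into the first layer, then let each
  # later layer pull values from the layer before it.
  def feedForward(inputVals)
    inputVals.each_with_index do |val, i|
      @layers[0].getNeurons()[i].setOutputVal(val)   # seed the input layer
    end

    (1...@layers.length).each do |layerNum|
      prevLayer = @layers[layerNum - 1]
      @layers[layerNum].getNeurons().each do |neuron|
        neuron.feedForward(prevLayer)                # weighted sum + tanh
      end
    end
  end
end
```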
### Computer

A wrapper class for interacting with the network.

- `initialize(topology)`: Initializes the computer with a neural network of the specified topology.
- `BackPropagate(targetVals)`: Trains the network against the target values using backpropagation.
- `feedforward(inputs)`: Feeds input values through the network.
- `GetResult()`: Retrieves the output results from the network.
- `getNetwork()`: Returns the network object.
- `getWeights()`: Returns the weights of the network's connections.
- `SetWeights(weights)`: Sets the weights of the network's connections (incomplete).
## Checklist

Completed:

- [x] Implemented `Layer` class with methods to manage neurons.
- [x] Implemented `Neuron` class with feedforward and backpropagation methods.
- [x] Implemented `Connection` class to handle weights and delta weights between neurons.
- [x] Implemented `Network` class to manage layers and propagate values.
- [x] Implemented `Computer` class to interact with the neural network.
- [x] Created a basic example to demonstrate network creation, training, and result retrieval.
- [x] Explained Feedforward Neural Networks (FNN).
- [x] Explained the Backpropagation algorithm.

To do:

- [ ] Complete the `SetWeights` method in the `Computer` class.
- [ ] Address potential issues with the `GetResult` method in the `Computer` class.
- [ ] Implement or correct the `sumDOW` method to support gradient calculations.
- [ ] Add additional tests and validation for edge cases.
- [ ] Optimize performance for larger networks and datasets.
- [ ] Improve documentation for complex methods and concepts.
- [ ] Explore a Recurrent Neural Network (RNN) implementation.
- [ ] Add LSTM and GRU implementations for handling sequential data.
## Example

Here is an example of how to create and train a network:

```ruby
# A network with 3 layers of 3 neurons each.
topology = [3, 3, 3]
newComputer = Computer.new(topology)

trainArray = [0.0, 1.0, 0.0]  # input values
testArray  = [1.0, 1.0, 0.0]  # target values

for i in 0..100 do
  newComputer.feedforward(trainArray)    # forward pass
  newComputer.BackPropagate(testArray)   # adjust weights toward the targets

  resultVals = []
  newComputer.getNetwork().getResults(resultVals)
  puts(resultVals)
end
```

This example creates a network with 3 layers, each containing 3 neurons. It then repeatedly feeds `trainArray` through the network, adjusting the weights to minimize the error between the output and the target values in `testArray`.
## Notes

- The code has some issues that need to be addressed, such as the incomplete `sumDOW` method and potential errors in the `GetResult` and `SetWeights` methods.
- The network uses the `tanh` function as its transfer function, and its derivative for backpropagation.
- The random weight generation uses a simple normalization approach.
## Contributing

If you would like to contribute to this project, please fork the repository, make your changes, and submit a pull request.
## License

This project is open-source and available under the MIT License.