Building a feedforward neural network in C++: GA class

In my implementation of feedforward neural networks (FNNs) using genetic algorithms (GA), the ANN_MLP_GA class is derived from the base ANN_MLP class. While the base class handles the network’s structure, weights, biases, and activation functions, the ANN_MLP_GA class adds the necessary components to manage a population of networks, which is essential for GA-based optimization.

Unlike the traditional stochastic gradient descent (SGD) method, where a single network is trained through iterative weight updates, genetic algorithms work by evolving a population of networks. Each population member has its own set of weights and biases, which evolve over multiple generations to optimize performance.

Class overview

The ANN_MLP_GA class manages this population, maintaining vectors for population weights (vWeightsPop) and biases (vBiasesPop). Since GA requires multiple copies of the network for each population member, memory management becomes a crucial consideration. Each member of the population needs distinct sets of weights and biases, increasing the memory usage significantly compared to the SGD method.
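To make the layout concrete, here is a minimal sketch of how the derived class might be organized. The population vectors are named as in the article; the matrix type (la::Matrix) and the remaining member names are assumptions for illustration only.


namespace nn
{
template <typename T>
class ANN_MLP_GA : public ANN_MLP<T>
{
public:
    void AllocatePopulation();                        // reserve per-member storage
    void CreatePopulation(bool bKeepPrevious = true); // fixed or mixed combinations
    void UpdateWeightsAndBiases(const std::vector<size_t>& v_);

private:
    size_t nPopSize{}; // number of networks in the population
    size_t nTop{};     // top performers retained across generations
    // One full set of per-layer parameters for every population member:
    // vWeightsPop[member][layer], vBiasesPop[member][layer]
    std::vector<std::vector<la::Matrix<T>>> vWeightsPop;
    std::vector<std::vector<la::Matrix<T>>> vBiasesPop;
};
}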

The ANN_MLP_GA class includes methods like CreatePopulation and AllocatePopulation to initialize and manage this population. These methods can either generate a fixed population or create a mixed population, with an option to retain the top performers of the previous generation. The UpdateWeightsAndBiases method is responsible for transferring the weights and biases from a population member to the main network for feedforward computations.


void UpdateWeightsAndBiases(const std::vector<size_t>& v_);
void TrainGA(const std::vector<std::vector<T>>& data,
             const std::vector<std::vector<T>>& reference,
             size_t nGenerations, size_t BatchSize, bool shuffleTrainingData = true);
int TestGA(const std::vector<std::vector<T>>& data,
           const std::vector<std::vector<T>>& reference);

Genetic algorithm training

In the genetic algorithm approach, the TrainGA method is the core function that evolves the network population over generations. Each generation undergoes testing, and top performers are selected. Crossover and mutation are applied to evolve the population.

The CreatePopulationFixed method allows for fixed linear combinations of top performers, while the CreatePopulationMixed method randomly selects two top performers and creates a new population member by combining their weights and biases. This enables the algorithm to explore the search space by introducing variations while retaining the best solutions.


// Initialize the sample index sequence once, so that reshuffles persist
// across generations (in the original excerpt the std::iota call sat inside
// the loop, which reset the order every generation and undid the shuffle).
std::iota(s_.begin(), s_.end(), 0);
for (size_t i = 0; i < nGenerations; ++i)
{
    // Reshuffle the sample order every 10 generations
    if (shuffleTrainingData && !(i % 10))
    {
        std::cout << "Reshuffling training data\n";
        std::shuffle(s_.begin(), s_.end(), g_);
    }
    CreatePopulation();
    std::fill(f_.begin(), f_.end(), 0); // reset the fitness counters
    for (size_t j = 0; j < std::min(s_.size(), BatchSize); ++j)
    {
        const size_t sample = s_[j];
        na_[0].assign(data[sample]);
        // Feedforward pass for every population member
        for (size_t k = 0; k < nPopSize; ++k)
        {
            vNaPop[k][0].assign(data[sample]);
            for (size_t l = 1; l < nLayers; ++l)
            {
                la::MatMultVec(vNaPop[k][l], vWeightsPop[k][l - 1], vNaPop[k][l - 1]);
                vNaPop[k][l] += vBiasesPop[k][l - 1];
                nn::ActFunc(vNaPop[k][l], pAct);
            }
        }
    }
}

Each population member is tested against the dataset in mini-batches. The fitness of each network is updated based on its classification performance, and the population is sorted by fitness. Top performers are used to update the primary network parameters for the next generation.
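A plausible shape for this end-of-generation selection step, assuming f_ holds one correct-prediction counter per member; the index buffer idx and the comparison lambda are illustrative, not the article's code:


// End-of-generation selection (sketch): rank members by fitness and
// promote the best ones via UpdateWeightsAndBiases.
std::vector<size_t> idx(nPopSize);
std::iota(idx.begin(), idx.end(), 0);
std::sort(idx.begin(), idx.end(),
          [&](size_t a, size_t b) { return f_[a] > f_[b]; });
UpdateWeightsAndBiases(idx); // copy the fittest parameters into the top slots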

Population creation

The CreatePopulation method can initialize the population in two ways: fixed linear combinations or mixed combinations of top performers. The CreatePopulationFixed method averages the weights and biases of two top performers to create new population members, while the CreatePopulationMixed method selects weights randomly from two top individuals. Both methods introduce mutation to the newly created members, which ensures diversity in the population and prevents premature convergence.


template <typename T>
void nn::ANN_MLP_GA<T>::CreatePopulationMixed(bool bKeepPrevious)
{
    size_t a_ = 0, b_ = 0, count_ = 0;
    // Optionally carry the current top performers over unchanged
    if (bKeepPrevious)
        for (size_t i = 0; i < nTop; ++i)
        {
            for (size_t k = 0; k < nLayers - 1; ++k)
            {
                vBiasesPop[i][k].assign(vBiases[i][k].data());
                vWeightsPop[i][k].assign(vWeights[i][k].data());
            }
            count_++;
        }
    // Fill the rest of the population by crossover of two distinct parents
    for (size_t i = count_; i < nPopSize; ++i)
    {
        a_ = static_cast<size_t>(GetRandomUniformInt() % static_cast<int>(nTop));
        b_ = static_cast<size_t>(GetRandomUniformInt() % static_cast<int>(nTop));
        while (a_ == b_)
            b_ = static_cast<size_t>(GetRandomUniformInt() % static_cast<int>(nTop));
        for (size_t j = 0; j < nLayers - 1; ++j)
        {
            // Inherit each layer from one of the two parents at random,
            // then damp the copied values to perturb the new member
            const size_t p = (GetRandomUniformReal() < 0.5) ? a_ : b_;
            vBiasesPop[i][j].assign(vBiases[p][j].data());
            vBiasesPop[i][j] *= static_cast<T>(0.9);
            vWeightsPop[i][j].assign(vWeights[p][j].data());
            vWeightsPop[i][j] *= static_cast<T>(0.9);
        }
    }
}
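For contrast, here is a sketch of the averaging step described for CreatePopulationFixed. The deterministic pairing of parents is an illustrative choice, the keep-previous branch is elided, and the mutation step mentioned in the text is omitted for brevity; only the element-wise averaging is shown.


template <typename T>
void nn::ANN_MLP_GA<T>::CreatePopulationFixed(bool bKeepPrevious)
{
    // When bKeepPrevious is true, the top performers are copied over
    // exactly as in CreatePopulationMixed (elided here).
    const size_t count_ = bKeepPrevious ? nTop : 0;
    for (size_t i = count_; i < nPopSize; ++i)
    {
        // Deterministic pairing of two top performers (illustrative choice)
        const size_t a_ = i % nTop;
        const size_t b_ = (i + 1) % nTop;
        for (size_t j = 0; j < nLayers - 1; ++j)
        {
            // Element-wise average of the two parents' parameters
            vWeightsPop[i][j].assign(vWeights[a_][j].data());
            vWeightsPop[i][j] += vWeights[b_][j];
            vWeightsPop[i][j] *= static_cast<T>(0.5);
            vBiasesPop[i][j].assign(vBiases[a_][j].data());
            vBiasesPop[i][j] += vBiases[b_][j];
            vBiasesPop[i][j] *= static_cast<T>(0.5);
        }
    }
}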

Testing

The TestGA method evaluates the population by running the feedforward process for each test input, comparing the predicted output with the reference output, and calculating the number of correct predictions. Similar to SGD testing, this method provides an accuracy measure for the GA-trained network.


template <typename T>
int nn::ANN_MLP_GA<T>::TestGA(const std::vector<std::vector<T>>& data,
                              const std::vector<std::vector<T>>& reference)
{
    int iCorrect = 0;
    for (size_t i = 0; i < data.size(); ++i)
    {
        // Load the input sample into the first activation layer
        for (size_t l = 0; l < na_[0].GetRowsNb(); ++l) na_[0][l][0] = data[i][l];
        // Feedforward pass using the best individual's parameters
        for (size_t j = 1; j < nLayers; ++j)
        {
            la::MatMultVec(na_[j], vWeights[0][j - 1], na_[j - 1]);
            na_[j] += vBiases[0][j - 1];
            nn::ActFunc(na_[j], pAct);
        }
        // Predicted class: index of the largest output activation
        size_t maxPos = 0;
        for (size_t l = 1; l < na_[nLayers - 1].GetRowsNb(); ++l)
            if (na_[nLayers - 1][l][0] > na_[nLayers - 1][maxPos][0]) maxPos = l;
        // Expected class: index of the largest entry in the one-hot reference
        const size_t refPos = static_cast<size_t>(std::distance(
            reference[i].begin(),
            std::max_element(reference[i].begin(), reference[i].end())));
        if (maxPos == refPos) iCorrect++;
    }
    return iCorrect;
}
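Putting the pieces together, a hypothetical driver might look like the following. The constructor signature, the data container names, and the MNIST-style layer sizes are assumptions for illustration.


// Hypothetical usage (constructor signature and containers are assumptions):
nn::ANN_MLP_GA<double> net({784, 30, 10});
net.TrainGA(trainData, trainLabels, /*nGenerations=*/200, /*BatchSize=*/1000);
const int correct = net.TestGA(testData, testLabels);
std::cout << "Test accuracy: "
          << 100.0 * correct / static_cast<double>(testData.size()) << " %\n";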

Conclusion

The ANN_MLP_GA class leverages genetic algorithms to optimize a population of feedforward neural networks, allowing exploration of a larger solution space compared to traditional gradient-based methods. For more insights into this topic, you can find the details here.

The code for this implementation is available on GitHub here.