Feedback control for a pendulum on a cart
In this blog post, I discuss my method for stabilizing the inverted pendulum on a cart system by combining neural networks (NN) with genetic algorithms (GA). The inverted pendulum is a classic problem in control theory, known for its inherent instability. The system consists of a cart of mass M and a pendulum of mass m, attached by a rigid, massless rod of length L. The cart moves along a straight line under the influence of an external force F = u, and the angle of the pendulum, denoted by \theta, changes in response to this motion.
Governing equations
The motion of the cart and pendulum can be described using Lagrangian mechanics. The Lagrangian \mathcal{L} (written in script to keep it distinct from the rod length L) is the difference between the kinetic energy T and the potential energy V of the system:
\mathcal{L} = T - V = \frac{1}{2} (M + m) \dot{x}^2 + m L \cos(\theta) \dot{x} \dot{\theta} + \frac{1}{2} m L^2 \dot{\theta}^2 + m L g \cos(\theta)
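Applying the Euler-Lagrange equations for x and \theta, with a viscous friction force -\mu \dot{x} acting on the cart (\mu is the friction coefficient) and the control force u, gives the equations of motion: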
(M + m) \ddot{x} + m L \cos(\theta) \ddot{\theta} - m L \sin(\theta) \dot{\theta}^2 = - \mu \dot{x} + u
and
m L ( \cos(\theta) \ddot{x} + L \ddot{\theta} + g \sin(\theta)) = 0
These two equations are coupled in the cart acceleration (\ddot{x}) and the angular acceleration of the pendulum (\ddot{\theta}). Solving them for \ddot{x} and \ddot{\theta} and introducing v = \dot{x} and \omega = \dot{\theta} gives a system of four first-order equations in the state (x, v, \theta, \omega).
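As a rough illustration of that first-order form, the sketch below solves the two equations of motion for the accelerations and advances the state with one classical Runge-Kutta (RK4) step. The struct `CartPoleParams`, the functions `derivatives` and `rk4Step`, and the default parameter values are all hypothetical names and numbers chosen for this example; they are not taken from the repository.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical parameter bundle; names and default values are illustrative only.
struct CartPoleParams {
    double M = 1.0;   // cart mass
    double m = 0.1;   // pendulum mass
    double L = 1.0;   // rod length
    double g = 9.81;  // gravitational acceleration
    double mu = 0.1;  // viscous friction coefficient of the cart
};

using State = std::array<double, 4>;  // {x, v, theta, omega}

// Right-hand side of the first-order system. Eliminating theta'' from the
// two equations of motion gives x'', and the pendulum equation then gives theta''.
State derivatives(const State& s, double u, const CartPoleParams& p) {
    const double v = s[1], theta = s[2], omega = s[3];
    const double sn = std::sin(theta), cs = std::cos(theta);
    const double denom = p.M + p.m * sn * sn;
    const double xdd = (u - p.mu * v + p.m * p.L * omega * omega * sn
                        + p.m * p.g * sn * cs) / denom;
    const double thdd = -(cs * xdd + p.g * sn) / p.L;
    return {v, xdd, omega, thdd};
}

// One classical RK4 step of size dt with the force u held constant.
State rk4Step(const State& s, double u, double dt, const CartPoleParams& p) {
    auto shifted = [](const State& a, double h, const State& k) {
        State r{};
        for (std::size_t i = 0; i < 4; ++i) r[i] = a[i] + h * k[i];
        return r;
    };
    const State k1 = derivatives(s, u, p);
    const State k2 = derivatives(shifted(s, 0.5 * dt, k1), u, p);
    const State k3 = derivatives(shifted(s, 0.5 * dt, k2), u, p);
    const State k4 = derivatives(shifted(s, dt, k3), u, p);
    State next{};
    for (std::size_t i = 0; i < 4; ++i) {
        next[i] = s[i] + dt / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]);
    }
    return next;
}
```

Calling `rk4Step` repeatedly with the force returned by a controller would simulate the closed loop; the fixed-step RK4 integrator is just one reasonable choice here.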
Linearization
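Linearizing the system around an equilibrium point allows me to design a linear feedback controller. The pendulum has two equilibria: with the sign convention of the equations of motion above, \theta = 0 is the downward (hanging) position and \theta = \pi is the upright position. Near the upright position, with \theta in the state vector measured as the deviation from that equilibrium, the linearized equations are: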
\frac{d}{dt} \begin{bmatrix} x \\ v \\ \theta \\ \omega \end{bmatrix} = \mathbf{A} \begin{bmatrix} x \\ v \\ \theta \\ \omega \end{bmatrix} + \mathbf{B} u
where \mathbf{A} and \mathbf{B} are matrices that depend on system parameters like M, m, L, and g. For example, for the upward equilibrium position, the matrix \mathbf{A} is:
\mathbf{A} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & -\frac{\mu}{M} & \frac{mg}{M} & 0 \\ 0 & 0 & 0 & 1 \\ 0 & -\frac{\mu}{ML} & \frac{(M+m)g}{ML} & 0 \end{bmatrix}
By designing a feedback control law u = -\mathbf{K} \mathbf{x}, where \mathbf{K} is the feedback gain matrix, I stabilize the system by placing the poles of the closed-loop system (the eigenvalues of \mathbf{A} - \mathbf{B} \mathbf{K}) at desired locations in the left half of the complex plane.
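The post lists \mathbf{A} but not \mathbf{B}. Linearizing the two equations of motion about the upright position in the same way suggests the input matrix below, followed by the closed-loop dynamics that result from substituting the control law. Treat this as a sketch derived here from those equations, not as a quotation from the repository:

```latex
\mathbf{B} = \begin{bmatrix} 0 \\ \frac{1}{M} \\ 0 \\ \frac{1}{ML} \end{bmatrix},
\qquad
\dot{\mathbf{x}} = \mathbf{A} \mathbf{x} + \mathbf{B} (-\mathbf{K} \mathbf{x}) = (\mathbf{A} - \mathbf{B} \mathbf{K}) \mathbf{x}
```

The second and fourth entries of \mathbf{B} follow the same pattern as the friction terms in \mathbf{A}: the force u enters the cart equation through 1/M and the pendulum equation through 1/(ML).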
Neural network and genetic algorithm
To further optimize control, I implemented a neural network (NN) trained using a genetic algorithm (GA). The NN takes the system’s state as inputs: cart position (x), velocity (\dot{x}), pendulum angle (\theta), and angular velocity (\dot{\theta}). Its outputs are then used to compute the control force u. The GA trains the NN to minimize the total error over time, where the error is the deviation of the state from the reference equilibrium.
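To make the network's role concrete, here is a minimal sketch of a forward pass through a small fully connected network with four inputs and four outputs. The struct `TinyNet`, the hidden-layer size of 8, the tanh activation, and the linear output layer are all assumptions made for illustration; the architecture in the repository may differ.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical layer sizes; only the 4 inputs and 4 outputs match the post.
constexpr std::size_t kInputs = 4;
constexpr std::size_t kHidden = 8;
constexpr std::size_t kOutputs = 4;

// A one-hidden-layer network. The weights and biases are the values a
// genetic algorithm would tune; they start at zero here.
struct TinyNet {
    std::array<std::array<double, kInputs>, kHidden> w1{};
    std::array<double, kHidden> b1{};
    std::array<std::array<double, kHidden>, kOutputs> w2{};
    std::array<double, kOutputs> b2{};

    std::array<double, kOutputs> forward(const std::array<double, kInputs>& in) const {
        // Hidden layer: affine map followed by tanh.
        std::array<double, kHidden> hidden{};
        for (std::size_t j = 0; j < kHidden; ++j) {
            double sum = b1[j];
            for (std::size_t i = 0; i < kInputs; ++i) sum += w1[j][i] * in[i];
            hidden[j] = std::tanh(sum);
        }
        // Output layer: affine map with no activation (an assumption).
        std::array<double, kOutputs> out{};
        for (std::size_t k = 0; k < kOutputs; ++k) {
            double sum = b2[k];
            for (std::size_t j = 0; j < kHidden; ++j) sum += w2[k][j] * hidden[j];
            out[k] = sum;
        }
        return out;
    }
};
```

Because training is done by a GA and not by backpropagation, the network only needs a forward pass; no gradients are computed.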
The state inputs are normalized to ensure consistent scaling:
inputs_[0] = x_ / XDIVISOR;
inputs_[1] = xDot_ / XDIVISOR;
inputs_[2] = theta_ / TDIVISOR;
inputs_[3] = thetaDot_ / TDIVISOR;
The genetic algorithm evolves over multiple generations. Each neural network in the population is evaluated based on its performance, measured by how well it stabilizes the pendulum. The networks with the best fitness scores are selected to breed the next generation, and mutation is applied to introduce variations. Over successive generations, this selection pressure tends to improve the NN's control performance, although a GA does not guarantee that it finds the best possible controller.
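The loop described above can be sketched as follows. This is a generic GA with elitism, uniform crossover, and Gaussian mutation, written for illustration only: the function `evolve`, its parameters, and the choice of operators are assumptions, and the repository may select, breed, and mutate differently. The genome stands for the flattened NN weights, and `cost` stands for the total error of one simulated run (lower is better).

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <random>
#include <utility>
#include <vector>

using Genome = std::vector<double>;  // e.g. flattened network weights

struct Individual {
    Genome genes;
    double cost = 0.0;  // lower is better
};

// Hypothetical GA. Requires 1 <= eliteCount <= populationSize.
Genome evolve(const std::function<double(const Genome&)>& cost,
              std::size_t genomeSize, std::size_t populationSize,
              std::size_t eliteCount, std::size_t generations,
              double mutationSigma, unsigned seed) {
    std::mt19937 rng(seed);
    std::normal_distribution<double> gauss(0.0, 1.0);
    std::uniform_int_distribution<std::size_t> pickParent(0, eliteCount - 1);
    std::bernoulli_distribution coin(0.5);

    // Random initial population.
    std::vector<Individual> pop(populationSize);
    for (auto& ind : pop) {
        ind.genes.resize(genomeSize);
        for (double& gene : ind.genes) gene = gauss(rng);
        ind.cost = cost(ind.genes);
    }
    auto byCost = [](const Individual& a, const Individual& b) { return a.cost < b.cost; };

    for (std::size_t gen = 0; gen < generations; ++gen) {
        // Rank the population; the first eliteCount individuals survive unchanged.
        std::sort(pop.begin(), pop.end(), byCost);
        for (std::size_t i = eliteCount; i < populationSize; ++i) {
            const Genome& a = pop[pickParent(rng)].genes;
            const Genome& b = pop[pickParent(rng)].genes;
            Genome child(genomeSize);
            for (std::size_t k = 0; k < genomeSize; ++k) {
                // Uniform crossover between two elite parents, plus mutation.
                child[k] = (coin(rng) ? a[k] : b[k]) + mutationSigma * gauss(rng);
            }
            pop[i].genes = std::move(child);
            pop[i].cost = cost(pop[i].genes);
        }
    }
    return std::min_element(pop.begin(), pop.end(), byCost)->genes;
}
```

Keeping the elite unchanged means the best cost in the population never gets worse from one generation to the next, which is one common way to keep a GA from losing a good controller to an unlucky mutation.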
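The control law uses the NN outputs as gains on the state error, that is, the deviation of each state variable from its reference value. The weighted sum is scaled by nnKGain and negated to give the force u: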
params_copy.u = -nnKGain * (nn_out_[0] * (x_ - control::Params<T>().y_ref[0]) +
                            nn_out_[1] * (xDot_ - control::Params<T>().y_ref[1]) +
                            nn_out_[2] * (theta_ - control::Params<T>().y_ref[2]) +
                            nn_out_[3] * (thetaDot_ - control::Params<T>().y_ref[3]));
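The same expression can be written as a small standalone function, which makes its structure easier to see: it has the form of the state feedback law u = -\mathbf{K} \mathbf{x}, with gains that come from the network. The function name `controlForce` and its signature are hypothetical and exist only for this sketch.

```cpp
#include <array>
#include <cstddef>

// Hypothetical free-function version of the control law above.
// gainScale plays the role of nnKGain, nnOut the role of nn_out_,
// and ref the role of y_ref.
double controlForce(double gainScale,
                    const std::array<double, 4>& nnOut,
                    const std::array<double, 4>& state,
                    const std::array<double, 4>& ref) {
    double weightedError = 0.0;
    for (std::size_t i = 0; i < 4; ++i) {
        weightedError += nnOut[i] * (state[i] - ref[i]);
    }
    return -gainScale * weightedError;
}
```

For example, with `gainScale = 3`, network outputs `{1, 0, 0, 0}`, state `{2, 0, 0, 0}`, and a zero reference, the function returns `-6`: the cart sits 2 units to the right of its target, so the force pushes it left.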
Conclusion
This project demonstrates how classical feedback control can be combined with machine learning techniques, namely neural networks and genetic algorithms, to stabilize an unstable nonlinear system such as the inverted pendulum on a cart. The same structure, a network that supplies feedback gains and a GA that tunes the network, could in principle be applied to other nonlinear systems. For more insights into this topic, you can find the details here.
The code for this implementation is available on GitHub here.