├─env # Gym env implementation of Inverted Pendulum
├─figures # Saved images and result gif
│ G.txt # Text file to save G
│ getGH.m # MATLAB file to get G and H
│ H.txt # Text file to save H
│ LQR.py # LQR generalized implementation
│ README.md # README
│ test_gym.py # Run LQR in the Inverted Pendulum environment
│ utils.py # Shared utility functions
To run the LQR controller for the inverted pendulum system:
- Ensure you have Python installed on your system (Python 3.6 or higher is recommended).
- Install the required dependencies: `pip install numpy matplotlib gym`
- Navigate to the project directory in your terminal or command prompt.
- Run the `test_gym.py` file: `python test_gym.py`

This will execute the LQR controller in the Inverted Pendulum environment and display the results.
The simplified model of the first-order inverted pendulum system is shown in Figure 1. This model consists of a cart moving horizontally and a connected single pendulum. Some physical parameters are listed in the table below.
(Note: `theta` in the program represents the angle between the pendulum and the vertical upward direction, which is the same as $\phi$ in the figure.)
Figure 1: First-order inverted pendulum system model
Property | Value |
---|---|
Cart mass (M) | 0.5 kg |
Cart damping (b) | 0.1 s$^{-1}$ |
Pendulum mass (m) | 0.1 kg |
Half-length of pendulum (l) | 0.3 m |
Pendulum moment of inertia (I) | 0.012 kg*m$^2$ |
Gravitational acceleration (g) | 9.8 m/s$^2$ |
Sampling period (tau) | 0.005 s |
The dynamics of the pendulum and the cart are described by two coupled nonlinear equations, referred to below as equations (1) and (2).
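For orientation, a commonly used textbook form of the cart-pendulum dynamics is sketched here. It is written with the symbols from the parameter table and with $\theta$ measured from the vertical downward direction. The force symbol $F$ (horizontal force applied to the cart) is an assumption, and the author's equations (1) and (2) may differ from this form in notation or sign convention.

$$(M+m)\ddot{x} + b\dot{x} + ml\ddot{\theta}\cos\theta - ml\dot{\theta}^{2}\sin\theta = F$$

$$(I+ml^{2})\ddot{\theta} + mgl\sin\theta = -ml\ddot{x}\cos\theta$$

The first equation is the horizontal force balance of the cart and pendulum together; the second is the torque balance of the pendulum about its pivot.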
The complete dynamic equations (1) and (2) are discretized and used as the state transition equations in the Gym simulation:
- Calculate $\ddot{x}(k)$ and $\ddot{\theta}(k)$ at the current time step $k$ from equations (1) and (2).
- Update all state variables from these accelerations using the sampling period $\tau$.

Two points need attention when implementing these steps:
- Angle convention: In the equations, $\theta$ represents the angle between the pendulum and the vertical downward direction. In the program, `theta` represents the angle between the pendulum and the vertical upward direction. Adjust the calculation of $\ddot{\theta}(k)$ accordingly.
- Angle wrapping: When using an incremental method to calculate `theta`, consider the discontinuity near $\pi$ (vertical downward position). The numerical values of `theta` (i.e., $\phi$ in the figure) correspond to positions as shown in Figure 2.
Figure 2: Angle position correspondence diagram
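Apply a wrapping constraint to `theta` after each update so that its value stays within the range shown in Figure 2, with the jump located at the vertical downward position.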
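The discretized update and the angle wrapping can be sketched as follows. This is a minimal illustration, not the code in `env`: it assumes the textbook cart-pendulum equations with $\theta$ measured from the downward vertical, the substitution $\theta = \phi + \pi$ to convert from the program's upward-measured angle, an explicit Euler update, a state ordering of `[x, x_dot, phi, phi_dot]`, and a wrapping range of $[-\pi, \pi)$. The function names are hypothetical.

```python
import numpy as np

# Physical parameters from the table above
M, m, b, l, I, g, tau = 0.5, 0.1, 0.1, 0.3, 0.012, 9.8, 0.005

def wrap_angle(phi):
    """Wrap an angle into [-pi, pi), so the jump sits at the downward position."""
    return (phi + np.pi) % (2 * np.pi) - np.pi

def accelerations(x_dot, phi, phi_dot, force):
    """Solve the two coupled dynamic equations for x_ddot and phi_ddot."""
    theta = phi + np.pi  # assumed conversion: upward-measured -> downward-measured
    mass = np.array([
        [M + m, m * l * np.cos(theta)],
        [m * l * np.cos(theta), I + m * l**2],
    ])
    rhs = np.array([
        force - b * x_dot + m * l * phi_dot**2 * np.sin(theta),
        -m * g * l * np.sin(theta),
    ])
    x_dd, phi_dd = np.linalg.solve(mass, rhs)
    return x_dd, phi_dd

def step(state, force):
    """Advance the state by one sampling period with an explicit Euler update."""
    x, x_dot, phi, phi_dot = state
    x_dd, phi_dd = accelerations(x_dot, phi, phi_dot, force)
    return np.array([
        x + tau * x_dot,
        x_dot + tau * x_dd,
        wrap_angle(phi + tau * phi_dot),  # keep the angle in one full turn
        phi_dot + tau * phi_dd,
    ])
```

Because $\ddot{\theta} = \ddot{\phi}$ and $\dot{\theta} = \dot{\phi}$, only the sine and cosine terms change when switching between the two angle conventions.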
LQR requires a linear state-space model, so the nonlinear equations cannot be used with it directly. We therefore linearize the nonlinear dynamic equations about the vertical upward position, where the deviation angle is small.
Based on the linearized dynamic equations, we obtain the continuous-domain state-space expression $\dot{X}=AX+Bu$.
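As an illustration of what such a state-space model can look like, the sketch below builds $A$ and $B$ from the parameter table. It assumes the textbook small-angle linearization of the cart-pendulum equations about the upright position and a state ordering of $X = [x, \dot{x}, \phi, \dot{\phi}]^T$; the matrices used in `getGH.m` may be arranged differently. The function name is hypothetical.

```python
import numpy as np

# Physical parameters from the table above
M, m, b, l, I, g = 0.5, 0.1, 0.1, 0.3, 0.012, 9.8

def linearized_model():
    """Return (A, B) for X = [x, x_dot, phi, phi_dot], linearized about upright."""
    p = I * (M + m) + M * m * l**2  # common denominator of the linearized equations
    A = np.array([
        [0.0, 1.0, 0.0, 0.0],
        [0.0, -(I + m * l**2) * b / p, m**2 * g * l**2 / p, 0.0],
        [0.0, 0.0, 0.0, 1.0],
        [0.0, -m * l * b / p, m * g * l * (M + m) / p, 0.0],
    ])
    B = np.array([
        [0.0],
        [(I + m * l**2) / p],
        [0.0],
        [m * l / p],
    ])
    return A, B

A, B = linearized_model()
# The upright equilibrium is unstable, so A has an eigenvalue with positive real part.
print(np.linalg.eigvals(A))
```

The positive eigenvalue is what the LQR feedback has to counteract to keep the pendulum upright.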
- After obtaining the continuous state-space equation $\dot{X}=AX+Bu$, use MATLAB's `c2d` function to convert it to the discrete state-space form $X(k+1)=GX(k)+Hu(k)$: `[G,H]=c2d(A,B,Ts) % Ts is the sampling period`
- In LQR, $\mathbf{F}_t=[G\ H]$ and $\mathbf{f}_t=\mathbf{0}$. After parameter tuning, we set $\mathbf{C}_t=\mathrm{diag}(10,\ 15,\ 30,\ 6,\ 1)$ and $\mathbf{c}_t=\mathbf{0}$.
- For dynamic LQR control based on state feedback, control prediction and actual control are performed as follows. Assume the current time is $k$ and the observed state $X(k)$ is known. Perform LQR control prediction over a horizon of $T$ steps, but apply only the first $t$ steps of the predicted control sequence. At time $k+t$, based on the newly observed state $X(k+t)$, repeat the LQR prediction for the control outputs from $k+t+1$ to $k+t+T$. We use $T=100$ and $t=15$.
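The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the contents of `LQR.py`: the zero-order-hold discretization is approximated with a truncated power series (adequate for a small sampling period), the cost is assumed to be $\frac{1}{2}[X;u]^T\mathbf{C}_t[X;u]$ with $\mathbf{c}_t=\mathbf{0}$ and $\mathbf{f}_t=\mathbf{0}$, and all function names and the `step_fn` callback are hypothetical.

```python
import numpy as np

def c2d_zoh(A, B, Ts, terms=20):
    """Zero-order-hold discretization via a truncated series (assumes small Ts)."""
    n = A.shape[0]
    psi = np.zeros((n, n))
    term = np.eye(n)  # holds (A*Ts)^k / (k+1)!
    for k in range(terms):
        psi += term
        term = term @ (A * Ts) / (k + 2)
    G = np.eye(n) + A @ psi * Ts
    H = psi @ B * Ts
    return G, H

def lqr_gains(G, H, C, T):
    """Backward pass of finite-horizon LQR with F = [G H], f = 0, c = 0."""
    n = G.shape[0]
    F = np.hstack([G, H])
    V = np.zeros((n, n))  # value matrix beyond the end of the horizon
    gains = [None] * T
    for t in reversed(range(T)):
        Q = C + F.T @ V @ F
        Qxx, Qxu = Q[:n, :n], Q[:n, n:]
        Qux, Quu = Q[n:, :n], Q[n:, n:]
        K = -np.linalg.solve(Quu, Qux)  # feedback gain, u = K x
        V = Qxx + Qxu @ K + K.T @ Qux + K.T @ Quu @ K
        gains[t] = K
    return gains

def plan_controls(x, G, H, gains):
    """Roll the linear model forward to get the predicted control sequence."""
    x_pred = np.asarray(x, dtype=float)
    controls = []
    for K in gains:
        u = K @ x_pred
        controls.append(u)
        x_pred = G @ x_pred + H @ u
    return controls

def receding_horizon(step_fn, x0, G, H, C, T=100, t_apply=15, n_steps=300):
    """Plan T steps, apply the first t_apply controls, observe, then replan."""
    gains = lqr_gains(G, H, C, T)  # time-invariant model, so computed once
    x = np.asarray(x0, dtype=float)
    states = [x]
    while len(states) <= n_steps:
        for u in plan_controls(x, G, H, gains)[:t_apply]:
            x = step_fn(x, u)  # step_fn stands in for the real system
            states.append(x)
            if len(states) > n_steps:
                break
    return np.array(states)
```

For the pendulum, `C` would be `np.diag([10, 15, 30, 6, 1])` and `step_fn` would wrap one step of the Gym environment. If SciPy is available, `scipy.signal.cont2discrete` with `method='zoh'` could replace `c2d_zoh`.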
Since we linearized the nonlinear state equations near the vertical upward position of the pendulum, we choose initial states where the pendulum deviates slightly from the vertical upward direction. Tests show that the system converges stably when the initial state
Figure 3: LQR control result with initial 36° deviation from vertical upward direction