

Submitted by Style Pass, 2024-09-20 22:30:07

In this project, a controller based on Q-Learning is used to regulate a system of a kind that is common in temperature control problems. A first-order system with time delay (PT1 with delay) is used to model the system response. This is a simple control system, but the time delay makes it considerably harder to control than a plain first-order lag. It is also a good enough approximation for many real-world control problems, e.g., temperature control systems with a single energy capacity (section Control Problem). A human can quickly estimate how far a result is from the best possible sequence of actions just by looking at a plot (section Optimal Policy). The most common approach to these control problems is a PI or PID controller. Although this works very well, it does not produce the optimal policy. That means that even for a simple control problem like this one, machine learning could improve the performance and energy usage of many devices in use today.
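To make the plant model concrete, here is a minimal sketch of a PT1 element with dead time, discretized with forward Euler. The function name `simulate_pt1_delay` and all parameter values (gain, time constant, delay, step size) are illustrative assumptions and are not taken from the project.

```python
from collections import deque


def simulate_pt1_delay(inputs, gain=1.0, tau=10.0, delay_steps=5, dt=1.0, y0=0.0):
    """Simulate tau * y'(t) + y(t) = gain * u(t - delay) with forward Euler.

    All default values are hypothetical and chosen only for illustration.
    """
    # FIFO buffer holding the inputs that have not reached the plant yet.
    buffer = deque([0.0] * delay_steps)
    y = y0
    outputs = []
    for u in inputs:
        buffer.append(u)
        u_delayed = buffer.popleft()  # input applied delay_steps steps ago
        y += dt * (gain * u_delayed - y) / tau
        outputs.append(y)
    return outputs


# Unit step response: the output stays at zero during the dead time,
# then approaches the gain exponentially.
response = simulate_pt1_delay([1.0] * 100)
```

The dead time is what makes the problem awkward for a controller: any action taken now has no visible effect on the output for `delay_steps` samples.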

Q-Learning is used to train a neural network to solve the control problem. The result is compared with the optimal policy and with the result of a PID controller (tuned according to standard practice). Three variants of the same Q-Learning algorithm are compared. The main motivation for the second and third versions is to extend them later to (Soft) Actor-Critic methods.
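For orientation, the regression target that textbook Q-Learning trains a network towards can be sketched as follows. This is the generic one-step target, not necessarily the form used by any of the three variants in the project; the function name `q_learning_targets`, the discount factor and the sample numbers are assumptions made for illustration.

```python
def q_learning_targets(rewards, next_q_values, dones, gamma=0.99):
    """Return r + gamma * max_a' Q(s', a') for each transition in a batch.

    next_q_values holds, per transition, the predicted Q-values of every
    discrete action in the next state. Terminal transitions do not bootstrap.
    """
    targets = []
    for reward, q_next, done in zip(rewards, next_q_values, dones):
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(reward + bootstrap)
    return targets


# Two hypothetical transitions with two discrete actions each.
targets = q_learning_targets(
    rewards=[-1.0, -0.5],
    next_q_values=[[0.0, 2.0], [1.0, 3.0]],
    dones=[False, True],
    gamma=0.9,
)
```

In a neural-network setting, the network's prediction for the action actually taken would be regressed towards these targets, typically with a squared-error loss.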
