
Abstract

This paper presents how the Q-learning algorithm can be applied as a general-purpose self-improving controller for industrial automation, as a substitute for a conventional PI controller implemented without proper tuning. The traditional Q-learning approach is redefined to better fit practical control loops, including a new definition of the goal state based on the closed-loop reference trajectory and a discretization of the state space and of the accessible actions (manipulated variables). The properties of the Q-learning algorithm are investigated with respect to practical applicability, with special emphasis on initializing the Q-matrix from preliminary PI tunings alone, so as to ensure bumpless switching between the existing controller and the Q-learning algorithm that replaces it. A general approach to the design of the Q-matrix and of the learning policy is suggested, and the concept is systematically validated by simulation on two example processes, one exhibiting first-order dynamics and one exhibiting oscillatory second-order dynamics. The results show that online learning through interaction with the controlled process is feasible and yields a significant improvement in control performance over an arbitrarily tuned PI controller.
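To make the scheme concrete, the sketch below illustrates the kind of controller the abstract describes. It is a minimal illustration, not the authors' exact formulation: it assumes the state is the discretized control error, the actions are discrete increments of the manipulated variable, and the Q-matrix is pre-filled so that the greedy policy initially mimics a crudely tuned proportional move on the error (a simple stand-in for the PI-based initialization that enables bumpless switching). All grid sizes, tuning constants, and the toy first-order process are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of Q-learning used as a direct controller.
# Assumptions (illustrative, not from the paper): the state is the
# discretized control error e = r - y, the actions are discrete
# increments of the control signal u, and Q is pre-filled so that the
# greedy action initially approximates a crudely tuned proportional
# move on the error (standing in for PI-based Q-matrix initialization).

N_STATES = 21                             # bins for the control error
ERR_RANGE = (-1.0, 1.0)                   # error range covered by the bins
DU = np.linspace(-0.1, 0.1, 7)            # candidate control increments
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.05    # learning rate, discount, exploration

def state_index(error):
    """Map a continuous control error onto a discrete state index."""
    e = np.clip(error, *ERR_RANGE)
    span = ERR_RANGE[1] - ERR_RANGE[0]
    return int(round((e - ERR_RANGE[0]) / span * (N_STATES - 1)))

def init_q(kp):
    """Pre-fill Q so the greedy action is the increment closest to kp*e."""
    q = np.zeros((N_STATES, len(DU)))
    for s, e in enumerate(np.linspace(*ERR_RANGE, N_STATES)):
        q[s] = -np.abs(DU - kp * e)       # best score for the P-like move
    return q

def choose_action(q, s, rng):
    """Epsilon-greedy policy over the discrete control increments."""
    if rng.random() < EPSILON:
        return int(rng.integers(len(DU)))
    return int(np.argmax(q[s]))

# Closed-loop learning on a toy first-order process y[k+1] = a*y[k] + b*u[k];
# the reward penalizes the absolute deviation from the setpoint r_sp.
rng = np.random.default_rng(0)
q = init_q(kp=2.0)
a_proc, b_proc, r_sp = 0.9, 0.1, 0.5
y = u = 0.0
for k in range(5000):
    s = state_index(r_sp - y)
    a = choose_action(q, s, rng)
    u += DU[a]                            # apply the chosen control increment
    y = a_proc * y + b_proc * u           # process response
    s_next = state_index(r_sp - y)
    reward = -abs(r_sp - y)
    # Standard one-step Q-learning update
    q[s, a] += ALPHA * (reward + GAMMA * np.max(q[s_next]) - q[s, a])
```

Initializing Q from the existing controller's law, rather than from zeros, is what makes the hand-over bumpless in spirit: the greedy policy starts out reproducing the incumbent behaviour and departs from it only as better actions are discovered through interaction with the process.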

Authors and Affiliations

Jakub Musial 1
Krzysztof Stebel 1
Jacek Czeczot 1

  1. Silesian University of Technology, Faculty of Automatic Control, Electronics and Computer Science, Department of Automatic Control and Robotics, 44-100 Gliwice, ul. Akademicka 16, Poland
