Search results

Number of results: 8

Abstract

Compared with robots, humans can learn to perform various contact tasks in unstructured environments by modulating arm impedance characteristics. In this article, we consider endowing industrial robots with this compliant ability so that they can effectively learn to perform repetitive force-sensitive tasks. Current learning-based impedance control methods usually suffer from low efficiency, so this paper establishes an efficient variable impedance control method. To improve learning efficiency, we employ a probabilistic Gaussian process model as the transition dynamics of the system for internal simulation, permitting long-term inference and planning in a Bayesian manner. The optimal impedance regulation strategy is then searched for with a model-based reinforcement learning algorithm. The effectiveness and efficiency of the proposed method are verified through force control tasks on a 6-DoF Reinovo industrial manipulator.
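
As a rough illustration of the idea, a probabilistic Gaussian process model can be fit to observed transitions and then rolled forward for internal simulation. The sketch below assumes a hypothetical one-dimensional system and uses scikit-learn; the paper's actual state and action spaces (arm impedance parameters, contact forces) are richer, and full Bayesian planning would also propagate the predictive variance.

    # Sketch: learn a GP transition model from logged (state, action) pairs
    # and use it for internal simulation. Hypothetical 1-D system.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(0)
    states = rng.uniform(-1, 1, size=(50, 1))
    actions = rng.uniform(-1, 1, size=(50, 1))
    next_states = 0.9 * states + 0.3 * actions + 0.01 * rng.standard_normal((50, 1))

    X = np.hstack([states, actions])                 # input: (s_t, a_t)
    gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
    gp.fit(X, next_states.ravel())                   # target: s_{t+1}

    # Internal simulation: roll the learned model forward without touching
    # the real robot; the predictive std quantifies model uncertainty.
    s = np.array([0.5])
    for t in range(10):
        a = -0.5 * s                                 # placeholder policy
        mean, std = gp.predict(np.hstack([s, a]).reshape(1, -1), return_std=True)
        print(f"t={t}: s={s[0]:+.3f} -> s'={mean[0]:+.3f} (±{std[0]:.3f})")
        s = mean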

Authors and Affiliations

C. Li
Z. Zhang
G. Xia
X. Xie
Q. Zhu

Abstract

In this work, a novel approach to designing an online tracking controller for a nonholonomic wheeled mobile robot (WMR) is presented. The controller consists of a nonlinear neural feedback compensator, a PD control law, and a supervisory element that assures stability of the system. The neural network used for feedback compensation is trained through approximate dynamic programming (ADP). To obtain stability in the learning phase and robustness in the face of disturbances, an additional control signal derived from the Lyapunov stability theorem, based on variable structure systems theory, is provided. The proposed control algorithm was verified on a Pioneer 2-DX wheeled mobile robot and confirmed the assumed behavior of the control system.
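
A minimal sketch of such a composite control signal, assuming a scalar tracking error and hypothetical gains (the WMR case is multivariate, and the compensator there is trained online via ADP):

    # Sketch: composite control = PD law + neural compensator + variable-
    # structure supervisory term. Scalar example with assumed gains.
    import numpy as np

    KP, KD = 8.0, 2.0                     # PD gains (assumed)
    ETA = 0.5                             # supervisory switching gain (assumed)
    W = np.zeros(10)                      # RBF compensator weights (learned via ADP)
    CENTERS = np.linspace(-1.0, 1.0, 10)

    def control(e, de):
        u_pd = KP * e + KD * de                        # PD feedback
        u_nn = W @ np.exp(-((e + de) - CENTERS) ** 2)  # neural compensation
        s = de + e                                     # sliding variable
        u_s = ETA * np.sign(s)                         # Lyapunov-based VSS term
        return u_pd + u_nn + u_s

    print(control(e=0.2, de=-0.05))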

Authors and Affiliations

Zenon Hendzel
Marcin Szuster

Abstract

Optimization of industrial processes such as manufacturing or materials processing is a point of interest for many researchers, and can lead not only to speeding up the processes in question but also to reducing the energy cost they incur. This article presents a novel approach to optimizing the spindle motion of a computer numerical control (CNC) machine. The proposed solution uses deep reinforcement learning to map the performance of the Reference Points Realization Optimization (RPRO) algorithm used in industry. A detailed study was conducted to assess how well the proposed method performs the targeted task. In addition, the influence of different factors and hyperparameters of the learning process on the performance of the trained agent was investigated. The proposed solution achieved very good results, not only satisfactorily replicating the performance of the benchmark algorithm but also speeding up the machining process and providing significantly higher accuracy.
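
The abstract does not detail the agent's interface, but the kind of objective such an agent optimizes, trading machining time against positional accuracy while realizing reference points, can be sketched with a hypothetical reward (weights and names are illustrative assumptions):

    # Hypothetical reward for a spindle-motion RL agent: penalize deviation
    # from the current reference point and elapsed time, so the policy is
    # pushed toward both higher accuracy and faster machining.
    import numpy as np

    W_ERR, W_TIME = 10.0, 0.1            # assumed trade-off weights

    def reward(position, reference, dt):
        tracking_error = np.linalg.norm(position - reference)
        return -(W_ERR * tracking_error + W_TIME * dt)

    print(reward(np.array([1.02, 0.0]), np.array([1.0, 0.0]), dt=0.01))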

Authors and Affiliations

Dawid Kalandyk
Bogdan Kwiatkowski
Damian Mazur

Abstract

Beamforming training (BT) is considered an essential process for establishing communications in the millimeter wave (mmWave) band, i.e., 30–300 GHz. This process aims to find the best transmit/receive antenna beams to compensate for the impairments of the mmWave channel and successfully establish the mmWave link. Typically, the mmWave BT process is highly time-consuming, affecting the overall throughput and energy consumption of mmWave link establishment. In this paper, a machine learning (ML) approach, specifically reinforcement learning (RL), is utilized to enable the mmWave BT process by modeling it as a multi-armed bandit (MAB) problem with the aim of maximizing the long-term throughput of the constructed mmWave link. Based on this formulation, MAB algorithms such as upper confidence bound (UCB), Thompson sampling (TS), and epsilon-greedy (e-greedy) are utilized to address the problem and accomplish the mmWave BT process. Numerical simulations confirm the superior performance of the proposed MAB approach over existing mmWave BT techniques.
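
Treating each candidate beam pair as a bandit arm and the observed link throughput as the reward, the UCB variant of this formulation can be sketched as follows (the throughput model here is a hypothetical stand-in for the mmWave channel):

    # Sketch: UCB1 beam selection. Each beam pair is an arm; the noisy
    # throughput sample is the reward. Hypothetical reward model.
    import numpy as np

    rng = np.random.default_rng(1)
    N_BEAMS = 16
    true_rates = rng.uniform(0.1, 1.0, N_BEAMS)   # unknown mean throughputs

    counts = np.zeros(N_BEAMS)
    means = np.zeros(N_BEAMS)

    for t in range(1, 2001):
        if t <= N_BEAMS:                          # initialize: try each beam once
            arm = t - 1
        else:                                     # UCB1 index: mean + exploration bonus
            arm = int(np.argmax(means + np.sqrt(2 * np.log(t) / counts)))
        r = rng.normal(true_rates[arm], 0.05)     # observed throughput
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]

    print("best arm:", int(np.argmax(true_rates)), "| most played:", int(np.argmax(counts)))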

Authors and Affiliations

Ehab Mahmoud Mohamed
1 2

  1. Electrical Engineering Dept., College of Engineering, Prince Sattam Bin Abdulaziz University, Wadi Aldwaser 11991, Saudi Arabia
  2. Electrical Engineering Dept., Faculty of Engineering, Aswan University, Aswan 81542, Egypt

Abstract

The paper presents a method for designing a neural speed controller using reinforcement learning. The controlled plant is an electric drive with a permanent magnet synchronous motor that has a complex mechanical structure and variable parameters. Several research cases of the control system with a neural controller are presented, focusing on changes in the plant parameters. The influence of the critic's behaviour on the system is also investigated, where the critic is a function of the control error and the energy cost; this ensures long-term performance stability without the need to switch off the adaptation algorithm. Numerous simulation tests were carried out and confirmed on a real test stand.
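
A minimal sketch of a critic that scores the drive on both control error and energy cost, with illustrative weights (the paper's critic and its interaction with the adaptation algorithm are more involved):

    # Sketch: critic cost combining speed-tracking error and energy cost.
    # Weights are illustrative assumptions.
    ALPHA, BETA = 1.0, 0.05

    def critic_cost(speed_error, control_effort):
        # Penalizing effort alongside error discourages aggressive control
        # action, which supports long-term performance stability.
        return ALPHA * speed_error ** 2 + BETA * control_effort ** 2

    print(critic_cost(speed_error=12.0, control_effort=3.5))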

Authors and Affiliations

T. Pajchrowski
P. Siwek
A. Wójcik

Abstract

This paper presents how the Q-learning algorithm can be applied as a general-purpose self-improving controller for use in industrial automation, as a substitute for a conventional PI controller implemented without proper tuning. The traditional Q-learning approach is redefined to better fit applications in practical control loops, including a new definition of the goal state via the closed-loop reference trajectory and discretization of the state space and the accessible actions (manipulated variables). Properties of the Q-learning algorithm are investigated in terms of practical applicability, with special emphasis on initializing the Q-matrix based only on preliminary PI tunings so as to ensure bumpless switching between the existing controller and the replacement Q-learning algorithm. A general approach to the design of the Q-matrix and the learning policy is suggested, and the concept is systematically validated by simulation on two example processes, one exhibiting first-order dynamics and one oscillatory second-order dynamics. Results show that online learning through interaction with the controlled process is possible and ensures a significant improvement in control performance compared to an arbitrarily tuned PI controller.
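
A minimal sketch of the bumpless-switching idea, under simplifying assumptions: states are discretized control errors, actions are discretized controller outputs, and the Q-matrix is seeded so that the initial greedy action reproduces a proportional controller's output before tabular Q-learning starts improving it.

    # Sketch: tabular Q-learning seeded from a preliminary P(I) tuning so
    # the greedy policy initially matches the existing controller (bumpless
    # switching). Toy first-order plant; all values are assumptions.
    import numpy as np

    errors = np.linspace(-1.0, 1.0, 21)      # discretized state space
    actions = np.linspace(-1.0, 1.0, 11)     # discretized manipulated variable
    KP = 0.8                                 # preliminary proportional gain

    # Seed Q so argmax over actions reproduces u = KP * e in every state.
    Q = -np.abs(actions[None, :] - KP * errors[:, None])

    ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1
    rng = np.random.default_rng(2)

    def plant(e, u):
        return 0.9 * e - 0.2 * u             # toy first-order error dynamics

    for episode in range(200):
        e = rng.uniform(-1, 1)
        for _ in range(50):
            s = int(np.argmin(np.abs(errors - e)))
            if rng.random() < EPS:
                a = int(rng.integers(len(actions)))
            else:
                a = int(np.argmax(Q[s]))
            e = plant(e, actions[a])
            s2 = int(np.argmin(np.abs(errors - e)))
            r = -e ** 2                      # reward: small error is good
            Q[s, a] += ALPHA * (r + GAMMA * Q[s2].max() - Q[s, a])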

Authors and Affiliations

Jakub Musial
1
Krzysztof Stebel
1
Jacek Czeczot
1

  1. Silesian University of Technology, Faculty of Automatic Control, Electronics and Computer Science, Department of Automatic Control and Robotics, 44-100 Gliwice, ul. Akademicka 16, Poland

Abstract

In promoting the construction of prefabricated residential buildings in the villages and towns of Yunnan, the use of precast concrete elements is inevitable. Due to the dense arrangement of steel bars at the joints of precast concrete elements, collisions are prone to occur, which can affect the stress state of the components and even pose safety hazards for the entire construction project. Because the commonly used steel bar obstacle avoidance method based on building information modeling (BIM) has a low adaptation rate and cannot change the trajectory of a steel bar to avoid a collision, a multi-agent reinforcement learning model integrating BIM is proposed to resolve steel bar collisions in reinforced concrete frames. The experimental results show that the obstacle avoidance probability of the proposed model at three typical beam-column joints is 98.45%, 98.62% and 98.39% respectively, which is 5.16%, 12.81% and 17.50% higher than that of the BIM-based method. For collision-free path design of the same object, designing paths for different types of precast concrete elements takes about 3–4 minutes, far less than the time an experienced structural engineer spends on collision-free path modeling. These results indicate that the constructed model performs well and offers useful reference value.
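
A single-agent slice of the idea can be sketched as a rebar agent learning a collision-free route on a discretized joint region (the grid, obstacle layout, and rewards below are hypothetical; the full model trains multiple agents against BIM geometry):

    # Sketch: one rebar agent learns a collision-free path on a small grid.
    # Obstacles stand in for already-placed bars; all values are assumed.
    import numpy as np

    GRID = 6
    obstacles = {(2, 2), (2, 3), (3, 3)}            # hypothetical bar positions
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    Q = np.zeros((GRID, GRID, len(moves)))
    rng = np.random.default_rng(3)
    start, goal = (0, 0), (GRID - 1, GRID - 1)

    for episode in range(3000):
        pos = start
        for _ in range(60):
            a = int(rng.integers(4)) if rng.random() < 0.2 else int(np.argmax(Q[pos]))
            nxt = (pos[0] + moves[a][0], pos[1] + moves[a][1])
            if not (0 <= nxt[0] < GRID and 0 <= nxt[1] < GRID) or nxt in obstacles:
                r, nxt = -10.0, pos                 # collision: penalize, stay put
            elif nxt == goal:
                r = 10.0                            # reached the target anchorage
            else:
                r = -0.1                            # step cost favors short paths
            Q[pos][a] += 0.1 * (r + 0.95 * Q[nxt].max() - Q[pos][a])
            pos = nxt
            if pos == goal:
                break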

Authors and Affiliations

Hong Chai
1
Junchao Guo
1

  1. Yellow River Conservancy Technical Institute, Department of Civil Engineering and Transportation Engineering, 475000 Kaifeng, China

Abstract

This paper is written by a group of Ph.D. students pursuing their work in different areas of ICT, outside the direct area of Information Quantum Technologies (IQT). An ambitious task was undertaken: each co-author researched the potential practical influence of current IQT development on their own work. The co-authors' research spans the following areas of ICT: CMOS for IQT, quantum error correction (QEC), quantum time-series forecasting, and IQT in biomedicine. The authors' intention is to show how quickly quantum techniques may, in the near future, penetrate other areas of ICT, namely their own.

Authors and Affiliations

Bogdan J. Bednarski
1
Łukasz E. Lepak
1
Jakub J. Łyskawa
1
Paweł Pieńczuk
1
Maciej Rosoł
1
Ryszard S. Romaniuk
1

  1. Warsaw University of Technology, Warsaw, Poland
