Purpose
This work combines cloud robotics technologies with deep reinforcement learning to build a distributed training architecture and accelerate the learning procedure of autonomous systems. In particular, a distributed training architecture for navigating unmanned aerial vehicles (UAVs) in complicated dynamic environments is proposed.

Design/methodology/approach
This study proposes a distributed training architecture named experience-sharing learner-worker (ESLW) for deep reinforcement learning to navigate UAVs in dynamic environments, inspired by cloud-based techniques. In the ESLW architecture, multiple worker nodes operating in different environments generate training data in parallel, and a learner node then trains a policy on the data collected by the worker nodes. In addition, the study proposes an extended experience replay (EER) strategy so that the method can be applied to experience sequences, improving training efficiency. To capture the dynamics of the environment, convolutional long short-term memory (ConvLSTM) modules are adopted to extract spatiotemporal information from training sequences.

Findings
Experimental results demonstrate that the ESLW architecture and the EER strategy accelerate convergence, and that the ConvLSTM modules are effective at extracting sequential information when navigating UAVs in dynamic environments.

Originality/value
Inspired by cloud robotics technologies, this study proposes a distributed ESLW architecture for navigating UAVs in dynamic environments. In addition, the EER strategy is proposed to speed up training on experience sequences, and ConvLSTM modules are added to the networks to make full use of sequential experiences.
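Since only the abstract is available here, the following is a minimal structural sketch of the learner-worker pattern it describes, assuming a multiprocessing-queue implementation; all names and parameters (worker, SequenceReplayBuffer, SEQ_LEN and so on) are illustrative assumptions, not taken from the paper. Workers collect fixed-length experience sequences in parallel and share them with a single learner, which stores whole sequences (rather than single transitions, in the spirit of the EER strategy) and samples sequence minibatches for updates.

```python
# Hypothetical sketch of an experience-sharing learner-worker (ESLW) loop.
# Names, parameters and the toy "environment" are illustrative, not from the paper.
import random
from collections import deque
from multiprocessing import Process, Queue

SEQ_LEN = 8           # length of each stored experience sequence (assumption)
NUM_WORKERS = 4       # number of parallel worker nodes (assumption)
BUFFER_CAPACITY = 10_000
BATCH_SIZE = 32

def worker(worker_id: int, queue: Queue, episodes: int = 20) -> None:
    """Worker node: interacts with its own copy of the environment and
    pushes fixed-length experience sequences to the shared queue."""
    for _ in range(episodes):
        sequence = []
        state = random.random()                 # stand-in for an env reset
        for _ in range(SEQ_LEN):
            action = random.choice([0, 1])      # stand-in for the policy
            next_state, reward = random.random(), random.random()
            sequence.append((state, action, reward, next_state))
            state = next_state
        queue.put((worker_id, sequence))

class SequenceReplayBuffer:
    """Replay over whole sequences rather than single transitions, so that
    recurrent modules (e.g. ConvLSTM) see coherent histories."""
    def __init__(self, capacity: int) -> None:
        self.sequences = deque(maxlen=capacity)

    def add(self, sequence) -> None:
        self.sequences.append(sequence)

    def sample(self, batch_size: int):
        return random.sample(self.sequences, min(batch_size, len(self.sequences)))

def learner(queue: Queue, total_updates: int = 200) -> None:
    """Learner node: drains sequences produced by all workers, then performs
    policy updates from sampled sequence minibatches."""
    buffer = SequenceReplayBuffer(BUFFER_CAPACITY)
    for step in range(total_updates):
        while not queue.empty():
            _, sequence = queue.get()
            buffer.add(sequence)
        batch = buffer.sample(BATCH_SIZE)
        # A real implementation would run the ConvLSTM-based networks and a
        # gradient update here; this sketch only reports the batch size.
        if batch and step % 50 == 0:
            print(f"update {step}: {len(batch)} sequences in batch")

if __name__ == "__main__":
    q: Queue = Queue()
    workers = [Process(target=worker, args=(i, q)) for i in range(NUM_WORKERS)]
    for p in workers:
        p.start()
    learner(q)
    while not q.empty():   # drain leftovers so worker processes can exit cleanly
        q.get()
    for p in workers:
        p.join()
```

The queue-based split mirrors the division of labour in the abstract: data generation scales out across worker nodes while learning stays centralized, so the learner's sample throughput grows with the number of workers. How the paper's EER strategy prioritizes or truncates sequences is not described in the abstract, so this sketch uses plain uniform sampling.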
Assembly Automation – Emerald Publishing
Published: Jul 22, 2021
Keywords: Unmanned aerial vehicles; Deep reinforcement learning; Cloud robotics; Dynamic navigation; Learning and adaptive systems