Reinforcement Learning-Based Design of an Underwater Robotic Manipulator for Enhanced Gripping Capabilities

Author: 刘宇恒
  • Student ID
    2020******
  • Degree
    Master's
  • Email
    105******com
  • Defense date
    2023.05.15
  • Supervisor
    陈道毅
  • Discipline
    Electronic Information
  • Pages
    96
  • Confidentiality
    Public
  • Degree-granting unit
    599 International Graduate School
  • Keywords
    underwater robotic arm, reinforcement learning, twin delayed deep deterministic policy gradient (TD3), simulation experiment

Abstract

With the rapid development of human society, the demand for natural resources keeps growing. The vast ocean is rich in oil, gas, and biological resources, and in recent years countries around the world have stepped up their exploration of it. The underwater environment, however, is extremely harsh and complex, so marine operations usually have to be carried out with underwater robotic manipulators. In this setting, traditional control schemes suffer from high experimental cost, poor environmental adaptability, and a low level of automation. With the development of artificial intelligence, reinforcement learning allows a manipulator to learn autonomously in its operating environment: an agent is built up through repeated training on complex environmental tasks and then makes the decisions those tasks require. Addressing the practical problem of underwater manipulator grasping, this thesis builds a Jetson+FPGA hardware experimental platform, combines it with an underwater vision system consisting of a PCO Edge underwater high-speed camera and a RealSense camera, and implements a complete underwater manipulator grasping experimental system on the ROS framework. The twin delayed deep deterministic policy gradient (TD3) algorithm is used to build a manipulator simulation environment; the model parameters are determined and an agent model is trained through simulation experiments; ROS nodes are written to connect the simulation environment to the real one; and the resulting model is finally validated experimentally.
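As a purely illustrative sketch (not the thesis code), the snippet below shows how a TD3 agent of the kind described above could be trained with the off-the-shelf Stable-Baselines3 implementation; the environment id and all hyperparameters are placeholder assumptions rather than values taken from the thesis.

```python
# Minimal illustrative sketch (not the thesis code): training a TD3 agent on a
# hypothetical Gymnasium environment for underwater-arm grasping. The env id
# and hyperparameters below are placeholders for illustration only.
import gymnasium as gym
import numpy as np
from stable_baselines3 import TD3
from stable_baselines3.common.noise import NormalActionNoise

env = gym.make("UnderwaterArmGrasp-v0")        # placeholder environment id
n_actions = env.action_space.shape[0]

# Gaussian exploration noise on the continuous joint actions.
action_noise = NormalActionNoise(mean=np.zeros(n_actions),
                                 sigma=0.1 * np.ones(n_actions))

model = TD3(
    "MlpPolicy",
    env,
    action_noise=action_noise,
    learning_rate=1e-3,
    policy_delay=2,   # delayed policy updates, the "twin delayed" part of TD3
    verbose=1,
)
model.learn(total_timesteps=500_000)
model.save("td3_underwater_arm")
```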

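The simulation-to-real interaction via ROS nodes could take roughly the following shape; the topic names, message types, and observation layout here are assumptions for illustration, not the thesis's actual interfaces.

```python
# Illustrative sketch of a simulation-to-real bridge node (assumed structure,
# not the thesis's actual node): it subscribes to the arm's joint states,
# feeds a simplified observation to the trained TD3 policy, and publishes the
# resulting joint commands on a placeholder topic.
import numpy as np
import rospy
from sensor_msgs.msg import JointState
from std_msgs.msg import Float64MultiArray
from stable_baselines3 import TD3


class PolicyBridge:
    def __init__(self):
        self.model = TD3.load("td3_underwater_arm")           # trained agent
        self.latest_obs = None
        rospy.Subscriber("/arm/joint_states", JointState, self.on_state)
        self.cmd_pub = rospy.Publisher("/arm/position_cmd",   # placeholder topic
                                       Float64MultiArray, queue_size=10)

    def on_state(self, msg):
        # The real system would also fold in visual observations; here the
        # joint positions alone stand in as a simplified observation vector.
        self.latest_obs = np.array(msg.position, dtype=np.float32)

    def spin(self, hz=10):
        rate = rospy.Rate(hz)
        while not rospy.is_shutdown():
            if self.latest_obs is not None:
                action, _ = self.model.predict(self.latest_obs,
                                               deterministic=True)
                self.cmd_pub.publish(Float64MultiArray(data=action.tolist()))
            rate.sleep()


if __name__ == "__main__":
    rospy.init_node("policy_bridge")
    PolicyBridge().spin()
```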
To handle the obstacles that may be encountered in different underwater tasks, the thesis summarizes the characteristics of underwater operating environments and proposes an environment-model-matching plus dynamic-penalty-point scheme: computer vision identifies the environment and invokes the corresponding trained model, and dynamic penalty points representing obstacles are added to that model, yielding a two-level obstacle-avoidance mechanism.

Experiments show that, for the underwater grasping environments considered here, the trained models achieve success rates above 90% in simulation, indicating that the TD3 algorithm is both effective for underwater manipulator grasping and able to generalize to some extent. In the real experimental environment, the system achieves success rates above 80% across all test cases, verifying the feasibility of the overall system. These results show that the proposed manipulator control scheme and reinforcement learning model can effectively help an underwater manipulator accomplish the corresponding grasping tasks.
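The dynamic-penalty-point idea can be pictured with the following reward-shaping sketch; the distances, thresholds, and gains are assumptions for illustration and not the reward function used in the thesis.

```python
# Illustrative sketch of reward shaping with a dynamic penalty point
# (an assumption for illustration, not the thesis's exact reward function).
import numpy as np

def shaped_reward(ee_pos, target_pos, obstacle_pos=None,
                  reach_tol=0.02, safe_radius=0.10, penalty_gain=2.0):
    """Dense reaching reward plus a penalty near a virtual obstacle point."""
    ee_pos, target_pos = np.asarray(ee_pos), np.asarray(target_pos)
    dist_to_target = np.linalg.norm(ee_pos - target_pos)
    reward = -dist_to_target                         # move toward the target
    if dist_to_target < reach_tol:
        reward += 10.0                               # sparse success bonus
    if obstacle_pos is not None:
        dist_to_obstacle = np.linalg.norm(ee_pos - np.asarray(obstacle_pos))
        if dist_to_obstacle < safe_radius:
            # Penalty ramps up linearly as the arm enters the safety radius,
            # discouraging trajectories that pass through the obstacle region.
            reward -= penalty_gain * (safe_radius - dist_to_obstacle) / safe_radius
    return reward

# Example: target at (0.4, 0.0, 0.2) m with a virtual obstacle near the path.
r = shaped_reward([0.30, 0.05, 0.20], [0.40, 0.00, 0.20],
                  obstacle_pos=[0.32, 0.04, 0.20])
```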