人体骨肌系统知识图谱构建及应用

The Construction of the Human Musculoskeletal System's Knowledge Graph and Its Applications

Author: 杨然
  • Student ID
    2020******
  • Degree
    Master's
  • Email
    leo******com
  • Defense date
    2023.05.22
  • Advisor
    王学谦
  • Discipline
    Electronic Information
  • Pages
    80
  • Confidentiality level
    Public
  • Department
    025 Department of Automation
  • Chinese keywords
    人体骨肌系统, 知识图谱, 信息抽取, 数据增强, 人体步态分析
  • English keywords
    human musculoskeletal system, knowledge graph, information extraction, data augmentation, human gait analysis

Abstract

Sports injuries harm the physical condition, health, and lifespan of a large number of people worldwide every year, yet the number of sports rehabilitation professionals in China remains limited. Looking up rehabilitation knowledge means filtering valuable information out of a large volume of disorganized material, a process that is inefficient and often inaccurate. A knowledge graph organizes large amounts of data into a graph structure in which nodes represent concepts or entities and edges represent the relationships between them, enabling computers to reason and make decisions based on that knowledge. The methods and results of this thesis have broad application prospects in sports rehabilitation, such as assisting professionals in formulating rehabilitation programs and recommending personalized rehabilitation training. Knowledge graph technology can also be extended to other medical fields to support the construction of more comprehensive medical knowledge systems. The main work and results of this thesis are as follows:

1. A Chinese musculoskeletal dataset for triple extraction was constructed. Given the scarcity of open-source Chinese datasets on the human musculoskeletal system, this thesis built a dataset covering the musculoskeletal system and rehabilitation from relevant books and materials, including information on bones, muscles, joints, and muscle testing and training methods. A data augmentation algorithm based on non-entity synonym replacement in triple datasets was also proposed. In the absence of a data augmentation algorithm for annotated entity-relation triple extraction datasets, the method replaces non-entity text with synonyms, augmenting the text data without changing the entities in the text or the relationships between them.

2. A method for constructing a knowledge graph of the human musculoskeletal system was proposed, comprising data collection for the musculoskeletal system and sports rehabilitation domains, ontology construction, information extraction, and knowledge fusion. Within this method, a character-matching-based information extraction method was improved in three ways: Pinyin feature representations were introduced to strengthen the semantic expressiveness of the word embeddings; a bidirectional mechanism was introduced into the string-matching label recognition process to improve the labels' understanding of context; and a pre-trained model trained on a large Chinese clinical medical corpus was used to improve performance on Chinese information extraction in sports medicine. Experimental results showed that the improved method achieved good results on entity recognition and relation extraction.

3. Two applications were implemented on top of the constructed knowledge graph. The first uses a thousand gait data samples to compute the standard angles and ranges of joint motion during walking and, combined with the knowledge graph, assesses the tension and activation levels of the test subjects' lower limb muscles. The second is a rule-based knowledge graph question-answering system that generates muscle testing and training plans.
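
The non-entity synonym replacement described in point 1 can be illustrated with a minimal sketch. This is not the thesis's implementation: the synonym table `SYNONYMS`, the function names `replace_in_segment` and `augment`, and the representation of entities as non-overlapping character-offset spans are all assumptions made for illustration. The sketch copies annotated entity spans verbatim, replaces dictionary words only in the text between them, and recomputes the entity offsets so the triple annotations stay valid.

```python
import random

# Hypothetical synonym table; the abstract does not say where synonyms come from.
SYNONYMS = {
    "位于": ["处于"],
    "连接": ["连结"],
}


def replace_in_segment(segment, synonyms, rng):
    """Replace dictionary words in a non-entity segment with a random synonym."""
    # Try longer keys first so a long word is not split by a shorter one.
    keys = sorted(synonyms, key=len, reverse=True)
    out = []
    i = 0
    while i < len(segment):
        for key in keys:
            if segment.startswith(key, i):
                out.append(rng.choice(synonyms[key]))
                i += len(key)
                break
        else:
            # No dictionary word starts here: keep the character unchanged.
            out.append(segment[i])
            i += 1
    return "".join(out)


def augment(text, entity_spans, synonyms=SYNONYMS, rng=None):
    """Return (new_text, new_spans) with entities untouched.

    entity_spans: non-overlapping (start, end) character offsets, end exclusive
    (an assumed annotation format). new_spans are returned sorted by position.
    """
    rng = rng or random.Random(0)
    pieces, new_spans = [], []
    cursor = 0  # read position in the original text
    length = 0  # number of characters written so far
    for start, end in sorted(entity_spans):
        gap = replace_in_segment(text[cursor:start], synonyms, rng)
        pieces.append(gap)
        length += len(gap)
        entity = text[start:end]  # copied verbatim, never rewritten
        pieces.append(entity)
        new_spans.append((length, length + len(entity)))
        length += len(entity)
        cursor = end
    pieces.append(replace_in_segment(text[cursor:], synonyms, rng))
    return "".join(pieces), new_spans


if __name__ == "__main__":
    sentence = "股四头肌位于大腿前侧"
    spans = [(0, 4), (6, 10)]  # entities: 股四头肌, 大腿前侧
    print(augment(sentence, spans))
```

Because entity spans are copied unchanged and only their offsets are shifted, the relation labels attached to each entity pair can be reused for the augmented sentence without re-annotation.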