
AniDance: An Intelligent Dance Motion Synthesis Platform Based on Music and Animation Interaction

Author: Taoran Tang
  • Student ID
    2015******
  • Degree
    Master's
  • Email
    189******com
  • Defense Date
    2018.05.23
  • Advisor
    Jia Jia
  • Discipline
    Computer Science and Technology
  • Pages
    60
  • Confidentiality Level
    Public
  • Affiliation
    024 Department of Computer Science
  • Keywords
    Multi-sensory human-computer interaction, 3D motion dataset, motion synthesis, LSTM

Abstract

Dance helps people harmoniously engage their auditory, motor, and visual senses, improving learning ability and promoting brain development. Throughout human history, dance has been strongly shaped by music: people usually express dance movements according to the emotion of the music, a form widely used in social, ceremonial, and festival activities. Research has shown that art teaching in a form that engages multiple senses at once helps improve human cognition and interest in learning. Neural networks are now widely applied to music emotion recognition and motion prediction, making such an interactive form feasible. This thesis therefore proposes an interactive dance-generation application and its underlying algorithm, which can advance fields such as music-aided dance teaching, character motion generation in audio games, and research on human behavior.

Dance performance is heavily influenced by music. Although previous studies have devoted substantial effort to learning the relationship between music and dance, building an animation platform with a novel interaction that generates professional dance motions from music remains an open problem. We face four main challenges: 1) how to coordinate multiple human senses simultaneously in human-computer interaction; 2) how to select dance movements appropriate to the music, i.e., the standard figures of professional dance teaching; 3) how to stylize the dance movements artistically according to the music; and 4) the lack of a sufficient dataset for model training. To address these problems, we built AniDance, an intelligent dance motion synthesis platform based on music and animation interaction, which lets a 3D character dance along with the user's singing. Our main contributions are as follows:

1. A human-computer interaction interface based on music and animation. It comprises a system control interface, a music input interface, and a motion display interface; it outputs dance animation from the user's singing or music and helps the user imitate the motions, establishing a multi-sensory interactive experience spanning hearing, vision, and kinesthesia.

2. A long short-term memory neural network model with an autoencoder (LSTM-AutoEncoder). It uncovers the correlations between acoustic features and the internal patterns of motion, establishes a mapping from acoustic features to motion features, and is improved with temporal indexes and a masking layer to optimize motion prediction.

3. The largest known dataset of music-driven 3D dance motions. To address the incomplete accompaniment information and missing correspondence between consecutive motions in existing datasets, we constructed a dataset of 3D dance motions with accompaniment covering four dance types and totaling 907,200 frames, and extracted multidimensional features from it for model training and optimization.

4. A series of quantitative and qualitative experiments to optimize the model and validate AniDance's motion generation. During these experiments we also examined the value of combining loss functions with user evaluation for model optimization. The results show that, while fully engaging multiple user senses, AniDance generates reasonable and professional dance motions from music.
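To make the LSTM-AutoEncoder idea above concrete, here is a minimal, hedged sketch in PyTorch of a model that maps a sequence of per-frame acoustic features to per-frame 3D joint positions. The framework choice, all layer sizes, the feature dimensions, and the names `LSTMAutoEncoder`, `acoustic_dim`, and `motion_dim` are illustrative assumptions, not the thesis's actual implementation.

```python
# Minimal sketch: sequence-to-sequence mapping from acoustic features to
# 3D joint positions with an LSTM encoder-decoder. All hyperparameters
# below are illustrative guesses, not values from the thesis.
import torch
import torch.nn as nn

class LSTMAutoEncoder(nn.Module):
    def __init__(self, acoustic_dim=16, hidden_dim=128, motion_dim=63):
        super().__init__()
        # Encoder LSTM compresses the acoustic feature sequence.
        self.encoder = nn.LSTM(acoustic_dim, hidden_dim, batch_first=True)
        # Decoder LSTM expands hidden states back into a motion sequence.
        self.decoder = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        # Linear head maps each hidden state to per-frame joint coordinates
        # (e.g., 21 joints x 3 coordinates = 63 values per frame).
        self.head = nn.Linear(hidden_dim, motion_dim)

    def forward(self, acoustic_seq):
        # acoustic_seq: (batch, frames, acoustic_dim)
        hidden_seq, _ = self.encoder(acoustic_seq)
        decoded_seq, _ = self.decoder(hidden_seq)
        return self.head(decoded_seq)  # (batch, frames, motion_dim)

model = LSTMAutoEncoder()
dummy_music = torch.randn(2, 90, 16)   # 2 clips, 90 frames each
dummy_motion = model(dummy_music)      # -> shape (2, 90, 63)
```

In practice, variable-length clips would be handled with a masking mechanism (e.g., `torch.nn.utils.rnn.pack_padded_sequence`) and a temporal index channel appended to the acoustic features, in the spirit of the improvements described in the abstract.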

Dance helps people harmoniously utilize their auditory, motor, and visual senses, promoting both learning ability and brain development. In the history of human development, people have usually performed dance movements based on the emotions of music, a dancing form widely used in social, ceremonial, and festival activities. Research has shown that art teaching in a multi-sensory form can help people improve their learning ability and perception. Meanwhile, existing computer technology has created the possibility of innovations in human-computer interaction. Therefore, this thesis proposes a multi-sensory human-computer interaction with a music-oriented dance motion synthesis algorithm, which can benefit fields such as music-aided dance teaching, motion generation for characters in audio games, and research on human behavior.

Dance is greatly influenced by music. Although previous studies have spent great effort on learning the relationship between music and dance, how to synthesize appropriate music-oriented dance choreography through a creative interaction remains an open problem. There are four major challenges: 1) how to realize a multi-sensory human-computer interaction; 2) how to choose appropriate dance figures based on the music as the standard figures of professional dance teaching; 3) how to modify the choreography aesthetically as the music changes; and 4) the lack of data for model training. To solve these problems, we propose a multi-sensory interaction with a music-oriented dance choreography synthesis algorithm that allows 3D characters to dance along with users' singing or music. Our study can be summarized as follows: 1. We used an LSTM-AutoEncoder model to uncover the inner patterns linking acoustic features and motion features. 2. We improved the model with temporal indexes and a masking method to achieve better performance. 3. We constructed a music-dance dataset covering four types of dance choreography and containing a total of 907,200 frames of 3D dance motions with accompanying music, and extracted multidimensional features from it for model training. 4. We conducted several qualitative and quantitative experiments to select the best-fitting model, which showed our model to be effective and efficient at synthesizing reasonable choreographies that express the music.
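As a hedged sketch of the multidimensional feature extraction step, the following Python snippet uses librosa to compute per-frame MFCCs and onset strength and appends a normalized temporal index, mirroring the temporal-index idea mentioned above. The specific feature set, the `hop_length` value, and the helper name `extract_features` are assumptions for illustration; the thesis's actual acoustic features are not specified here.

```python
# Sketch of per-frame acoustic feature extraction with librosa.
# The feature choice (MFCC + onset strength) is a common setup for
# music-driven motion work and is only an illustrative assumption.
import librosa
import numpy as np

def extract_features(path, sr=22050, hop_length=512):
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                                hop_length=hop_length)        # (13, frames)
    onset = librosa.onset.onset_strength(y=y, sr=sr,
                                         hop_length=hop_length)  # (frames,)
    # Trim to a common frame count, then append a normalized temporal
    # index channel, echoing the temporal indexes described above.
    n = min(mfcc.shape[1], onset.shape[0])
    t_index = np.linspace(0.0, 1.0, n)[None, :]
    features = np.vstack([mfcc[:, :n], onset[None, :n], t_index])
    return features.T  # (frames, 15): one feature vector per frame
```

A feature matrix like this would be the `acoustic_seq` input to the model sketched earlier, with one row per animation frame.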