Picture books are among the most important reading materials for preschool children. Traditional picture books rely on adults' reading guidance, which limits children's creativity and emotional expression. Electronic picture books introduce problems of their own: on one hand, their novel interaction styles are usually designed for entertainment rather than for improving attention and cognitive understanding; on the other hand, interactive picture books focus excessively on the visual aesthetics of the book itself, while the aesthetics of the child's interaction with the book is neglected. To address the limitations of both traditional and interactive picture books, this design starts from the aesthetics of interaction and explores how to improve children's reading concentration and cognitive memory, how to help children better understand the information conveyed by the story, and how to encourage children to express their emotions within the book. Based on these goals, this thesis studies the design and implementation of children's picture books grounded in the aesthetics of interaction, proposes the PCA (Perception, Cognition, Affection) interaction-aesthetics evaluation model for children's electronic picture books, and builds Icoobook, an electronic picture-book system for children based on the aesthetics of interaction. Experimental results show that when children read Icoobook and a traditional paper picture book with the same story content, Icoobook improves children's reading attention by 60%, their cognitive effect by 39.10%, and their emotional satisfaction with reading by 65.75% compared with the paper book.

The contributions of this thesis can be summarized on two levels. First, it proposes the PCA interaction-aesthetics concept model for children's electronic picture books, which not only provides a design method for such books but also serves as a standard for user-study evaluation. The model is organized as follows: (1) at the perception level, a "click-to-animate" interaction mode links children's vision, hearing, and touch to improve their concentration during reading; (2) at the cognition level, the "magic brush" and "stroke-linking drawing" interaction modes strengthen children's cognition through a "perceive, imitate, understand, express" interaction process, improving cognitive memory during reading; (3) at the affection level, a "poetry reading" interaction mode analyzes the emotion in children's voices as they read aloud and, through immediate color feedback in the poem's illustrations, visually guides children to express their emotions and gradually appreciate the imagery of the poetry.

Second, this thesis adopts an intelligent speech emotion recognition algorithm based on a multi-path deep neural network to identify children's emotional states while they read poetry. In this framework, text features and speech features are first fed into local sub-networks that are trained in groups to obtain local classification results; the convolutional features of the sub-networks are then concatenated into a high-level feature vector and fed into a global classifier; the final emotion recognition result is a weighted combination of the outputs of all local classifiers and the global classifier. The whole model shares a single objective function and is trained end to end. This multi-path architecture effectively avoids vanishing gradients caused by excessively high input feature dimensionality, thereby strengthening the network's learning capacity and improving emotion recognition accuracy.
Picture books are considered beneficial for children in many ways, e.g., helping improve their concentration and memory while reading. However, traditional picture books are limited to paper media and usually require adults' guidance in daily use, which inevitably limits children's creativity and emotional expression to some extent. Recently, more and more electronic picture books have come into view, such as mixed-media picture books, touch-and-feel picture books, game picture books, and so on. Yet these electronic picture books have two major disadvantages: on one hand, their fancy interaction styles are generally designed for entertainment rather than for improving attention to and understanding of the books' content; on the other hand, only the beauty of the product's appearance is taken into account in their design, while the aesthetics of affective interaction in children's reading process is ignored. To address these issues, we propose a novel PCA (Perception & Cognition & Affection) model from the perspective of the aesthetics of interaction. Based on this model, we build a new electronic interactive picture book for children, named Icoobook. In detail, our product is designed according to the following three levels. First, at the level of perception, it provides multi-sensory interaction interfaces including click-to-animation (vision), click-to-sound (hearing), finger painting (touch), and so on. Second, at the level of cognition, it builds immersive interactive scenes including "games of linking strokes", in which users help characters in the picture book pass obstacles by linking suitable strokes, and "magic brush", in which users give characters useful items (or food) by drawing them.
Third, at the level of affection, it creates high-level interaction modes based on emotion recognition, including "reading poetry", in which users read poetry with different emotions (e.g., happy) and the characters in the picture book then receive matching illustrations in a corresponding style (e.g., warm colors). The experimental results show that children are more concentrated when reading Icoobook than when reading the traditional paper picture book: the frequency of distraction is reduced by 60%, the cognitive effect is increased by 39.10%, and the emotional satisfaction of reading is increased by 60.00%.

The main contributions of this work can be summarized as follows. Firstly, we propose a novel model, PCA, that provides clear guidance not only in the stage of designing products from the perspective of interaction, but also in the stage of user studies. Secondly, we build a new electronic picture book, Icoobook, that helps children not only pay more attention to reading but also gain a better understanding of the content, and furthermore guides children to appreciate the beauty of deep affective interaction. Thirdly, we propose a novel supervised multi-path deep neural network framework. Unlike existing works that feed the whole feature set into a single classifier, the proposed framework trains the raw features in groups through local classifiers to avoid the problems of high input dimensionality. The high-level features of the local classifiers are then concatenated as the input of a global classifier. More importantly, these two kinds of classifiers are trained simultaneously through a single objective function to achieve more effective and discriminative emotion inference.
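The multi-path fusion described above can be sketched as a single forward pass: each feature group runs through its own local sub-network and classifier, the paths' hidden features are concatenated into a global classifier, and the final prediction is a weighted combination of all outputs. The sketch below uses plain NumPy with random weights; all layer sizes, the four emotion classes, and the fusion weights are illustrative assumptions, not the thesis's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def local_path(x, hidden_dim, n_classes):
    """One local sub-network: a hidden layer plus its own classifier.
    Returns the high-level features and the local classification result."""
    W1 = rng.normal(0, 0.1, (x.shape[-1], hidden_dim))
    Wc = rng.normal(0, 0.1, (hidden_dim, n_classes))
    h = np.tanh(x @ W1)          # high-level features of this path
    p = softmax(h @ Wc)          # local prediction over emotion classes
    return h, p

n_classes = 4                    # e.g. happy/sad/angry/neutral (assumed)
text_feat = rng.normal(size=(1, 32))    # text feature group (toy dims)
audio_feat = rng.normal(size=(1, 48))   # speech feature group (toy dims)

# Each feature group is handled by its own local sub-network.
h_text, p_text = local_path(text_feat, 16, n_classes)
h_audio, p_audio = local_path(audio_feat, 16, n_classes)

# Concatenate the paths' high-level features for the global classifier.
h_global = np.concatenate([h_text, h_audio], axis=-1)
Wg = rng.normal(0, 0.1, (h_global.shape[-1], n_classes))
p_global = softmax(h_global @ Wg)

# Final result: weighted combination of local and global outputs
# (fixed weights here; in training these would be learned jointly
# under the shared objective).
w_text, w_audio, w_global = 0.25, 0.25, 0.5
p_final = w_text * p_text + w_audio * p_audio + w_global * p_global
emotion = int(p_final.argmax())
```

Because each group sees only its own slice of the input, no single classifier has to cope with the full concatenated feature dimensionality, which is the mechanism the thesis credits with mitigating vanishing gradients.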