Smart rings equipped with an inertial measurement unit (IMU) are naturally suited to everyday wear and to sensing hand interactions, owing to their compact, lightweight form factor worn directly on the finger. However, a ring's sensing capability is constrained by its size and sensor precision, which makes fine-grained, accurate perception of the user's hand movements challenging. Following three approaches, namely constructing and incorporating physical prior knowledge, fusing information across sensors, and user-personalized learning, this work proposes three novel smart-ring interaction techniques. They aim to obtain, from both the human and the machine side, additional information beyond the sensors themselves to assist hand-movement recognition and thereby improve the accuracy and efficiency of the underlying machine learning methods. Minimal illustrative sketches of the three techniques follow the abstract.

(1) We propose MouseRing, an IMU-ring technique that turns any flat surface into a touchpad. We identify a set of physical constraints that index-finger movements satisfy and incorporate them as prior knowledge into the machine learning algorithm, achieving accurate and robust tracking of fingertip motion on a plane. In a user study, MouseRing reached input efficiency close to that of a laptop touchpad (658.5 ms vs. 629.1 ms), with a lighter interaction burden than portable pointing devices such as air mice.

(2) We propose a high-accuracy gesture recognition technique based on user-personalized data augmentation, which addresses the performance degradation that conventional machine learning methods suffer on new users. Using a variational autoencoder, we encode gesture data into a gesture semantic vector, a user style vector, and an intra-user variation vector, disentangling the informational composition of users' gesture data. By concatenating and decoding these vectors, we achieve controllable and interpretable personalized data augmentation. After a single round of gesture-input calibration by a new user, the method raises model accuracy from 92.5% to 94.7%.

(3) We propose a gesture recognition and hand-tracking technique that fuses smart-ring IMU data with visual data from a virtual reality headset. We introduce a multi-channel sensor-fusion framework based on fully convolutional networks, which effectively remedies the instability of vision-only solutions under occlusion, fast interactions, and similar conditions. The algorithm reaches a best detection accuracy of 96.1% across eight types of pinch events.
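As a rough illustration of contribution (1), the sketch below shows how physical priors could be injected into IMU dead reckoning for fingertip tracking. The abstract does not specify MouseRing's actual constraints, so the two priors used here (planar motion while touching, and zero-velocity updates at rest) and all names and values are assumptions, not the paper's method.

```python
# Hypothetical sketch: inject physical priors into IMU dead reckoning.
# The priors shown (planar motion, zero-velocity updates) are assumptions.
import numpy as np

def track_fingertip(acc_world, dt, stationary):
    """Dead-reckon fingertip position from world-frame acceleration.

    acc_world:  (T, 3) linear acceleration in m/s^2, gravity removed.
    dt:         sampling interval in seconds.
    stationary: (T,) boolean mask, True when the finger is judged at rest
                (e.g., from low accelerometer/gyroscope energy).
    """
    T = len(acc_world)
    vel = np.zeros((T, 3))
    pos = np.zeros((T, 3))
    for t in range(1, T):
        vel[t] = vel[t - 1] + acc_world[t] * dt
        # Prior 1 (assumed): zero-velocity update suppresses integration
        # drift whenever the finger is stationary on the surface.
        if stationary[t]:
            vel[t] = 0.0
        # Prior 2 (assumed): while touching, motion is confined to the
        # surface plane, so the out-of-plane (z) velocity is zeroed.
        vel[t, 2] = 0.0
        pos[t] = pos[t - 1] + vel[t] * dt
    return pos

# Toy usage: 1 s of noisy acceleration sampled at 100 Hz.
rng = np.random.default_rng(0)
acc = rng.normal(0.0, 0.1, size=(100, 3))
still = rng.random(100) < 0.2
print(track_fingertip(acc, 0.01, still)[-1])
```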
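For contribution (2), the recombination step of the personalized augmentation could look like the following sketch: a VAE latent code is split into semantic, style, and variation slices, and synthetic samples are decoded from mixed slices. Layer sizes, latent dimensions, and the split itself are illustrative assumptions rather than the paper's architecture.

```python
# Hypothetical sketch of VAE-based personalized data augmentation.
# All dimensions and layer sizes below are assumed for illustration.
import torch
import torch.nn as nn

SEM, STYLE, VAR = 16, 8, 8          # assumed sizes of the three latent slices
LATENT = SEM + STYLE + VAR
FRAMES, CHANNELS = 100, 6           # e.g., 100 IMU frames x (3 acc + 3 gyro)

class GestureVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Flatten(),
                                 nn.Linear(FRAMES * CHANNELS, 256), nn.ReLU(),
                                 nn.Linear(256, 2 * LATENT))
        self.dec = nn.Sequential(nn.Linear(LATENT, 256), nn.ReLU(),
                                 nn.Linear(256, FRAMES * CHANNELS))

    def encode(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        # Split the latent code into the three interpretable slices.
        return z.split([SEM, STYLE, VAR], dim=-1)

    def decode(self, sem, style, var):
        z = torch.cat([sem, style, var], dim=-1)
        return self.dec(z).view(-1, FRAMES, CHANNELS)

def augment(model, library_sample, calibration_sample, n=8):
    """Keep the gesture semantics of an existing sample, inject the new
    user's style from one calibration recording, and resample the
    intra-user variation slice to synthesize n new recordings."""
    sem, _, _ = model.encode(library_sample)
    _, style, _ = model.encode(calibration_sample)
    var = torch.randn(n, VAR)
    return model.decode(sem.expand(n, -1), style.expand(n, -1), var)

model = GestureVAE()
fake_library = torch.randn(1, FRAMES, CHANNELS)
fake_calibration = torch.randn(1, FRAMES, CHANNELS)
print(augment(model, fake_library, fake_calibration).shape)  # (8, 100, 6)
```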
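For contribution (3), a minimal multi-channel fully convolutional fusion network might look like the sketch below, assuming per-frame IMU channels from the ring and camera-derived hand-keypoint channels from the headset; channel counts and layer sizes are placeholders, not the paper's configuration.

```python
# Hypothetical sketch of multi-channel IMU + vision fusion with a 1D FCN.
# Channel counts and layer sizes are assumed for illustration.
import torch
import torch.nn as nn

IMU_CH, VISION_CH = 6, 63   # e.g., acc+gyro; 21 hand keypoints x 3 coords
N_EVENTS = 8                # eight pinch event types (plus background below)

class FusionFCN(nn.Module):
    def __init__(self):
        super().__init__()
        # Separate temporal convolution branch per modality.
        self.imu = nn.Sequential(nn.Conv1d(IMU_CH, 32, 5, padding=2), nn.ReLU())
        self.vision = nn.Sequential(nn.Conv1d(VISION_CH, 32, 5, padding=2), nn.ReLU())
        # The fused trunk stays fully convolutional, so any sequence length
        # works and the output keeps one prediction per input frame.
        self.head = nn.Sequential(
            nn.Conv1d(64, 64, 5, padding=2), nn.ReLU(),
            nn.Conv1d(64, N_EVENTS + 1, 1))   # +1 for the "no event" class

    def forward(self, imu, vision):
        # imu: (B, IMU_CH, T); vision: (B, VISION_CH, T)
        fused = torch.cat([self.imu(imu), self.vision(vision)], dim=1)
        return self.head(fused)   # (B, N_EVENTS + 1, T) frame-wise logits

net = FusionFCN()
logits = net(torch.randn(2, IMU_CH, 250), torch.randn(2, VISION_CH, 250))
print(logits.shape)  # torch.Size([2, 9, 250])
```

A fully convolutional design is one natural reading of the abstract's framework: because no fully connected layer fixes the sequence length, the same network yields frame-wise event probabilities for streams of arbitrary duration, which suits continuous pinch detection.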