发声交流是许多动物社会行为过程中重要的信息传递方式。在小鼠中,多种社交行为都伴随着超声波发声(USVs)。小鼠单次呼吸过程发出的声音片段被称为音节,其有多样的结构和宽泛的调频范围。为了研究小鼠超声交流中音节的发声动机与功能,我们从机器学习、计算机视觉、行为学和皮层成像等角度开展研究。我们首先构建了套包含4,464个USVs音节的频谱图像分割数据库,并开发了一套基于深度学习的音节识别和分割算法。经过5倍交叉验证,我们模型在音节分割任务中的IoU(Intersection over Union)达到0.8019 ± 0.0032。在音节识别任务中,我们的模型实现了0.9348 ± 0.017的准确率和0.9230 ± 0.0086的召回率,达到了state-of-the-art。我们进一步构建了卷积变分自动编码器,使用embedding对50,609个音节进行低维表征,结果发现小鼠不同音节特征呈现渐变的过程。利用深度学习,我们对小鼠社交行为进行了精细分析,结合音节embedding的结果,发现音节的特征与USVs发出者的行为状态密切相关。求偶过程中雄鼠的骑乘行为,以及领地竞争的雄鼠-雄鼠骑乘行为中,都会发出复杂变调和谐波结构的音节,说明其表达了一种优势社交状态的信息,而发情期的雌鼠对此类音节有明显且特异性的偏好。于此同时,我们还发现哺乳期母鼠对频率跳跃音节也有一定的偏好性。然而,在更加复杂的社会等级测试中,我们并没有发现USVs的明确功能。通过对自由社交状态小鼠的脑成像,我们发现皮层中存在对音节特异性响应的神经元,与纯频声音响应神经元相比,其采用更加稀疏和不确定的方式进行编码。这些结果表明小鼠USVs音节传递了具有特异性的但有限的交流信息。为了能够对自由社交小鼠进行脑成像,我们还研发了一套无线信号传输式超小荧光显微镜,其质量为2.7 g,采样率25 Hz,分辨率为640×480 pixel。综上所述,我们针对小鼠社会交流过程中发出的USVs,开发了一套识别和分析软件,结合行为的精细分析,初步揭示了USVs中不同音节对于小鼠交流过程中的功能。此外,无线超小显微镜为研究社交活动提供了更加强大的工具。
Vocal communication is an important channel of information transfer during the social behavior of many animals. In mice, many social behaviors are accompanied by ultrasonic vocalizations (USVs). The sound segment produced within a single breath is called a syllable; syllables vary widely in structure and in frequency modulation (FM) range. In this study, we combined machine learning, computer vision, behavioral experiments, and cortical imaging to investigate the vocal motivation and function of syllables in mouse ultrasonic communication.
We first constructed a spectrogram image segmentation dataset containing 4,464 USV syllables and developed a deep learning-based algorithm to detect and segment syllables in USVs. Under 5-fold cross-validation, our model reached an IoU (Intersection over Union) of 0.8019 ± 0.0032 on the syllable segmentation task. On the syllable recognition task, it reached an accuracy of 0.9348 ± 0.017 and a recall of 0.9230 ± 0.0086, achieving state-of-the-art performance. We further built a convolutional variational autoencoder to obtain low-dimensional embeddings of 50,609 syllables, and found that syllable features vary along a gradual continuum.
Using deep learning, we carried out a fine-grained analysis of mouse social behavior and, combined with the syllable embeddings, found that syllable features are closely related to the behavioral state of the USV emitter. Syllables with complex frequency modulation and harmonic structure were emitted both during male mounting in courtship and during male-male mounting in territorial competition, suggesting that they signal a dominant social state. Female mice in estrus showed a clear and specific preference for such syllables. We also found that lactating females showed some preference for frequency-jump syllables. However, we did not find a clear function for USVs in the more complex social hierarchy test.
Using brain imaging of freely socializing mice, we found cortical neurons that respond specifically to syllables; compared with pure-tone-responsive neurons, they encode sound in a sparser and less deterministic manner. These results suggest that mouse USV syllables convey specific but limited communicative information. To make such imaging possible, we also developed a wireless miniature fluorescence microscope (miniscope) with a mass of 2.7 g, a sampling rate of 25 Hz, and a resolution of 640 × 480 pixels, enabling brain imaging of multiple freely moving mice.
Overall, we developed detection and analysis software for the USVs emitted during social communication in mice and, combined with fine-grained behavioral analysis, obtained preliminary evidence for the functions of different USV syllables in mouse communication. In addition, the wireless miniscope provides a more powerful tool for studying social communication.
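For readers unfamiliar with the segmentation metric reported above, the following is a minimal sketch of how IoU can be computed for one pair of binary masks. It is an illustration only, not the evaluation code used in this work: the function name `mask_iou`, the toy masks, and the convention that two empty masks score 1.0 are all assumptions made for the example.

```python
def mask_iou(pred, target):
    """Intersection over Union of two equally sized binary masks.

    Masks are given as lists of rows containing 0/1 values
    (a hypothetical representation chosen for this sketch).
    """
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0  # pixels marked as syllable in both masks
    union = 0         # pixels marked as syllable in at least one mask
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1

    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Toy 3 x 4 spectrogram masks (rows = frequency bins, columns = time bins).
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
annotated = [
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]

print(f"{mask_iou(predicted, annotated):.4f}")  # 4 shared pixels / 6 total -> 0.6667
```

In a cross-validated evaluation, a per-fold score would typically be the mean of this value over the held-out masks, and the reported figure the mean and spread across folds; the exact aggregation used here is not stated in the abstract.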