Spiking Neural Networks (SNNs) are an important class of brain-inspired computational models oriented toward neuroscience. Because they more closely follow the information-processing mechanisms of biological neural systems, they are regarded as the next generation of neural networks after artificial neural networks. In SNNs, neurons communicate through spike signals, so the spike encoding scheme has a profound impact on model performance. Rate (frequency) coding and temporal coding are the mainstream encoding schemes in current SNN modeling. However, spike signals are discretely distributed in time, and this discrete distribution is further constrained by a finite number of time steps, so the feature-representation resolution of rate coding is usually low; this is also why rate coding requires longer latency to maintain high model performance. Temporal coding, in turn, usually limits the number of spikes per output neuron and typically bases decisions on only the first few spikes to achieve fast responses, thereby under-utilizing spike activity. These shortcomings of current spike encoding schemes restrict the application scope of SNNs: although SNNs achieve high performance on classification tasks, they perform poorly on tasks that demand stronger spike feature representation, such as generation and regression. To address these limitations, this paper proposes Spike Attention Coding (SAC-SNN), which naturally unifies rate coding and temporal coding by introducing a learnable attention coefficient for each time step and flexibly learns the optimal coefficients for better performance. The accompanying training framework further integrates several normalization and regularization techniques, which improve the robustness of network training while controlling the distribution of the learned attention coefficients. SAC-SNN thus unifies different encoding schemes in a single simple form and can flexibly adapt between rate coding and temporal coding according to the demands of the task. Ultimately, it yields an SNN training framework with higher feature-representation resolution, lower inference latency, and higher robustness.
To verify the robustness of SAC-coded SNNs, this paper conducts extensive experiments. SAC not only achieves superior performance on image classification, the task traditionally used to evaluate models in computer vision, but also performs better on image generation within a variational autoencoder framework built with an SNN as the encoder. Furthermore, this paper finds that on some regression tasks, SAC-coded SNNs outperform SNNs with limited-resolution rate coding. To this end, using event-camera data, this paper deploys regression tasks including predicting the three-axis angular velocity of a fixed-position camera and predicting the steering angle during vehicle driving, both achieving better results than previous studies.
In summary, the main contributions of this paper are: a new Spike Attention Coding (SAC) scheme that flexibly unifies rate coding and temporal coding to enhance the representational capacity of SNNs; a corresponding training framework whose normalization and regularization techniques effectively control the range and distribution of the learned attention coefficients; and extensive experiments on classification, generation, and event-camera regression tasks demonstrating that the proposed SAC scheme offers higher feature-representation resolution, lower inference latency, and higher robustness.
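The core idea described above, a learnable attention coefficient per time step that interpolates between rate coding and temporal coding, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact formulation: the module name, the softmax normalization of the coefficients, and the zero initialization are all choices made here for the sketch (softmax is one simple way to keep the coefficients in a controlled range, in the spirit of the normalization techniques the abstract mentions).

```python
import torch
import torch.nn as nn

class SpikeAttentionReadout(nn.Module):
    """Weighted readout over T time steps of a binary spike train.

    Rate coding corresponds to uniform coefficients a_t = 1/T;
    temporal (first-spike-style) coding corresponds to coefficient mass
    concentrated on early time steps. Learning a_t lets the network
    move flexibly between the two regimes.
    """

    def __init__(self, num_steps: int):
        super().__init__()
        # One logit per time step (hypothetical parameterization);
        # softmax keeps the coefficients positive and summing to 1.
        self.logits = nn.Parameter(torch.zeros(num_steps))

    def forward(self, spikes: torch.Tensor) -> torch.Tensor:
        # spikes: (T, batch, features) binary spike tensor
        coeffs = torch.softmax(self.logits, dim=0)        # (T,)
        return torch.einsum("t,tbf->bf", coeffs, spikes)  # (batch, features)

# With zero-initialized logits the readout reduces exactly to rate coding:
T, B, F = 8, 4, 16
readout = SpikeAttentionReadout(T)
spikes = (torch.rand(T, B, F) < 0.3).float()
out = readout(spikes)
# at initialization, out equals the mean firing rate over time
assert torch.allclose(out, spikes.mean(dim=0))
```

Because the coefficients are ordinary trainable parameters, the distribution control described in the abstract could be imposed through the parameterization (as with the softmax here) or through an added regularization term on the coefficients during training.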