Edge deep neural network computing can bring the benefits of low latency and high accuracy to artificial intelligence applications. However, in edge computing scenarios, as users' information and data flow among edge computing devices, user privacy is at risk of being violated. Traditional general-purpose privacy protection methods may incur high latency or accuracy loss, which conflicts with the latency and accuracy requirements of edge intelligence applications. To resolve the problem that privacy strength, computational latency, and model accuracy cannot be reconciled in edge computing scenarios, this thesis proposes deep neural network edge computing methods that address these challenges at the three computational stages of a deep neural network model. 1. Model training: Targeting the leakage of private features during federated learning of deep neural network models in edge computing, this thesis proposes a privacy-feature-preserving edge federated learning mechanism for deep neural networks. The method comprises a three-stage federated learning structure, i.e., a transform-aggregate-reconstruct mechanism, which makes federated aggregation run on feature-free data; it also includes a non-informative transformation algorithm that converts the intermediate updates exchanged by the clients in federated learning into non-informative data. The method ensures that the intermediate updates of federated learning no longer leak private data or private features, without incurring accuracy loss or noticeable additional time overhead. The research results have been successfully applied in the national XXX department, confirming their effectiveness. 2. Model authorization: Targeting intellectual property infringement and privacy leakage during the authorization of pre-trained deep neural network models, this thesis proposes a privacy-preserving edge authorization mechanism for deep neural network parameters, which prevents model theft and data feature leakage. The method introduces a parameter-oriented obfuscation mechanism that eliminates the distributional features of model parameters while still supporting forward-propagation computation of inference samples on the obfuscated model. The method does not affect model accuracy and retains the low-latency property. 3. Model inference: Targeting the privacy leakage of sample data and features during deep neural network inference, this thesis proposes a privacy-preserving edge inference mechanism for deep neural networks. The method provides a pre-analysis framework for network structures that automatically builds the forward-propagation graph, constructs maximal linear subgraphs, and assigns them to dedicated neural network accelerator chips for computation, while nonlinear subgraphs are executed in a trusted execution environment (TEE). The method also proposes a fast algorithm for encrypting the inputs of linear subgraphs: encryption and decryption keys are built from pre-generated meta-keys to achieve fast encryption and decryption, and the resulting ciphertext can be computed directly by heterogeneous neural network accelerator chips, preserving the low latency required by edge computing. These three innovative works cover the training-authorization-inference stages of deep neural network models; they not only protect private features but also meet edge computing's performance requirements for low latency and high accuracy, offering new insights for privacy-sensitive edge artificial intelligence research and applications.
Edge deep neural network computing can bring the benefits of low latency and high accuracy to AI applications. However, edge computing scenarios expose users to the risk of privacy violations as their information and data flow between edge computing devices. Traditional generic privacy protection methods may introduce high latency or accuracy loss, which conflicts with the requirements of edge intelligence applications for computational latency and model accuracy. To solve the problem that privacy strength, computational latency, and model accuracy cannot be reconciled in edge computing scenarios, this thesis proposes new deep neural network edge computing methods that address the above challenges at the three computational stages of deep neural network models. 1. Model training: This thesis proposes a privacy-preserving mechanism against privacy feature leakage in edge federated learning. The method contains a three-stage federated learning structure, i.e., a transform-aggregate-reconstruct mechanism, which makes federated aggregation run on non-informative data. The method also contains a non-informative transformation algorithm, which converts the intermediate updates exchanged by each client in federated learning into non-informative data. The method ensures that the intermediate updates of federated learning no longer disclose private data or private features, without introducing accuracy loss or significant additional time overhead. The research results have been successfully applied in the national XXX department, where their effectiveness has been confirmed. 2. Model authorization: This thesis proposes a privacy-preserving deep neural network parameter authorization mechanism to prevent model theft and data feature leakage in the process of edge authorization of pre-trained models.
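The transform-aggregate-reconstruct flow described above can be illustrated with a minimal sketch. The sketch assumes pairwise additive masking as the non-informative transformation; the thesis's concrete algorithm is not specified here, and all names are illustrative.

```python
# Hypothetical sketch of one transform-aggregate-reconstruct round, assuming
# pairwise additive masking as the non-informative transformation (the actual
# algorithm in the thesis may differ).
import numpy as np

rng = np.random.default_rng(0)
n_clients, dim = 3, 4

def transform(update, masks):
    """Transform: add masks so the update alone carries no information."""
    return update + sum(masks)

def aggregate(masked_updates):
    """Aggregate: the server averages masked (non-informative) updates."""
    return np.mean(masked_updates, axis=0)

# each client's private intermediate update
updates = [rng.normal(size=dim) for _ in range(n_clients)]

# pairwise masks m[(i, j)]: client i adds +m[(i, j)], client j adds -m[(i, j)],
# so all masks cancel in the aggregate (the "reconstruct" step)
m = {(i, j): rng.normal(size=dim)
     for i in range(n_clients) for j in range(i + 1, n_clients)}
masks = [
    [m[(i, j)] for j in range(i + 1, n_clients)] + [-m[(j, i)] for j in range(i)]
    for i in range(n_clients)
]

masked = [transform(u, mk) for u, mk in zip(updates, masks)]
agg = aggregate(masked)  # equals the true mean: masks cancel out
assert np.allclose(agg, np.mean(updates, axis=0))
```

The server only ever sees the masked updates, yet the aggregate it reconstructs is exact, which is consistent with the claim of no accuracy loss.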
The method proposes a model parameter-oriented obfuscation mechanism that eliminates the distributional features of model parameters while supporting forward-propagation computation of inference samples on the obfuscated model. The method does not affect model accuracy and maintains the low-latency property. 3. Model inference: This thesis proposes a privacy-preserving deep neural network inference mechanism to address the privacy leakage of sample data and features during edge inference. The method proposes a pre-analysis framework for network structures that automatically constructs forward-propagation graphs, builds maximal linear subgraphs and assigns them to dedicated neural network accelerator chips for computation, and executes the nonlinear subgraphs in a trusted execution environment (TEE). The method also proposes an algorithm for fast encryption of linear subgraph input data, which constructs the encryption and decryption keys from pre-generated meta-keys. The resulting ciphertext can be computed directly by heterogeneous neural network accelerator chips, ensuring the low latency required for edge computing. These three innovative works cover the training-authorization-inference stages of deep neural network models; they not only protect privacy features but also meet the performance requirements of edge computing for low latency and high accuracy, providing new insights for privacy-sensitive edge AI research and applications.
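The meta-key idea for linear subgraphs can be sketched as follows. The sketch assumes an additive masking scheme in which a pre-generated meta-key serves as the encryption key and its image under the linear map serves as the decryption key; because the subgraph is linear, the untrusted accelerator can operate directly on ciphertext. All names are illustrative and the thesis's concrete scheme may differ.

```python
# Hypothetical sketch: running a linear subgraph directly on ciphertext.
# Assumption: additive masking with a pre-generated meta-key r; W @ r is
# precomputed (e.g., inside the TEE) so online decryption is one subtraction.
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(3, 5))  # weights of the linear subgraph
x = rng.normal(size=5)       # private inference input

# offline phase: pre-generate the meta-key and its image under the linear map
r = rng.normal(size=5)       # meta-key, used as the encryption key
Wr = W @ r                   # decryption key for this subgraph

# online phase
x_enc = x + r                # fast encryption: a single vector addition
y_enc = W @ x_enc            # computed by the untrusted accelerator on ciphertext
y = y_enc - Wr               # fast decryption: a single vector subtraction

# by linearity, W @ (x + r) - W @ r == W @ x
assert np.allclose(y, W @ x)
```

Because both encryption and decryption are single element-wise operations, the per-inference overhead stays small, which matches the low-latency requirement stated above.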