机器学习云服务中的数据安全保护关键技术研究

Enhancing Data Security in Machine-Learning-as-a-Service

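Author: 李贇 (Li Yun)
  • Student ID
    2019******
  • Degree
    Doctoral
  • Email
    liy******.cn
  • Defense date
    2024.05.23
  • Advisor
    张超 (Zhang Chao)
  • Discipline
    Cyberspace Security
  • Pages
    144
  • Confidentiality level
    Public
  • Department
    412 网络研究院 (Network Research Institute)
  • Chinese keywords
    机器学习;云计算;数据安全;安全多方计算;零知识证明
  • English keywords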
    Machine Learning; Cloud Computing; Data Security; Secure Multi-Party Computation; Zero-knowledge Proofs

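Abstract

In recent years, machine learning has developed rapidly, and machine-learning cloud services have emerged in response. However, user data faces security risks while it is stored and processed in the cloud: during data upload, users' private information may be leaked; during result return, the data returned by the cloud server is not necessarily genuine and valid. To eliminate these two risks, data privacy must be protected and computational integrity must be verified. To this end, this thesis studies two key techniques for protecting data privacy and verifying computational integrity: secure multi-party computation and zero-knowledge proofs. The secure multi-party computation protocols used in existing work have high computational overhead and large communication volume, while existing zero-knowledge proof schemes have high initial setup cost, low proving efficiency, and poor flexibility. In view of this, the thesis designs new protocols that improve on them. The main contributions are as follows:

1. The thesis proposes an efficient secure multi-party computation protocol for Boolean circuits, aimed at protecting the privacy of data in the Boolean-computation part of machine-learning cloud services. The protocol converts the binary operations executed on Boolean circuits to a prime field and exploits the algebraic structure of the prime field for further acceleration, reducing computational overhead while achieving sublinear communication complexity. Experimental results show that the protocol runs 3.5 times as fast as prior work with lower communication overhead; when applied to the Boolean-circuit part of neural networks, it reduces the time overhead of achieving security in the malicious model by more than 67%.

2. The thesis proposes an efficient secure multi-party computation protocol for arithmetic circuits over rings, aimed at protecting the privacy of data in the arithmetic-computation part of machine-learning cloud services. The protocol verifies the correctness of multiplications over the original ring in a larger ring, which avoids expensive ring extensions, and uses recursion to progressively shrink the length of the equation to be verified, achieving sublinear communication overhead. Experimental results show that, compared with prior work, the protocol achieves the best communication overhead while raising computation speed to 47 times that of prior work.

3. The thesis proposes an efficient zero-knowledge proof scheme, aimed at verifying computational integrity in machine-learning cloud services. The proof system is based on the Pedersen commitment scheme and is optimized for data-parallel circuits. It greatly simplifies the setup process, improves proving efficiency, and introduces the Commit-and-Prove property, making the whole protocol more flexible and extensible. Experimental results on neural network models show that, compared with prior work, the scheme reduces the parameter size from 420 GB to 1 MB, the setup time from 2 days to 0.03 s, and the proving time from two days to about 6 minutes, a speedup of two orders of magnitude.

In summary, starting from data privacy protection and computational integrity verification in machine-learning cloud services, the thesis designs more efficient protection schemes for data security in these services and provides more reliable technical support for their secure development and deployment.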
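As background for the ring-based arithmetic protocol mentioned above, the following is a minimal sketch of additive secret sharing over the ring Z_{2^64}, a common building block of secure multi-party computation. It is not the protocol from the thesis: it covers only sharing, reconstruction, and the linear operations parties can perform locally, and it provides no protection against malicious parties. The names `share`, `reconstruct`, `add_shares`, and `scale_shares`, the ring size, and the three-party default are all assumptions made for illustration.

```python
import secrets

# Illustrative ring Z_{2^64}; the ring size is an assumption, not taken from the thesis.
K = 64
MOD = 1 << K


def share(x, n=3):
    """Split x into n additive shares that sum to x modulo 2^K."""
    shares = [secrets.randbelow(MOD) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % MOD)
    return shares


def reconstruct(shares):
    """Recover the secret by summing all shares modulo 2^K."""
    return sum(shares) % MOD


def add_shares(a, b):
    """Each party adds its own two shares locally; no communication is needed."""
    return [(ai + bi) % MOD for ai, bi in zip(a, b)]


def scale_shares(a, c):
    """Each party multiplies its share by a public constant c locally."""
    return [(c * ai) % MOD for ai in a]


if __name__ == "__main__":
    x_shares = share(5)
    y_shares = share(7)
    print(reconstruct(add_shares(x_shares, y_shares)))  # sum of the two secrets
    print(reconstruct(scale_shares(x_shares, 3)))       # secret times a public constant
```

Addition and scaling by public constants are free in this representation; multiplying two shared values requires interaction, and checking that such multiplications were done honestly is the step whose cost the abstract says the thesis reduces.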

With machine learning developing rapidly, related cloud services have emerged. However, storing and processing users’ data in cloud servers introduces some security risks: users’ privacy may be leaked, and the returned results may be invalid. To mitigate these problems, it is imperative to protect data privacy and verify computational integrity. To this end, this thesis studies two key techniques: secure multi-party computation and zero-knowledge proofs. In existing research, secure multi-party computation protocols have expensive computational and communication overhead, and zero-knowledge proofs have high setup cost, low prover efficiency, and limited flexibility. This thesis designs new efficient schemes to improve upon them, making the following contributions:

1. An Efficient Secure Multiparty Computation Protocol for Boolean Circuits: The new protocol converts the verification of binary operations onto prime fields, and takes advantage of the special structure of prime fields to further reduce computational overhead. Compared to previous works, it is 3.5 times faster in running time and at the same time achieves lower communication overhead.

2. An Efficient Secure Multiparty Computation Protocol for Arithmetic Circuits: The new protocol uses a larger ring to verify operations on a smaller ring to avoid using expensive ring extensions, and applies the recursive paradigm to achieve sublinear communication complexity. Compared to previous methods, the protocol is 47 times faster with the optimal communication overhead.

3. An Efficient Zero-Knowledge Proof System: The new proof system is based on the Pedersen commitment scheme, and optimized for data-parallel circuits. It greatly simplifies the setup phase, improves prover efficiency, and enhances the flexibility of the whole system. Experiments demonstrate that it can reduce setup time from 2 days to 0.03 s, and proving time from 2 days to 6 minutes.

In summary, this thesis designs efficient solutions for protecting data privacy and verifying computational integrity in machine-learning-as-a-service, providing reliable technical support for its secure development in practice.
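To make the commitment primitive named in the third contribution concrete, here is a minimal sketch of a Pedersen commitment in a toy group. It is not the proof system from the thesis. The parameters are assumptions chosen only so the arithmetic is easy to follow: `P = 2039` is a small safe prime, `Q = 1019` is the order of its subgroup of squares, and `G = 4` and `H = 9` are two elements of that subgroup. A real deployment would use a cryptographically large group and generators whose discrete-log relation is unknown to everyone; with these toy values the commitment is not binding in any meaningful sense.

```python
import secrets

# Toy parameters (assumptions for illustration only, far too small to be secure).
P = 2039   # safe prime, P = 2 * Q + 1
Q = 1019   # prime order of the subgroup of squares modulo P
G = 4      # 2^2 mod P, an element of order Q
H = 9      # 3^2 mod P, an element of order Q


def commit(message, randomness=None):
    """Return (commitment, randomness), where commitment = G^m * H^r mod P."""
    if randomness is None:
        randomness = secrets.randbelow(Q)
    c = (pow(G, message % Q, P) * pow(H, randomness % Q, P)) % P
    return c, randomness


def verify(c, message, randomness):
    """Check that (message, randomness) is a valid opening of commitment c."""
    return c == (pow(G, message % Q, P) * pow(H, randomness % Q, P)) % P


def add_commitments(c1, c2):
    """Multiplying commitments yields a commitment to the sum of the messages."""
    return (c1 * c2) % P


if __name__ == "__main__":
    c1, r1 = commit(5)
    c2, r2 = commit(7)
    combined = add_commitments(c1, c2)
    # The combined commitment opens to 5 + 7 with randomness r1 + r2.
    print(verify(combined, 12, r1 + r2))
```

The additive homomorphism shown by `add_commitments` is the property that makes Pedersen commitments convenient inside proof systems: linear relations between committed values can be checked on the commitments themselves, without opening them.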