Paper 2023/1678
BumbleBee: Secure Two-party Inference Framework for Large Transformers
Abstract
Large transformer-based models have achieved state-of-the-art performance on many real-world tasks such as natural language processing and computer vision. However, with the increasing sensitivity of the data and tasks they handle, privacy has become a major concern during model deployment. In this work, we focus on private inference in two-party settings, where one party holds private inputs and the other holds the model. We introduce BumbleBee, a fast and communication-friendly two-party private transformer inference system. Our contributions are three-fold. First, we present optimized homomorphic encryption-based protocols that enable the multiplication of large matrices with 80–90% less communication cost than existing methods. Second, we offer a general method for designing efficient and accurate protocols for the non-linear activation functions in transformers. Our activation protocols are faster and reduce communication overhead by 80–95% compared with two existing methods. Finally, we conducted extensive benchmarks on several large transformer models. The results show that BumbleBee is more than one order of magnitude faster than Iron (NeurIPS 2022).
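To make the two-party setting concrete, the sketch below shows one generic way a linear layer can be evaluated without either party seeing the other's data: additive secret sharing with a Beaver triple. This is an illustrative assumption, not BumbleBee's actual protocol (the paper uses homomorphic encryption-based matrix multiplication precisely to reduce the communication that sharing-based approaches incur); all names and the toy modulus are hypothetical.

```python
import numpy as np

# Toy sketch (NOT BumbleBee's HE protocol): two-party private matrix
# multiplication X @ W via additive secret sharing and a Beaver triple.
# The client's input X and the server's weights W are hidden inside
# uniformly random shares; only the product is reconstructed.

P = 2**31 - 1  # toy prime modulus for the shared arithmetic
rng = np.random.default_rng(0)

def rand_mat(shape):
    # Object dtype keeps Python big-int arithmetic, avoiding overflow.
    return np.array(rng.integers(0, P, size=shape), dtype=object)

def share(m):
    """Split a matrix into two additive shares modulo P."""
    r = rand_mat(m.shape)
    return r, (m - r) % P

def reconstruct(s0, s1):
    return (s0 + s1) % P

# Offline phase: a dealer produces correlated randomness A, B, C = A @ B.
A = rand_mat((2, 3)); B = rand_mat((3, 2)); C = (A @ B) % P
A0, A1 = share(A); B0, B1 = share(B); C0, C1 = share(C)

# Online phase: client shares its input X, server shares its weights W.
X = np.array([[1, 2, 3], [4, 5, 6]], dtype=object)
W = np.array([[1, 0], [0, 1], [1, 1]], dtype=object)
X0, X1 = share(X); W0, W1 = share(W)

# Each party opens only the masked differences E = X - A and F = W - B,
# which look uniformly random and leak nothing about X or W.
E = reconstruct((X0 - A0) % P, (X1 - A1) % P)
F = reconstruct((W0 - B0) % P, (W1 - B1) % P)

# Local computation of product shares; only party 0 adds the public E @ F.
Z0 = (E @ B0 + A0 @ F + C0 + E @ F) % P
Z1 = (E @ B1 + A1 @ F + C1) % P

assert np.array_equal(reconstruct(Z0, Z1), (X @ W) % P)
```

The communication here is dominated by opening E and F, i.e. one masked copy of each input matrix per multiplication; the paper's HE-based protocols target exactly this matrix-multiplication cost, cutting it by a reported 80–90%.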
Metadata
- Category
- Cryptographic protocols
- Publication info
- Preprint.
- Keywords
- secure neural inference, secure two-party computation, privacy-preserving machine learning
- Contact author(s)
- juhou lwj @ antgroup com
- History
- 2023-10-31: revised
- 2023-10-30: received
- Short URL
- https://ia.cr/2023/1678
- License
- CC BY-NC
BibTeX
@misc{cryptoeprint:2023/1678,
      author = {Wen-jie Lu and Zhicong Huang and Zhen Gu and Jingyu Li and Jian Liu and Kui Ren and Cheng Hong and Tao Wei and WenGuang Chen},
      title = {BumbleBee: Secure Two-party Inference Framework for Large Transformers},
      howpublished = {Cryptology ePrint Archive, Paper 2023/1678},
      year = {2023},
      note = {\url{https://eprint.iacr.org/2023/1678}},
      url = {https://eprint.iacr.org/2023/1678}
}