Paper 2019/466

Privacy-Preserving K-means Clustering with Multiple Data Owners

Jung Hee Cheon, Jinhyuck Jeong, Dohyeong Ki, Jiseung Kim, Joohee Lee, and Seok Won Lee

Abstract

With recent advances in technology, large amounts of data are stored and mined on cloud servers. Since most of the data potentially contain private information, it has become necessary to preserve privacy in data mining. In this paper, we propose a protocol for collaboratively performing the K-means clustering algorithm on data distributed among multiple data owners while protecting their sensitive private data. Our scenario employs two service providers: a main service provider and a key manager. Under the assumptions that the cryptosystems used in our protocol are secure and that the two service providers do not collude, the protocol provides perfect secrecy in the sense that neither the cluster centroids nor the data are leaked to any party, including the two service providers. We also implement the scenario using HEAAN, a recently proposed leveled homomorphic encryption scheme. With our construction, privacy-preserving K-means clustering can be done in less than one minute while maintaining 80-bit security, for 10,000 data points with 8 features and 4 clusters.
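For orientation, the computation that the protocol protects is the standard K-means (Lloyd) iteration. The following is a minimal plaintext sketch of that iteration only; it is not the protocol described in the abstract, it does not use HEAAN, and the names `squared_distance` and `kmeans` are illustrative assumptions. In an encrypted setting, the nearest-centroid comparison and the division in the update step would presumably need dedicated techniques, which this sketch does not attempt to show.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10, seed=0):
    """Plain (unencrypted) Lloyd iteration; illustrative only.

    Returns (centroids, assignments), where assignments come from the
    last assignment step and centroids from the update that followed it.
    """
    rng = random.Random(seed)
    # Initialise centroids with k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    dim = len(points[0])
    assignments = [0] * len(points)

    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]

        # Update step: each centroid becomes the mean of its members.
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        for p, j in zip(points, assignments):
            counts[j] += 1
            for d in range(dim):
                sums[j][d] += p[d]
        for j in range(k):
            if counts[j] > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = [s / counts[j] for s in sums[j]]

    return centroids, assignments


if __name__ == "__main__":
    # Toy data matching the sizes quoted in the abstract:
    # 10,000 points, 8 features, 4 clusters (values are synthetic).
    rng = random.Random(42)
    data = [[rng.gauss(c * 5.0, 1.0) for _ in range(8)]
            for c in range(4) for _ in range(2500)]
    centers, labels = kmeans(data, k=4, iterations=5)
    print(len(centers), len(labels))
```

The sizes in the demo mirror the benchmark setting from the abstract; the data itself is synthetic and says nothing about the timings reported there.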

Note: This work was conducted in the fall of 2017.

Metadata
Available format(s)
-- withdrawn --
Category
Applications
Publication info
Preprint. MINOR revision.
Keywords
K-means Clustering, Clustering, Machine Learning, Privacy-Preserving, Fully Homomorphic Encryption, HEAAN
Contact author(s)
wooki7098 @ snu ac kr
History
2019-05-10: withdrawn
2019-05-10: received
Short URL
https://ia.cr/2019/466
License
Creative Commons Attribution
CC BY