TY - GEN
T1 - Fast, private and verifiable
T2 - 4th ACM International Workshop on Security in Cloud Computing, SCC 2016
AU - Qiu, Shuo
AU - Wang, Boyang
AU - Li, Ming
AU - Victors, Jesse
AU - Liu, Jiqiang
AU - Shi, Yanfeng
AU - Wang, Wei
N1 - Publisher Copyright:
© 2016 ACM.
PY - 2016/5/30
Y1 - 2016/5/30
N2 - Computing similarity, especially Jaccard Similarity, between two datasets is a fundamental building block in big data analytics, and extensive applications including genome matching, plagiarism detection, social networking, etc. The increasing user privacy concerns over the release of has sensitive data have made it desirable and necessary for two users to evaluate Jaccard Similarity over their datasets in a privacy-preserving manner. In this paper, we propose two efficient and secure protocols to compute the Jaccard Similarity of two users' private sets with the help of an unfully-trusted server. Specifically, in order to boost the efficiency, we leverage Minhashing algorithm on encrypted data, where the output of our protocols is guaranteed to be a close approximation of the exact value. In both protocols, only an approximate similarity result is leaked to the server and users. The first protocol is secure against a semi-honest server, while the second protocol, with a novel consistencycheck mechanism, further achieves result verifiability against a malicious server who cheats in the executions. Experimental results show that our first protocol computes an approximate Jaccard Similarity of two billion-element sets within only 6 minutes (under 256-bit security in parallel mode). To the best of our knowledge, our consistency-check mechanism represents the very first work to realize an efficient verification particularly on approximate similarity computation.
AB - Computing similarity, especially Jaccard Similarity, between two datasets is a fundamental building block in big data analytics, and extensive applications including genome matching, plagiarism detection, social networking, etc. The increasing user privacy concerns over the release of has sensitive data have made it desirable and necessary for two users to evaluate Jaccard Similarity over their datasets in a privacy-preserving manner. In this paper, we propose two efficient and secure protocols to compute the Jaccard Similarity of two users' private sets with the help of an unfully-trusted server. Specifically, in order to boost the efficiency, we leverage Minhashing algorithm on encrypted data, where the output of our protocols is guaranteed to be a close approximation of the exact value. In both protocols, only an approximate similarity result is leaked to the server and users. The first protocol is secure against a semi-honest server, while the second protocol, with a novel consistencycheck mechanism, further achieves result verifiability against a malicious server who cheats in the executions. Experimental results show that our first protocol computes an approximate Jaccard Similarity of two billion-element sets within only 6 minutes (under 256-bit security in parallel mode). To the best of our knowledge, our consistency-check mechanism represents the very first work to realize an efficient verification particularly on approximate similarity computation.
UR - http://www.scopus.com/inward/record.url?scp=84978405315&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84978405315&partnerID=8YFLogxK
U2 - 10.1145/2898445.2898453
DO - 10.1145/2898445.2898453
M3 - Conference contribution
AN - SCOPUS:84978405315
T3 - SCC 2016 - Proceedings of the 4th ACM International Workshop on Security in Cloud Computing, Co-located with Asia CCS 2016
SP - 29
EP - 36
BT - SCC 2016 - Proceedings of the 4th ACM International Workshop on Security in Cloud Computing, Co-located with Asia CCS 2016
PB - Association for Computing Machinery, Inc
Y2 - 30 May 2016
ER -