TY - GEN
T1 - Data inference from encrypted databases
T2 - 21st ACM International Symposium on Mobile Ad Hoc Networking and Computing, MobiHoc 2020
AU - Pan, Yanjun
AU - Efrat, Alon
AU - Li, Ming
AU - Wang, Boyang
AU - Quan, Hanyu
AU - Mitchell, Joseph
AU - Gao, Jie
AU - Arkin, Esther
N1 - Funding Information:
Work on this paper by M. Li has been partially supported by the National Science Foundation (CNS-1731164). B. Wang is partially supported by National Science Foundation (CNS-1947913). H. Quan is partially supported by the Scientific Research Funds of Huaqiao University (605-50Y19028). E. Arkin and J. Mitchell are partially supported by the National Science Foundation (CCF-2007275, CCF-1526406). J. Mitchell acknowledges support from the US-Israel Binational Science Foundation (project 2016116), Sandia National Labs, and DARPA (Lagrange). J. Gao acknowledges supports from NSF CNS-1618391, DMS-1737812, OAC-1939459.
Publisher Copyright:
© 2020 ACM.
PY - 2020/10/11
Y1 - 2020/10/11
N2 - Due to increasing concerns of data privacy, databases are being encrypted before they are stored on an untrusted server. To enable search operations on the encrypted data, searchable encryption techniques have been proposed. Representative schemes use order-preserving encryption (OPE) for supporting efficient Boolean queries on encrypted databases. Yet, recent works showed the possibility of inferring plaintext data from OPE-encrypted databases, merely using the order-preserving constraints, or combined with an auxiliary plaintext dataset with similar frequency distribution. So far, the effectiveness of such attacks is limited to single-dimensional dense data (most values from the domain are encrypted), but it remains challenging to achieve it on high-dimensional datasets (e.g., spatial data), which are often sparse in nature. In this paper, for the first time, we study data inference attacks on multi-dimensional encrypted databases (with 2-D as a special case). We formulate it as a 2-D order-preserving matching problem and explore both unweighted and weighted cases, where the former maximizes the number of points matched using only order information and the latter further considers points with similar frequencies. We prove that the problem is NP-hard, and then propose a greedy algorithm, along with a polynomial-time algorithm with approximation guarantees. Experimental results on synthetic and real-world datasets show that the data recovery rate is significantly enhanced compared with the previous 1-D matching algorithm.
AB - Due to increasing concerns of data privacy, databases are being encrypted before they are stored on an untrusted server. To enable search operations on the encrypted data, searchable encryption techniques have been proposed. Representative schemes use order-preserving encryption (OPE) for supporting efficient Boolean queries on encrypted databases. Yet, recent works showed the possibility of inferring plaintext data from OPE-encrypted databases, merely using the order-preserving constraints, or combined with an auxiliary plaintext dataset with similar frequency distribution. So far, the effectiveness of such attacks is limited to single-dimensional dense data (most values from the domain are encrypted), but it remains challenging to achieve it on high-dimensional datasets (e.g., spatial data), which are often sparse in nature. In this paper, for the first time, we study data inference attacks on multi-dimensional encrypted databases (with 2-D as a special case). We formulate it as a 2-D order-preserving matching problem and explore both unweighted and weighted cases, where the former maximizes the number of points matched using only order information and the latter further considers points with similar frequencies. We prove that the problem is NP-hard, and then propose a greedy algorithm, along with a polynomial-time algorithm with approximation guarantees. Experimental results on synthetic and real-world datasets show that the data recovery rate is significantly enhanced compared with the previous 1-D matching algorithm.
KW - data inference
KW - encrypted database
KW - multi-dimensional matching
KW - order-preserving encryption
UR - http://www.scopus.com/inward/record.url?scp=85093965734&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85093965734&partnerID=8YFLogxK
U2 - 10.1145/3397166.3409135
DO - 10.1145/3397166.3409135
M3 - Conference contribution
AN - SCOPUS:85093965734
T3 - Proceedings of the International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc)
SP - 151
EP - 160
BT - MobiHoc 2020 - Proceedings of the 2020 International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing
PB - Association for Computing Machinery
Y2 - 11 October 2020 through 14 October 2020
ER -