Privacy-Preserving Aggregate Mobility Data Release: An Information-Theoretic Deep Reinforcement Learning Approach

Wenjing Zhang, Bo Jiang, Ming Li, Xiaodong Lin

Research output: Contribution to journalArticlepeer-review

1 Scopus citations

Abstract

It is crucial to protect users' location traces against inference attacks on aggregate mobility data collected from multiple users in various real-world applications. Most of the existing works on aggregate mobility data are focusing on inference attacks rather than designing privacy-preserving release mechanisms, and a few differential private release mechanisms suffer from poor utility-privacy tradeoffs. In this paper, we propose optimal centralized privacy-preserving aggregate mobility data release mechanisms (PAMDRMs) that minimize the leakage from an information-theoretic perspective by releasing perturbed versions of the raw aggregate location. Specifically, we use mutual information to measure user-level and aggregate-level privacy leakage separately, and formulate leakage minimization problems under utility constraints. As directly solving the optimization problems incur exponential complexity w.r.t. users' trace length, we transform them into belief state Markov Decision Processes (MDPs), with a focus on the MDP formulation for the user-level privacy problem. We build reinforcement learning (RL) models and leverage the efficient Asynchronous Advantage Actor-Critic RL algorithm to derive the solutions to the MDPs as our optimal PAMDRMs. We compare them with two state-of-the-art privacy protection mechanisms PDPR (context-aware local design) and DMLM (context-free centralized design) in terms of mutual information leakage and adversary's attack success (evaluated by her expected estimation error and Jensen-Shannon Divergence-based error). Extensive experimental results on both synthetic and real-world datasets demonstrate that the user-level PAMDRM performs the best on both measures thanks to its context-aware property and centralized design. Even though the aggregate-level PAMDRM achieves better privacy-utility tradeoff than the other two, it does not always perform better than them on adversarial success, highlighting the necessity of considering privacy measures from different perspectives to avoid overestimating the level of privacy offered to users. Lastly, we discuss an alternative, fully data-driven approach to derive the optimal PAMDRM by leveraging adversarial training on limited data samples.

Original languageEnglish (US)
Pages (from-to)849-864
Number of pages16
JournalIEEE Transactions on Information Forensics and Security
Volume17
DOIs
StatePublished - 2022

Keywords

  • Privacy metrics
  • aggregate mobility data
  • information-theoretic metrics
  • reinforcement learning approach

ASJC Scopus subject areas

  • Safety, Risk, Reliability and Quality
  • Computer Networks and Communications

Fingerprint

Dive into the research topics of 'Privacy-Preserving Aggregate Mobility Data Release: An Information-Theoretic Deep Reinforcement Learning Approach'. Together they form a unique fingerprint.

Cite this