TY - JOUR
T1 - Differentially Private Community Detection for Stochastic Block Models
AU - Seif, Mohamed
AU - Nguyen, Dung
AU - Vullikanti, Anil
AU - Tandon, Ravi
N1 - Funding Information:
We thank the anonymous ICML reviewers for their insightful suggestions. The work of M. Seif and R. Tandon was supported by NSF grants CAREER 1651492, CNS 1715947, CCF 2100013 and CNS 2209951. The work of D. Nguyen and A. Vullikanti was partially supported by NSF grants CCF-1918656, IIS-1931628, IIS-1955797, and NIH grant R01GM109718.
Publisher Copyright:
Copyright © 2022 by the author(s)
PY - 2022
Y1 - 2022
N2 - The goal of community detection over graphs is to recover underlying labels/attributes of users (e.g., political affiliation) given the connectivity between users. There has been significant recent progress on understanding the fundamental limits of community detection when the graph is generated from a stochastic block model (SBM). Specifically, sharp information theoretic limits and efficient algorithms have been obtained for SBMs as a function of p and q, which represent the intra-community and inter-community connection probabilities. In this paper, we study the community detection problem while preserving the privacy of the individual connections between the vertices. Focusing on the notion of (ϵ, δ)-edge differential privacy (DP), we seek to understand the fundamental tradeoffs between (p, q), DP budget (ϵ, δ), and computational efficiency for exact recovery of community labels. To this end, we present and analyze the associated information-theoretic tradeoffs for three differentially private community recovery mechanisms: a) stability based mechanism; b) sampling based mechanisms; and c) graph perturbation mechanisms. Our main findings are that stability and sampling based mechanisms lead to a superior tradeoff between (p, q) and the privacy budget (ϵ, δ); however this comes at the expense of higher computational complexity. On the other hand, albeit low complexity, graph perturbation mechanisms require the privacy budget ϵ to scale as Ω(log(n)) for exact recovery.
AB - The goal of community detection over graphs is to recover underlying labels/attributes of users (e.g., political affiliation) given the connectivity between users. There has been significant recent progress on understanding the fundamental limits of community detection when the graph is generated from a stochastic block model (SBM). Specifically, sharp information theoretic limits and efficient algorithms have been obtained for SBMs as a function of p and q, which represent the intra-community and inter-community connection probabilities. In this paper, we study the community detection problem while preserving the privacy of the individual connections between the vertices. Focusing on the notion of (ϵ, δ)-edge differential privacy (DP), we seek to understand the fundamental tradeoffs between (p, q), DP budget (ϵ, δ), and computational efficiency for exact recovery of community labels. To this end, we present and analyze the associated information-theoretic tradeoffs for three differentially private community recovery mechanisms: a) stability based mechanism; b) sampling based mechanisms; and c) graph perturbation mechanisms. Our main findings are that stability and sampling based mechanisms lead to a superior tradeoff between (p, q) and the privacy budget (ϵ, δ); however this comes at the expense of higher computational complexity. On the other hand, albeit low complexity, graph perturbation mechanisms require the privacy budget ϵ to scale as Ω(log(n)) for exact recovery.
UR - http://www.scopus.com/inward/record.url?scp=85146907945&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85146907945&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85146907945
SN - 2640-3498
VL - 162
SP - 15858
EP - 15894
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 39th International Conference on Machine Learning, ICML 2022
Y2 - 17 July 2022 through 23 July 2022
ER -