On Rényi Differential Privacy in Statistics-based Synthetic Data Generation


Abstract

Privacy protection based on synthetic data generation often relies on differentially private statistics and model parameters to express its theoretical security quantitatively. However, these methods do not take into account the privacy protection provided by the randomness of data generation itself. In this paper, we theoretically evaluate the Rényi differential privacy provided by that randomness for a synthetic data generation method that uses the mean vector and the covariance matrix of an original dataset. Specifically, for a fixed α > 1, we give the condition on ε under which the synthetic data generation satisfies (α, ε)-Rényi differential privacy, under a bounded neighboring condition and an unbounded neighboring condition, respectively. In particular, under the unbounded condition, when the original dataset and the synthetic dataset each have size 10 million, the mechanism satisfies (4, 0.576)-Rényi differential privacy. We also show that, when translated into traditional (ε, δ)-differential privacy, the mechanism satisfies (4.46, 10^{-14})-differential privacy.

This is a preprint of an article intended for publication in the Journal of Information Processing (JIP). This preprint should not be cited. This article should be cited as: Journal of Information Processing Vol.31 (2023) (online) DOI http://dx.doi.org/10.2197/ipsjjip.31.812
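As an illustration of the kind of mechanism the abstract describes, the following is a minimal sketch and not the authors' implementation. It assumes that synthetic records are drawn from a multivariate normal distribution parameterized by the sample mean vector and sample covariance matrix of the original dataset; the abstract does not state the sampling distribution, so that choice, the function name `generate_synthetic`, the use of the unbiased covariance estimator, and the toy data are all assumptions made for illustration.

```python
from typing import Optional

import numpy as np


def generate_synthetic(original: np.ndarray, n_synthetic: int,
                       seed: Optional[int] = None) -> np.ndarray:
    """Draw synthetic rows using only the mean vector and covariance matrix.

    Assumption: rows are sampled from a multivariate normal distribution.
    The abstract only says the method uses these two statistics.
    """
    rng = np.random.default_rng(seed)
    mean = original.mean(axis=0)          # mean vector, shape (d,)
    cov = np.cov(original, rowvar=False)  # covariance matrix, shape (d, d), unbiased estimator
    # No noise is added to the statistics; the only randomness is the sampling step.
    return rng.multivariate_normal(mean, cov, size=n_synthetic)


# Toy original dataset (hypothetical): 5000 records with 3 attributes.
data_rng = np.random.default_rng(0)
true_cov = np.array([[1.0, 0.3, 0.0],
                     [0.3, 2.0, 0.1],
                     [0.0, 0.1, 0.5]])
original = data_rng.multivariate_normal([1.0, -2.0, 0.5], true_cov, size=5000)

synthetic = generate_synthetic(original, n_synthetic=2000, seed=1)
print(synthetic.shape)
```

In this sketch the statistics themselves are not perturbed, which matches the abstract's focus on the privacy contributed by the randomness of generation alone. The sketch does not compute or verify any (α, ε) guarantee.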

