AIWiki
Malaysia

Differential Privacy

Differential privacy is a mathematical framework for data analysis which guarantees that the output of a computation reveals little about any single individual. It achieves this by adding calibrated random noise that limits each record's influence.

5 min read | Last updated June 2026 | Foundations

Differential privacy is a mathematical framework that provides a formal, provable guarantee about how much a data analysis can reveal about any single individual. Introduced in 2006 by Cynthia Dwork and colleagues, it has become the leading standard for privacy-preserving data analysis and is increasingly applied to machine learning. The core idea is that the result of a computation should be essentially the same whether or not any one person's data is included, so that an observer cannot confidently determine whether a specific individual contributed to the dataset.

This guarantee is achieved by adding carefully calibrated random noise to a computation: to the data, to intermediate values, or to the final output. The noise is scaled to the computation's sensitivity, that is, the most that any single record can change the result, so that each record's influence is bounded. The strength of the guarantee is controlled by a parameter called epsilon (ε), often described as the privacy budget. A smaller epsilon means stronger privacy but more noise and therefore lower accuracy, while a larger epsilon allows more accurate results at the cost of weaker privacy. Choosing an appropriate value is the central trade-off in any deployment.
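To make the role of epsilon concrete, the sketch below applies the Laplace mechanism described by Dwork et al. (2006) to a simple counting query. In the standard formulation, a randomised mechanism M is ε-differentially private if, for any two datasets D and D′ that differ in one record and any set of outputs S, Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D′) ∈ S]. The function names, the sample `ages` list and the threshold of 40 are illustrative assumptions rather than part of any particular library, and the plain floating-point sampling used here is not hardened for production use.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records matching `predicate`.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is sensitivity / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical data: how many people are aged 40 or over? (true answer: 4)
rng = random.Random(0)
ages = [23, 35, 41, 29, 52, 38, 61, 47]
for eps in (0.1, 1.0, 10.0):
    noisy = private_count(ages, lambda age: age >= 40, eps, rng)
    print(f"epsilon={eps}: noisy count = {noisy:.2f}")
```

Running the loop several times with different seeds shows the trade-off described above: answers at epsilon 0.1 scatter widely around the true count, while answers at epsilon 10 stay close to it.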

Why it matters for machine learning

Machine-learning models can unintentionally memorise details of their training data, which exposes them to several kinds of privacy attack. Membership inference attacks attempt to determine whether a particular person's record was used in training. Attribute inference attacks try to learn sensitive attributes of individuals in the training set. Model inversion attacks attempt to reconstruct portions of the training data from the model itself. Differential privacy is designed to defend against these attacks: because no single training example can substantially change the distribution of learned parameters, a model trained with differential privacy limits how much any of them can reveal about one individual.
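One way to read this protection numerically is as a cap on how far an attacker's belief about membership can move. The sketch below assumes pure ε-differential privacy (no additional delta term) and a worst-case attacker who already knows every other record; under those assumptions the attacker's odds that a person is in the training set can grow by at most a factor of e^ε after seeing the model. The function name is hypothetical and chosen for illustration.

```python
import math


def max_membership_posterior(prior: float, epsilon: float) -> float:
    """Upper bound on an attacker's posterior belief that a record was used.

    Under pure epsilon-DP the likelihood ratio between "record included" and
    "record excluded" is at most e^epsilon, so Bayes' rule gives
    posterior <= prior * e^eps / (prior * e^eps + (1 - prior)).
    """
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must be a probability")
    if epsilon < 0:
        raise ValueError("epsilon must be non-negative")
    boosted = prior * math.exp(epsilon)
    return boosted / (boosted + (1.0 - prior))


for eps in (0.1, 1.0, 5.0):
    bound = max_membership_posterior(0.5, eps)
    print(f"epsilon={eps}: belief can rise from 0.50 to at most {bound:.3f}")
```

At an epsilon of 0 the bound equals the prior, meaning the model reveals nothing about membership; as epsilon grows the bound approaches 1 and the formal protection becomes weak.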

DP-SGD and practical methods

The most widely used technique for training neural networks with privacy guarantees is differentially private stochastic gradient descent, or DP-SGD, popularised by Abadi et al. (2016). It modifies ordinary training in two ways: it clips the gradient computed from each example to a maximum norm so that no single record can have an outsized effect, and it adds random noise, typically Gaussian, to the aggregated gradients before updating the model. Over the course of training, the cumulative privacy loss is tracked using privacy-accounting methods that compose the small losses from each step into an overall budget. Some studies, particularly in medical imaging, report that DP-SGD can retain clinically acceptable accuracy under moderate privacy budgets, though hyperparameter tuning remains challenging.
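The two modifications can be sketched as a single update step. This is a minimal pure-Python illustration in the spirit of Abadi et al. (2016), not a production implementation: the names `clip_gradient`, `dp_sgd_step`, `max_norm` and `noise_multiplier` are assumptions made for this example, gradients are plain lists of floats, and no privacy accounting is performed. Real projects would normally rely on a maintained library such as Opacus or TensorFlow Privacy.

```python
import math
import random


def clip_gradient(grad, max_norm: float):
    """Scale a gradient down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * factor for g in grad]


def dp_sgd_step(params, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One DP-SGD update: clip each example, sum, add Gaussian noise, average."""
    if not per_example_grads:
        raise ValueError("batch must contain at least one example")
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    summed = [sum(column) for column in zip(*clipped)]
    # Noise standard deviation is proportional to the clipping norm.
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    batch_size = len(per_example_grads)
    return [p - lr * n / batch_size for p, n in zip(params, noisy)]


# Hypothetical two-parameter model and a batch of three per-example gradients.
rng = random.Random(0)
params = [0.5, -0.2]
grads = [[3.0, 4.0], [0.3, 0.4], [-1.0, 2.0]]
params = dp_sgd_step(params, grads, lr=0.1, max_norm=1.0,
                     noise_multiplier=1.1, rng=rng)
print(params)
```

Clipping is what gives the noise something fixed to be calibrated against: once every example's gradient has norm at most `max_norm`, the sum can change by at most that amount when one example is added or removed.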

Deployment in practice

Differential privacy is no longer purely theoretical. Apple uses it to collect usage statistics across devices, Google applies it in products and has released open tooling, and the United States Census Bureau adopted it to protect respondents in the 2020 census. It is also commonly combined with federated learning, where models are trained across many devices without centralising raw data, to provide a stronger overall privacy posture.

| Concept | Meaning |
|---------|---------|
| Epsilon | Privacy budget; lower is more private |
| DP-SGD | Private training via gradient clipping and noise |
| Privacy accounting | Tracking cumulative privacy loss |
| Local vs central DP | Noise added on-device vs by a trusted curator |
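The local model in the last row of the table can be illustrated with randomised response, a classic technique in which each device perturbs its own yes/no answer before sending it, so the collector never sees the raw value. The sketch below is a textbook construction under assumed names (`randomized_response`, `estimate_true_rate`) and simulated data; it is not a description of how Apple, Google or any other organisation implements its systems.

```python
import math
import random


def truth_probability(epsilon: float) -> float:
    """Probability of reporting the true bit: e^eps / (e^eps + 1)."""
    return 1.0 / (1.0 + math.exp(-epsilon))


def randomized_response(truth: bool, epsilon: float, rng: random.Random) -> bool:
    """Report the true bit with probability p, otherwise report its opposite.

    The ratio p / (1 - p) equals e^eps, which gives epsilon-local DP.
    """
    return truth if rng.random() < truth_probability(epsilon) else not truth


def estimate_true_rate(reports, epsilon: float) -> float:
    """Debias the observed 'yes' rate to estimate the true population rate."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive to invert the noise")
    p = truth_probability(epsilon)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


# Simulated population in which 30% of users hold the sensitive attribute.
rng = random.Random(0)
truths = [i % 10 < 3 for i in range(20000)]
reports = [randomized_response(t, epsilon=1.0, rng=rng) for t in truths]
print(f"estimated rate: {estimate_true_rate(reports, 1.0):.3f}")
```

Because every individual report is noisy, the local model generally needs many more participants than the central model to reach the same accuracy, which is the price of not requiring a trusted curator.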

References

  1. Dwork, C., McSherry, F., Nissim, K., & Smith, A. (2006). Calibrating Noise to Sensitivity in Private Data Analysis. TCC.
  2. Abadi, M., et al. (2016). Deep Learning with Differential Privacy. ACM CCS.
  3. arXiv. (2025). Differential Privacy in Machine Learning: A Survey from Symbolic AI to LLMs. arxiv.org/abs/2506.11687.
  4. Department of Personal Data Protection Malaysia. (2024). Personal Data Protection Act 2010 and 2024 Amendments. pdp.gov.my.