
Differential privacy

Differential privacy is a constraint on the algorithms used to publish aggregate information about a statistical database, limiting the disclosure of private information about the individuals whose records are in the database. For example, differentially private algorithms are used by some government agencies to publish demographic information or other statistical aggregates while ensuring the confidentiality of survey responses, and by companies to collect information about user behavior while controlling what is visible even to internal analysts.

Roughly, an algorithm is differentially private if an observer seeing its output cannot tell whether a particular individual's information was used in the computation. Informally, for a randomized algorithm A, any two databases D_1 and D_2 that differ in a single record, and any set S of possible outputs:

$$\Pr[A(D_1) \in S] \leq e^{\epsilon} \times \Pr[A(D_2) \in S]$$

Here ε is the privacy parameter: the smaller it is, the less the output distribution can depend on any one record.

Differential privacy is often discussed in the context of identifying individuals whose information may be in a database. Although it does not directly refer to identification and reidentification attacks, differentially private algorithms provably resist such attacks. Differential privacy was developed by cryptographers; it is therefore often associated with cryptography and draws much of its language from that field.
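The bound on output probabilities can be illustrated with a small sketch. The article does not name a specific mechanism, so the example below assumes the Laplace mechanism applied to a counting query, a standard construction chosen here only for illustration. The function names (`private_count`, `max_density_ratio`) and the toy records are hypothetical. Adding or removing one record changes a count by at most 1, so Laplace noise with scale 1/ε should keep the ratio of output densities for two neighboring databases within e^ε.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    # A counting query has sensitivity 1: one record changes it by at most 1,
    # so the noise scale is 1 / epsilon (illustrative choice, see lead-in).
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


def laplace_pdf(x, mean, scale):
    return math.exp(-abs(x - mean) / scale) / (2.0 * scale)


def max_density_ratio(count1, count2, epsilon, outputs):
    # Largest ratio of output densities over a grid of candidate outputs,
    # for two databases whose true counts are count1 and count2.
    scale = 1.0 / epsilon
    return max(
        laplace_pdf(x, count1, scale) / laplace_pdf(x, count2, scale)
        for x in outputs
    )


if __name__ == "__main__":
    rng = random.Random(0)
    epsilon = 0.5
    # Hypothetical neighboring databases: d2 is d1 without its last record.
    d1 = [{"age": 34}, {"age": 71}, {"age": 66}]
    d2 = d1[:-1]

    def is_senior(record):
        return record["age"] >= 65

    c1 = sum(1 for r in d1 if is_senior(r))
    c2 = sum(1 for r in d2 if is_senior(r))
    grid = [x / 10.0 for x in range(-200, 201)]
    print("noisy count on d1:", private_count(d1, is_senior, epsilon, rng))
    print("max density ratio:", max_density_ratio(c1, c2, epsilon, grid))
    print("bound e^epsilon:  ", math.exp(epsilon))
```

The density ratio is checked on a finite grid only, so this is a numerical sanity check rather than a proof. The bound for sets S of outputs follows by integrating the pointwise density bound over S.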
Official statistics organizations are charged with collecting information from individuals or establishments and publishing aggregate data to serve the public interest. For example, the 1790 United States Census collected information about individuals living in the United States and published tabulations based on sex, age, race, and condition of servitude. Statistical organizations have long collected information under a promise of confidentiality that the information provided will be used for statistical purposes, but that the publications will not produce information that can be traced back to a specific individual or establishment. To accomplish this goal, statistical organizations have long suppressed information in their publications. For example, in a table presenting the sales of each business in a town grouped by business category, a cell that has information from only one company might be suppressed, in order to maintain the confidentiality of that company's specific sales.

[ "Algorithm", "Computer security", "Statistics", "Data mining", "database privacy" ]