1. Privacy-preserving machine learning

Abbreviation: PPML (Privacy-Preserving Machine Learning)


2. Privacy-preserving machine learning and secure machine learning

In secure machine learning, the adversary is assumed to violate the integrity and availability of the machine learning system.

In PPML, the adversary is assumed to violate the privacy and confidentiality of the machine learning system.

1. Integrity. Attacks on integrity may cause detection errors in a machine learning system.

2. Availability. Attacks on availability may cause classification errors.

3. Confidentiality. Attacks on confidentiality may leak sensitive information held by the machine learning system, such as the training data or the trained model.

Table. Comparison between PPML and secure machine learning

                          Security threat    Attack mode                    Defense method
PPML                      Privacy            Reconstruction attack          Secure multi-party computation
                          Confidentiality    Model inversion attack         Homomorphic encryption
                                             Membership inference attack    Differential privacy
Secure machine learning   Integrity          Poisoning attack               Defensive distillation
                          Availability       Adversarial attack             Adversarial training
                                             Oracle (query) attack


3. Threat and security models

In a machine learning task, participants usually play three different roles:

Input party, for example the original owner of the data.

Computation party, such as the model builder and the inference service provider.

Result party, such as the model querier and the user.


Attacks on machine learning systems can occur at any stage, including data publishing, model training, and model inference.

An attack in the model training phase is called a reconstruction attack (Reconstruction Attacks).

In the model inference phase, an adversarial result party may use reverse engineering techniques to obtain additional information about the model, in order to carry out a model inversion attack (Model Inversion Attacks) or a membership inference attack (Membership-Inference Attacks).

Attribute inference attacks (Attribute-Inference Attacks) occur in the data publishing phase.

(1) Reconstruction attack

The adversary's goal is to extract the training data, or the feature vectors of the training data, during model training. During training, secure multi-party computation and homomorphic encryption can be used to protect intermediate results against reconstruction attacks. During model inference, the computation party should only be granted black-box access to the model.
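As a rough illustration of how secure multi-party computation can hide intermediate results, the sketch below uses additive secret sharing: each party splits its private value into random shares, and only the sum of all values is ever reconstructed. The modulus, the function names, and the honest message exchange are assumptions made for this example; it handles only non-negative integers, so real gradients would first need a fixed-point encoding.

```python
import secrets

PRIME = 2**61 - 1  # illustrative field modulus (an assumption, not from the text)

def share(value, n_parties):
    """Split a non-negative integer into n_parties additive shares modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]

def reconstruct(shares):
    """Recombine additive shares into the shared value."""
    return sum(shares) % PRIME

def secure_sum(private_values):
    """Compute the sum of all private values without revealing any single one."""
    n = len(private_values)
    # Party i splits its value and sends share j to party j.
    all_shares = [share(v, n) for v in private_values]
    # Party j adds up the shares it received; each share alone looks random.
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % PRIME
                    for j in range(n)]
    # Only the partial sums are published and combined.
    return reconstruct(partial_sums)
```

Any subset of fewer than all shares of a value is uniformly random, which is why an individual intermediate result stays hidden in this toy setting.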

(2) Model inversion attack

The adversary's goal is to extract training data or feature vectors from the model. An adversary with only black-box access can also attack by equation solving, reconstructing the plaintext content of the model from its responses. In theory, for an N-dimensional linear model, an adversary can steal the entire model with N+1 queries. The adversary can also use query-response pairs to train a substitute model that behaves like the original.
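The N+1 query bound can be sketched as follows, assuming the model is exactly f(x) = w·x + b and the API returns the raw, unrounded value. The helper names (make_black_box, extract_linear_model) are hypothetical and exist only for this illustration.

```python
def make_black_box(weights, bias):
    """Hypothetical prediction API that hides the weights and bias."""
    def predict(x):
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return predict

def extract_linear_model(predict, n_features):
    """Recover (weights, bias) of f(x) = w.x + b with n_features + 1 queries."""
    zero = [0.0] * n_features
    bias = predict(zero)              # query 1: f(0) = b
    weights = []
    for i in range(n_features):       # queries 2..N+1: f(e_i) = w_i + b
        unit = zero.copy()
        unit[i] = 1.0
        weights.append(predict(unit) - bias)
    return weights, bias
```

Querying the zero vector and the N unit vectors is one convenient choice; any N+1 affinely independent inputs would give a solvable system of linear equations.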

Several strategies can reduce the success rate of model inversion attacks. For example, the service can return only rounded prediction values, return only the predicted class label, or return the aggregated prediction for multiple test samples. Bayesian neural networks combined with homomorphic encryption have also been used to resist such attacks through secure neural network inference.
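The three output-limiting strategies can be sketched as small wrappers around a model's probability output. The function names and the default precision are assumptions for this example, and the sketch does not say how much protection each option gives in practice.

```python
def rounded_output(probabilities, decimals=1):
    """Return probabilities rounded to a coarse precision."""
    return [round(p, decimals) for p in probabilities]

def label_only_output(probabilities):
    """Return only the index of the most likely class."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])

def aggregated_output(batch_probabilities):
    """Return the per-class mean over a batch instead of per-sample results."""
    n_samples = len(batch_probabilities)
    n_classes = len(batch_probabilities[0])
    return [sum(row[c] for row in batch_probabilities) / n_samples
            for c in range(n_classes)]
```

Each wrapper reduces the information in a single response, so an equation-solving adversary gets less precise equations or needs more queries.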

(3) Membership inference attack

The adversary's goal is to determine whether the model's training set contains a specific sample. The adversary makes this judgment from the outputs of the machine learning model.
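A minimal sketch of the idea, assuming a deliberately overfit toy model whose confidence is highest exactly on its training points. The model, the one-dimensional data, and the 0.9 threshold are all assumptions for illustration; published attacks are more elaborate.

```python
def make_memorizing_model(train_points):
    """Toy overfit model: confidence is 1 / (1 + distance to nearest training point)."""
    def confidence(x):
        nearest = min(abs(x - t) for t in train_points)
        return 1.0 / (1.0 + nearest)
    return confidence

def infer_membership(confidence, sample, threshold=0.9):
    """Guess that the sample was a training member if confidence is high enough."""
    return confidence(sample) >= threshold
```

The attack only needs the model's outputs, which matches the black-box setting described above; it works here because the toy model behaves differently on data it has memorized.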

(4) Attribute inference attack

The adversary maliciously tries to de-anonymize the data or to target the owner of a record. Removing users' personally identifiable information (also called sensitive features) before the data is published, in order to achieve anonymity, is a common way to protect user privacy. To counter attribute inference attacks, some literature proposes group-based anonymization methods, which protect privacy through generalization and suppression mechanisms.
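Generalization and suppression can be sketched with a k-anonymity-style procedure, chosen here as one familiar instance of group-based anonymization. The record fields (age, zip, diagnosis), the 10-year age bands, and the 3-digit zip prefix are hypothetical choices for this example.

```python
from collections import Counter

def generalize(record):
    """Coarsen quasi-identifiers and drop every field not listed here."""
    low = (record["age"] // 10) * 10
    return {
        "age": f"{low}-{low + 9}",          # exact age -> 10-year band
        "zip": record["zip"][:3] + "**",    # full zip -> 3-digit prefix
        "diagnosis": record["diagnosis"],   # attribute kept for analysis
    }

def k_anonymize(records, k):
    """Generalize records, then suppress groups with fewer than k members."""
    generalized = [generalize(r) for r in records]
    group_sizes = Counter((r["age"], r["zip"]) for r in generalized)
    return [r for r in generalized
            if group_sizes[(r["age"], r["zip"])] >= k]
```

After this step every published record shares its (age band, zip prefix) pair with at least k-1 others, so those two fields alone no longer single out one person.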


Adversary and security models

For cryptographic PPML techniques, including secure multi-party computation and homomorphic encryption, existing research considers two types of adversary:

(1) Semi-honest (Semi-honest) adversary: the adversary follows the protocol honestly, but also tries to learn information beyond the output from the messages it receives.

(2) Malicious (Malicious) adversary: the adversary does not follow the protocol and may perform arbitrary attacks.

Most PPML research considers the semi-honest adversary model. The main reason is that in federated learning (FL), following the protocol honestly is in every party's interest, and malicious behavior would also harm the adversary's own interests. Another reason is that the standard approach in cryptography is to first build a protocol that is secure against semi-honest adversaries and then strengthen it with zero-knowledge proofs so that it also withstands malicious adversaries.

In each security model, the adversary attacks and corrupts a subset of the participants, and the corrupted parties may collude with each other. Corruption can be static or adaptive. The adversary's computational power can be polynomial-time or unbounded, corresponding to computational security and information-theoretic security respectively. Security in cryptography is defined in terms of indistinguishability.

