Common Component in Black-Boxes Is Prone to Attacks

2021 
Neural network models are becoming increasingly complex. Large models are often modular, consisting of multiple separate, sharable components. Developing such components may require specific domain knowledge, intensive computation, and large datasets, so companies have a strong incentive to keep them proprietary. However, when a common component is included in multiple black-box models, it can provide an additional attack vector and weaken security. In this paper, we present a method that “extracts” the common component from black-box models using only limited resources. With a small number of data samples, an attacker can (1) obtain accurate information about the shared component, stealing proprietary intellectual property, and (2) use this component to train new tasks or to mount subsequent attacks, such as model cloning, class inversion, and adversarial attacks, more effectively. Comprehensive experiments demonstrate that our method successfully extracts the common component with hard-label, black-box access only. Moreover, the subsequent attacks remain effective against straightforward defenses that introduce noise and dummy classifiers.
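To make the threat model concrete, the PyTorch sketch below illustrates one way an attacker might distill a substitute for a component shared across several hard-label black boxes: a single encoder plus one lightweight head per victim is trained so that each head reproduces that victim's hard labels. This is a minimal illustration under our own assumptions, not the paper's exact procedure; names such as `query_victims` and `SubstituteEncoder` are hypothetical placeholders.

```python
# Minimal sketch (illustrative only): distilling a substitute for a shared
# component using hard-label, black-box queries to several victim models.
import torch
import torch.nn as nn
import torch.optim as optim

class SubstituteEncoder(nn.Module):
    """Stand-in for the proprietary shared component the attacker reconstructs."""
    def __init__(self, in_dim=3 * 32 * 32, feat_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(), nn.Linear(in_dim, 512), nn.ReLU(),
            nn.Linear(512, feat_dim), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

def train_substitute(query_victims, data_loader, num_classes_per_victim,
                     feat_dim=128, epochs=10, lr=1e-3, device="cpu"):
    """query_victims[i](x) returns hard labels (LongTensor) from the i-th black box.
    A shared encoder with one small head per victim forces the encoder to carry
    whatever information all victims' hard-label outputs have in common."""
    encoder = SubstituteEncoder(feat_dim=feat_dim).to(device)
    heads = nn.ModuleList(
        [nn.Linear(feat_dim, c) for c in num_classes_per_victim]).to(device)
    opt = optim.Adam(list(encoder.parameters()) + list(heads.parameters()), lr=lr)
    ce = nn.CrossEntropyLoss()

    for _ in range(epochs):
        for x in data_loader:                 # small attacker-owned query set
            x = x.to(device)
            feats = encoder(x)
            loss = 0.0
            for head, query in zip(heads, query_victims):
                with torch.no_grad():
                    hard_labels = query(x)    # black-box, hard-label query
                loss = loss + ce(head(feats), hard_labels.to(device))
            opt.zero_grad()
            loss.backward()
            opt.step()

    return encoder  # reusable substitute for the shared component
```

The returned encoder could then serve as a feature extractor for new tasks or as a starting point for model cloning, class inversion, or adversarial-example generation, which is the reuse scenario the abstract describes.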