Semester | Winter 2021 |
Course type | Block Seminar |
Lecturer | TT.-Prof. Dr. Wressnegger |
Audience | Computer Science (Informatik) Master & Bachelor |
Credits | 4 ECTS |
Room | 148, Building 50.34 and online |
Language | English or German |
Link | https://ilias.studium.kit.edu/goto_produktiv_crs_1600112.html |
Registration | Please register for the course in ILIAS |
Due to the ongoing COVID-19 pandemic, this course will start off remotely, meaning the kick-off meeting will take place online. The final colloquium, however, will hopefully be an in-person meeting again (this time we might indeed have a chance).
To receive all the necessary information, please subscribe to the mailing list here.
This seminar is concerned with different aspects of adversarial machine learning. Besides the use of machine learning for security, the security of machine learning algorithms themselves is also essential in practice. For a long time, machine learning did not consider the worst-case scenarios and corner cases that adversaries exploit today.
The module introduces students to the highly active field of attacks against machine learning and teaches them to work through results from recent research. To this end, the students will read up on a sub-field, prepare a seminar report, and present their work to their fellow students at the end of the term.
Topics include but are not limited to adversarial examples, model stealing, membership inference, poisoning attacks, and defenses against such threats.
Date | Step |
Tue, 19. Oct, 10:00–11:30 | Primer on academic writing, assignment of topics |
Thu, 28. Oct | Arrange appointment with assistant |
Mon, 1. Nov – Fri, 5. Nov | Individual meetings with assistant |
Wed, 1. Dec | Submit final paper |
Wed, 22. Dec | Submit review for fellow students |
Fri, 7. Jan | End of discussion phase |
Fri, 21. Jan | Submit camera-ready version of your paper |
Fri, 11. Feb | Presentation at final colloquium |
News about the seminar, potential updates to the schedule, and additional material are distributed using a separate mailing list. Moreover, the list enables students to discuss topics of the seminar.
You can subscribe here.
Every student may choose one of the following topics. For each of these, we additionally provide a few recent top-tier publications that you should use as a starting point for your own research. For the seminar and your final report, you should not merely summarize these papers, but try to go beyond them and arrive at your own conclusions.
Moreover, most of these papers come with open-source implementations. Play around with these and include the lessons learned in your report.
In a black-box setting (the attacker has no access to the ML model), the adversary usually learns a substitute/surrogate model to craft adversarial examples. However, the lack of authentic training data degrades the effectiveness of such attacks. Recently, a novel attack has been presented that works without any real data at all. This seminar topic aims to investigate the possibility of such data-free adversarial attacks.
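As a point of reference for the data-free setting, the sketch below shows the conventional surrogate-based transfer attack (FGSM crafted on a local surrogate, then sent to the remote model). `surrogate` and `black_box_predict` are hypothetical placeholders and inputs are assumed to lie in [0, 1]; this is not the data-free attack itself.

```python
import torch
import torch.nn.functional as F

def fgsm_on_surrogate(surrogate, x, y, eps=0.03):
    """Craft an adversarial example on the (white-box) surrogate model."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(surrogate(x_adv), y)
    loss.backward()
    # Step in the direction that increases the surrogate's loss.
    x_adv = x_adv + eps * x_adv.grad.sign()
    return x_adv.clamp(0, 1).detach()  # assumes inputs normalized to [0, 1]

# Transfer step: the crafted example is sent to the black-box model in the
# hope that the perturbation transfers. `black_box_predict` stands in for
# the remote API and is not defined here.
# y_remote = black_box_predict(fgsm_on_surrogate(surrogate, x, y))
```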
Adversarial training increases a model's robustness by introducing adversarial examples into the training procedure. Recent research suggests that the effectiveness of adversarial examples against a model is linked to its linearity; increasing non-linearity may therefore help improve model robustness. For this seminar topic, the student will investigate the impact of non-linearity on the model's robustness.
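To make the interplay between attack and defense concrete, here is a minimal sketch of adversarial training with an FGSM inner step; `model` and `optimizer` are assumed to be ordinary PyTorch components, and details such as mixing clean and perturbed batches are left out.

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, eps=0.03):
    """One training step on FGSM-perturbed inputs (simplified adversarial training)."""
    # Inner step: craft adversarial examples against the current model state.
    x_adv = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_adv), y).backward()
    x_adv = (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

    # Outer step: update the model on the perturbed batch.
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```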
The impact of an adversarial perturbation accumulates through the different layers of the ML model until it subverts the final prediction. Breaking the connection between layers can reduce the adversarial influence on the activation maps, enabling a novel technique for protecting against such attacks. For this topic, the student studies how network reconstruction helps to improve a model's robustness.
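The accumulation effect itself is easy to observe. The sketch below, assuming a standard PyTorch model whose leaf modules return tensors and are called once per forward pass, measures how far the activations of a perturbed input drift from those of the clean input at each layer; it is a measurement aid, not the reconstruction defense discussed in the referenced work.

```python
import torch

@torch.no_grad()
def layerwise_gap(model, x_clean, x_adv):
    """Mean absolute activation difference per leaf layer for clean vs. perturbed input."""
    recorded = {}

    def make_hook(name):
        def hook(module, inputs, output):
            recorded.setdefault(name, []).append(output.detach())
        return hook

    handles = [m.register_forward_hook(make_hook(n))
               for n, m in model.named_modules() if not list(m.children())]
    model(x_clean)  # first pass records clean activations
    model(x_adv)    # second pass records perturbed activations
    for h in handles:
        h.remove()

    return {name: (adv - clean).abs().mean().item()
            for name, (clean, adv) in recorded.items()}
```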
Backdoor attacks and evasion attacks are two fundamental branches in the field of adversarial machine learning. Backdoors are introduced already during training, while evasion attacks occur later, during inference. Both, however, share similarities that should be explored as part of this seminar topic.
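To make the contrast concrete, the following sketch poisons part of a training batch with a fixed pixel-patch trigger, a common simplified form of a backdoor; the patch position and size, the poisoning rate, and `target_class` are illustrative assumptions.

```python
import torch

def add_backdoor_trigger(images, labels, target_class, poison_fraction=0.1):
    """Poison a fraction of a training batch (shape N x C x H x W) with a patch trigger.

    Unlike an evasion attack, the manipulation happens at training time:
    the trigger is stamped onto the image and the label is flipped.
    """
    images, labels = images.clone(), labels.clone()
    n_poison = max(1, int(poison_fraction * images.size(0)))
    idx = torch.randperm(images.size(0))[:n_poison]
    images[idx, :, -3:, -3:] = 1.0   # white 3x3 patch in the bottom-right corner
    labels[idx] = target_class       # attacker-chosen label
    return images, labels
```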
Federated learning aggregates updates from local participants to learn a common/shared remote model, which largely preserves the confidentiality of a user's training data. At the same time, the unobservability of the local data carries a hidden threat: a malicious participant can inject backdoored data into the training process. Defenses against such attacks are thus essential and urgently needed.
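The sketch below shows a plain FedAvg-style aggregation step to illustrate why the server is blind to poisoned updates: it only ever sees client weights, never the data they were trained on. The function is an illustrative assumption, not the API of a particular federated-learning framework.

```python
import torch

def federated_average(client_state_dicts):
    """FedAvg-style aggregation: average the clients' model weights.

    The server cannot tell from these state dicts whether a client
    trained on clean or on backdoored data.
    """
    avg_state = {k: torch.zeros_like(v, dtype=torch.float32)
                 for k, v in client_state_dicts[0].items()}
    for state in client_state_dicts:
        for k in avg_state:
            avg_state[k] += state[k].float()
    for k in avg_state:
        avg_state[k] /= len(client_state_dicts)
    return avg_state  # would be loaded into the global model for the next round
```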
To evaluate adversarial vulnerability in detail, different works propose to use l0-, l1-, or l2-norm-based distances of the activation map as the criterion. The activation map, however, cannot comprehensively reflect the model's prediction behavior with respect to its inputs. Recent works therefore try to connect explainable AI with adversarial robustness: explainability methods such as Integrated Gradients (IG) and Layer-wise Relevance Propagation (LRP) strongly rely on gradients, with a motivation similar to that of adversarial evasion attacks.
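The connection is visible in the computation itself: the simplified Integrated Gradients sketch below accumulates input gradients along a path from a baseline to the input, the same gradient signal an evasion attack follows in the opposite direction. The zero baseline and the number of steps are common defaults chosen here as assumptions.

```python
import torch

def integrated_gradients(model, x, target_class, steps=32):
    """Simplified Integrated Gradients attribution against a zero baseline.

    `x` is assumed to be a single input with a leading batch dimension.
    """
    baseline = torch.zeros_like(x)
    total_grad = torch.zeros_like(x)
    for alpha in torch.linspace(0, 1, steps):
        # Interpolate between baseline and input, then take the gradient of the
        # target-class score with respect to that interpolation point.
        point = (baseline + alpha * (x - baseline)).requires_grad_(True)
        score = model(point)[0, target_class]
        grad, = torch.autograd.grad(score, point)
        total_grad += grad
    # Attribution: path-averaged gradient scaled by the input difference.
    return (x - baseline) * total_grad / steps
```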
Model extraction attacks can be used to "steal" an approximation of an MLaaS ("machine learning as a service") model by querying the model in a black-box fashion. This approximation (substitute/surrogate model) can then be used to generate adversarial examples against the original, remotely deployed model. This seminar topic focuses on systematizing (assumptions, pros, cons, application scenarios) state-of-the-art model extraction attacks and corresponding defense methods.
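A minimal extraction loop might look as follows; `black_box_predict` is a hypothetical stand-in for the remote API (assumed to return class labels for a batch), and the surrogate architecture, query data, and budget accounting are left open.

```python
import torch
import torch.nn.functional as F

def extract_model(black_box_predict, surrogate, optimizer, query_loader, epochs=5):
    """Train a surrogate on labels obtained by querying the black-box model."""
    for _ in range(epochs):
        for x, _ in query_loader:                 # the attacker ignores any true labels
            with torch.no_grad():
                y_remote = black_box_predict(x)   # this is where the query budget is spent
            optimizer.zero_grad()
            loss = F.cross_entropy(surrogate(x), y_remote)
            loss.backward()
            optimizer.step()
    return surrogate
```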
Effective model extraction remains a challenge for various reasons, such as a limited query budget or the lack of knowledge about the target model's architecture. Recent research considers model extraction as a process of active learning (AL), where the remote model is treated as an oracle in the AL setting. AL strategies have been shown to make extraction more effective by significantly reducing the required query budget. Hence, in this seminar report, the student focuses on state-of-the-art model extraction attacks that use learning techniques such as active learning or transfer learning.
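One common AL strategy is uncertainty sampling, sketched below under the assumption that the attacker holds a pool of unlabeled candidate inputs as a single tensor: only the samples the surrogate is least confident about are sent to the remote oracle, so the query budget is spent where the surrogate expects to learn the most.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def select_queries_by_entropy(surrogate, candidate_pool, budget):
    """Pick the `budget` candidates with the highest predictive entropy."""
    probs = F.softmax(surrogate(candidate_pool), dim=1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
    top = entropy.topk(budget).indices
    return candidate_pool[top]   # these inputs are sent to the oracle next
```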