Semester | Winter 2021 |
Course type | Block Seminar |
Lecturer | TT.-Prof. Dr. Wressnegger |
Audience | Computer Science (Informatik) Master & Bachelor |
Credits | 4 ECTS |
Room | 148, Building 50.34 and online |
Language | English or German |
Link | https://ilias.studium.kit.edu/goto_produktiv_crs_1600112.html |
Registration | Please register for the course in ILIAS |
Due to the ongoing COVID-19 pandemic, this course will start off remotely, meaning the kick-off meeting will take place online. The final colloquium, however, will hopefully be an in-person meeting again (this time we might indeed have a chance).
To receive all the necessary information, please subscribe to the mailing list here.
This seminar is concerned with different aspects of adversarial machine learning. Besides the use of machine learning for security, the security of machine learning algorithms themselves is also essential in practice. For a long time, machine learning did not consider the worst-case scenarios and corner cases that adversaries exploit today.
The module introduces students to the highly active field of attacks against machine learning and teaches them to work through results from recent research. To this end, the students will read up on a sub-field, prepare a seminar report, and present their work to their fellow students at the end of the term.
Topics include but are not limited to adversarial examples, model stealing, membership inference, poisoning attacks, and defenses against such threats.
Date | Step |
Tue, 19. Oct, 10:00–11:30 | Primer on academic writing, assignment of topics |
Thu, 28. Oct | Arrange appointment with assistant |
Mon, 1. Nov – Fri, 5. Nov | Individual meetings with assistant |
Wed, 1. Dec | Submit final paper |
Wed, 22. Dec | Submit review for fellow students |
Fri, 7. Jan | End of discussion phase |
Fri, 21. Jan | Submit camera-ready version of your paper |
Fri, 11. Feb | Presentation at final colloquium |
News about the seminar, potential updates to the schedule, and additional material are distributed using a separate mailing list. Moreover, the list enables students to discuss topics of the seminar.
You can subscribe here.
Every student may choose one of the following topics. For each of these, we additionally provide a few recent top-tier publications that you should use as a starting point for your own research. For the seminar and your final report, you should not merely summarize these papers, but try to go beyond them and arrive at your own conclusions.
Moreover, most of these papers come with open-source implementations. Play around with these and include the lessons learned in your report.
In a black-box setting (the attacker has no access to the ML model), the adversary usually learns a substitute/surrogate model to craft adversarial examples. However, the lack of authentic training data degrades the effectiveness of such attacks. Recently, a novel attack has been presented that works without any real data at all. This seminar topic aims to investigate the possibility of such data-free adversarial attacks.
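As a point of reference for the data-free setting, the sketch below shows the conventional surrogate-based transfer attack (FGSM crafted on a local surrogate, then sent to the remote model). `surrogate` and `black_box_predict` are hypothetical placeholders and inputs are assumed to lie in [0, 1]; this is not the data-free attack itself.

```python
import torch
import torch.nn.functional as F

def fgsm_on_surrogate(surrogate, x, y, eps=0.03):
    """Craft an adversarial example on the (white-box) surrogate model."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(surrogate(x_adv), y)
    loss.backward()
    # Step in the direction that increases the surrogate's loss.
    x_adv = x_adv + eps * x_adv.grad.sign()
    return x_adv.clamp(0, 1).detach()  # assumes inputs normalized to [0, 1]

# Transfer step: the crafted example is sent to the black-box model in the
# hope that the perturbation transfers. `black_box_predict` stands in for
# the remote API and is not defined here.
# y_remote = black_box_predict(fgsm_on_surrogate(surrogate, x, y))
```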
Adversarial training increases a model's robustness by introducing adversarial examples into the training procedure. Recent research suggests that the effectiveness of adversarial examples against a model is linked to its linearity; increasing non-linearity may therefore help improve model robustness. For this seminar topic, the student will investigate the impact of non-linearity on the model's robustness.
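To make the interplay between attack and defense concrete, here is a minimal sketch of adversarial training with an FGSM inner step; `model` and `optimizer` are assumed to be ordinary PyTorch components, and details such as mixing clean and perturbed batches are left out.

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, eps=0.03):
    """One training step on FGSM-perturbed inputs (simplified adversarial training)."""
    # Inner step: craft adversarial examples against the current model state.
    x_adv = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_adv), y).backward()
    x_adv = (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

    # Outer step: update the model on the perturbed batch.
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```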
The impact of an adversarial perturbation accumulates through the different layers of the ML model until it subverts the final prediction. Breaking the connection between layers can reduce the adversarial influence on the activation maps, enabling a novel technique for protecting against such attacks. For this topic, the student studies how network reconstruction helps to improve a model's robustness.
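The accumulation effect itself is easy to observe. The sketch below, assuming a standard PyTorch model whose leaf modules return tensors and are called once per forward pass, measures how far the activations of a perturbed input drift from those of the clean input at each layer; it is a measurement aid, not the reconstruction defense discussed in the referenced work.

```python
import torch

@torch.no_grad()
def layerwise_gap(model, x_clean, x_adv):
    """Mean absolute activation difference per leaf layer for clean vs. perturbed input."""
    recorded = {}

    def make_hook(name):
        def hook(module, inputs, output):
            recorded.setdefault(name, []).append(output.detach())
        return hook

    handles = [m.register_forward_hook(make_hook(n))
               for n, m in model.named_modules() if not list(m.children())]
    model(x_clean)  # first pass records clean activations
    model(x_adv)    # second pass records perturbed activations
    for h in handles:
        h.remove()

    return {name: (adv - clean).abs().mean().item()
            for name, (clean, adv) in recorded.items()}
```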
Backdoor attacks and evasion attacks are two fundamental branches in the field of adversarial machine learning. Backdoors are introduced already during training, while evasion attacks occur later, during inference. Both, however, share similarities that should be explored as part of this seminar topic.
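To make the contrast concrete, the following sketch poisons part of a training batch with a fixed pixel-patch trigger, a common simplified form of a backdoor; the patch position and size, the poisoning rate, and `target_class` are illustrative assumptions.

```python
import torch

def add_backdoor_trigger(images, labels, target_class, poison_fraction=0.1):
    """Poison a fraction of a training batch (shape N x C x H x W) with a patch trigger.

    Unlike an evasion attack, the manipulation happens at training time:
    the trigger is stamped onto the image and the label is flipped.
    """
    images, labels = images.clone(), labels.clone()
    n_poison = max(1, int(poison_fraction * images.size(0)))
    idx = torch.randperm(images.size(0))[:n_poison]
    images[idx, :, -3:, -3:] = 1.0   # white 3x3 patch in the bottom-right corner
    labels[idx] = target_class       # attacker-chosen label
    return images, labels
```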
Federated learning aggregates updates from local participants to learn a common/shared remote model, which largely preserves the confidentiality of a user's training data. At the same time, the unobservability of the local data carries a hidden threat: a malicious participant can inject backdoored data into the training process. Defenses against such attacks are thus essential and urgently needed.
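The sketch below shows a plain FedAvg-style aggregation step to illustrate why the server is blind to poisoned updates: it only ever sees client weights, never the data they were trained on. The function is an illustrative assumption, not the API of a particular federated-learning framework.

```python
import torch

def federated_average(client_state_dicts):
    """FedAvg-style aggregation: average the clients' model weights.

    The server cannot tell from these state dicts whether a client
    trained on clean or on backdoored data.
    """
    avg_state = {k: torch.zeros_like(v, dtype=torch.float32)
                 for k, v in client_state_dicts[0].items()}
    for state in client_state_dicts:
        for k in avg_state:
            avg_state[k] += state[k].float()
    for k in avg_state:
        avg_state[k] /= len(client_state_dicts)
    return avg_state  # would be loaded into the global model for the next round
```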
To evaluate adversarial vulnerability in detail, different works propose to use l0-, l1-, or l2-norm-based distances of the activation map as the criterion. The activation map, however, cannot comprehensively reflect the model's prediction behavior with respect to its inputs. Recent works therefore try to connect explainable AI with adversarial robustness: explainability methods such as Integrated Gradients (IG) and Layer-wise Relevance Propagation (LRP) strongly rely on gradients, with a motivation similar to that of adversarial evasion attacks.
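The connection is visible in the computation itself: the simplified Integrated Gradients sketch below accumulates input gradients along a path from a baseline to the input, the same gradient signal an evasion attack follows in the opposite direction. The zero baseline and the number of steps are common defaults chosen here as assumptions.

```python
import torch

def integrated_gradients(model, x, target_class, steps=32):
    """Simplified Integrated Gradients attribution against a zero baseline.

    `x` is assumed to be a single input with a leading batch dimension.
    """
    baseline = torch.zeros_like(x)
    total_grad = torch.zeros_like(x)
    for alpha in torch.linspace(0, 1, steps):
        # Interpolate between baseline and input, then take the gradient of the
        # target-class score with respect to that interpolation point.
        point = (baseline + alpha * (x - baseline)).requires_grad_(True)
        score = model(point)[0, target_class]
        grad, = torch.autograd.grad(score, point)
        total_grad += grad
    # Attribution: path-averaged gradient scaled by the input difference.
    return (x - baseline) * total_grad / steps
```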
Model extraction attacks can be used to "steal" an approximation of an MLaaS ("machine learning as a service") model by querying the model in a black-box fashion. This approximation (substitute/surrogate model) can then be used to generate adversarial examples against the original, remotely deployed model. This seminar topic focuses on systematizing (assumptions, pros, cons, application scenarios) state-of-the-art model extraction attacks and corresponding defense methods.
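A minimal extraction loop might look as follows; `black_box_predict` is a hypothetical stand-in for the remote API (assumed to return class labels for a batch), and the surrogate architecture, query data, and budget accounting are left open.

```python
import torch
import torch.nn.functional as F

def extract_model(black_box_predict, surrogate, optimizer, query_loader, epochs=5):
    """Train a surrogate on labels obtained by querying the black-box model."""
    for _ in range(epochs):
        for x, _ in query_loader:                 # the attacker ignores any true labels
            with torch.no_grad():
                y_remote = black_box_predict(x)   # this is where the query budget is spent
            optimizer.zero_grad()
            loss = F.cross_entropy(surrogate(x), y_remote)
            loss.backward()
            optimizer.step()
    return surrogate
```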
Effective model extraction remains a challenge for various reasons, such as a limited query budget or the lack of knowledge about the target model's architecture. Recent research considers model extraction as a process of active learning (AL), where the remote model is treated as an oracle in the AL setting. AL strategies have been shown to make extraction more effective by significantly reducing the required query budget. Hence, in this seminar report, the student focuses on state-of-the-art model extraction attacks that use learning techniques such as active learning or transfer learning.
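One common AL strategy is uncertainty sampling, sketched below under the assumption that the attacker holds a pool of unlabeled candidate inputs as a single tensor: only the samples the surrogate is least confident about are sent to the remote oracle, so the query budget is spent where the surrogate expects to learn the most.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def select_queries_by_entropy(surrogate, candidate_pool, budget):
    """Pick the `budget` candidates with the highest predictive entropy."""
    probs = F.softmax(surrogate(candidate_pool), dim=1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
    top = entropy.topk(budget).indices
    return candidate_pool[top]   # these inputs are sent to the oracle next
```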