Hot Topics in Security of Machine Learning

Overview

Semester: Winter 2023
Course type: Block Seminar
Lecturer: Jun.-Prof. Dr. Wressnegger
Audience: Informatik Master & Bachelor
Credits: 4 ECTS
Room: 148, Building 50.34
Language: English or German
Link: https://campus.kit.edu/campus/all/event.asp?gguid=0x0ADE8C79D0434919A68190ED644806BA
Registration: https://ilias.studium.kit.edu/goto.php?target=crs_2214302&client_id=produktiv

Description

This seminar is concerned with different aspects of adversarial machine learning. In addition to the use of machine learning for security, the security of machine learning algorithms itself is essential in practice. For a long time, machine learning did not consider the worst-case scenarios and corner cases that adversaries exploit nowadays.

The module introduces students to the highly active field of attacks against machine learning and teaches them how to work up results from recent research. To this end, each student will read up on a sub-field, prepare a seminar report, and present their work to their fellow students at the end of the term.

Topics include but are not limited to adversarial examples, model stealing, membership inference, poisoning attacks, and defenses against such threats.
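
To make the first of these topics concrete, the following is a minimal, illustrative sketch of crafting an adversarial example with the Fast Gradient Sign Method (FGSM). The model, input batch x, labels y, and the perturbation budget epsilon are placeholders, not part of the seminar material.

    # Minimal FGSM sketch (PyTorch): perturb x within an L-infinity ball of
    # radius epsilon so that the classifier's loss on the true labels increases.
    import torch
    import torch.nn.functional as F

    def fgsm_example(model, x, y, epsilon=0.03):
        # Assumes model(x) returns class logits and x lies in [0, 1].
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        # Step in the direction that increases the loss, then clip back
        # to the valid input range.
        x_adv = x + epsilon * x.grad.sign()
        return x_adv.clamp(0.0, 1.0).detach()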

Schedule

Date | Step
Mon, 23. Oct, 14:00–15:30 | Primer on academic writing, topic presentation
Wed, 25. Oct, 11:59 | Send topic selection (assignment happens till 18:00)
Thu, 26. Oct, 11:59 | Officially register for assigned topic (missed opportunities will be reassigned to waiting list till 18:00)
Thu, 2. Nov | Arrange appointment with assistant
Mon, 6. Nov - Fri, 10. Nov | Individual meeting (provide first overview and ToC)
Thu, 4. Jan | Submit final paper
Thu, 11. Jan | Submit review for fellow students
Thu, 18. Jan | End of discussion phase
Fri, 19. Jan | Notification about paper acceptance/rejection
Fri, 26. Jan | Submit camera-ready version of your paper
Fri, 16. Feb | Presentation at final colloquium

Matrix Chat

News about the seminar, potential updates to the schedule, and additional material are distributed using the course's Matrix room. Moreover, Matrix enables students to discuss topics and solution approaches.

You can find the link to the Matrix room on ILIAS.

Topics

Every student may choose one of the following topics. For each of these, we additionally provide a few recent top-tier publications that you should use as a starting point for your own research. For the seminar and your final report, you should not merely summarize these papers, but try to go beyond them and arrive at your own conclusions.

Moreover, most of these papers come with open-source implementations. Play around with these and include the lessons learned in your report; a minimal example of the kind of experiment you might run is sketched after the topic list below.

  • Backdooring Code-LLMs

    • "You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion", USEINX Security 2021
    • "Backdooring Neural Code Search", ACL 2023

  • Security of Code Generated by LLMs

    • "Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions", IEEE S&P 2022
    • "Examining Zero-Shot Vulnerability Repair with Large Language Models", IEEE S&P 2023

  • Prompt Injection Attacks against LLMs

    • "Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection", CoRR 2023
    • "Ignore Previous Prompt: Attack Techniques For Language Models", ML Safety Workshop NeurIPS 2022

  • Automatic Prompt Injection Attacks against LLMs

    • "Universal and Transferable Adversarial Attacks on Aligned Language Models", CoRR 2023
    • "You Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic Content", IEEE S&P 2024

  • Membership Inference Attacks against LLMs

    • "Extracting Training Data from Large Language Models", USENIX Security 2021
    • "Analyzing Leakage of Personally Identifiable Information in Language Models", IEEE S&P 2023

  • Memorization in LLMs

    • "Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models", NeurIPS 2022
    • "Quantifying Memorization Across Neural Language Models", ICRL 2023

  • Explainability of Large Language Models

    • "Investigating Explainability of Generative AI for Code through Scenario-based Design", IUI 2022
    • "Who's Thinking? A Push for Human-Centered Evaluation of LLMs using the XAI Playbook", CHI Workshops 2023