Don’t Listen To Me: Understanding and Exploring Jailbreak Prompts of Large Language Models

Published in USENIX Security Symposium, 2024

This paper investigates LLM jailbreak threats from two angles: an empirical evaluation of existing jailbreak prompts and a method for automatically generating new ones.

Download paper here

Recommended citation: Z. Yu, X. Liu, S. Liang, Z. Cameron, C. Xiao, N. Zhang. Don’t Listen To Me: Understanding and Exploring Jailbreak Prompts of Large Language Models. In Proceedings of the 33rd USENIX Security Symposium 2024. USENIX Association.