Jailbreak Papers - Zhuanzhi (专知)

Iterative Self-Tuning LLMs for Enhanced Jailbreaking Capabilities

Arxiv

0+ reads · Nov 7

Exploiting Latent Space Discontinuities for Building Universal LLM Jailbreaks and Data Extraction Attacks

Arxiv

0+ reads · Nov 1

DefenSee: Dissecting Threat from Sight and Text - A Multi-View Defensive Pipeline for Multi-modal Jailbreaks

Arxiv

0+ reads · Dec 1

Retrieval-Augmented Defense: Adaptive and Controllable Jailbreak Prevention for Large Language Models

Arxiv

0+ reads · Nov 3

Immunity memory-based jailbreak detection: multi-agent adaptive guard for large language models

Arxiv

0+ reads · Dec 3

Beyond Model Jailbreak: Systematic Dissection of the "Ten Deadly Sins" in Embodied Intelligence

Arxiv

0+ reads · Dec 6

VERA: Variational Inference Framework for Jailbreaking Large Language Models

Arxiv

0+ reads · Nov 6

Beyond Fixed and Dynamic Prompts: Embedded Jailbreak Templates for Advancing LLM Security

Arxiv

0+ reads · Nov 18

A Practical Framework for Evaluating Medical AI Security: Reproducible Assessment of Jailbreaking and Privacy Vulnerabilities Across Clinical Specialties

Arxiv

0+ reads · Dec 9

Chain-of-Lure: A Universal Jailbreak Attack Framework using Unconstrained Synthetic Narratives

Arxiv

0+ reads · Nov 13

Evolve the Method, Not the Prompts: Evolutionary Synthesis of Jailbreak Attacks on LLMs

Arxiv

0+ reads · Nov 16

Jailbreaking in the Haystack

Arxiv

0+ reads · Nov 5
