unSafe.sh - Unsafe
SelfDefend: A Practical Defense Framework Against Jailbreak Attacks on Large Language Models
The paper proposes SelfDefend, a generic defense framework against jailbreak attacks on large language models (LLMs). The framework establishes a shadow LLM that detects potential attacks and works alongside the target LLM to enforce checkpoint-based access control. The study verifies that existing LLMs can identify harmful prompts and uses data distillation to tune the defense models, significantly improving defense effectiveness and robustness while reducing latency....
2025-11-28 06:48:50 | Reads: 1
Xuanwu Lab Daily Security - www.usenix.org
jailbreak
llms
llm
selfdefend
gpt
A Comprehensive Security Analysis of BLE Proximity Tracking Protocols
2025-11-9 10:33:48 | Reads: 0
Xuanwu Lab Daily Security - www.usenix.org
The Doom of Device Drivers: Your Android Device (Most Likely) Has N-Day Kernel Vulnerabilities
2025-9-4 06:45:21 | Reads: 0
Xuanwu Lab Daily Security - www.usenix.org
System Register Hijacking: Compromising Kernel Integrity by Using System Registers for Attacks
2025-8-29 06:49:29 | Reads: 1
Xuanwu Lab Daily Security - www.usenix.org
Instruction Backdoor Attacks Against Customized Large Language Models
2025-5-24 06:50:9 | Reads: 10
Xuanwu Lab Daily Security - www.usenix.org
KextFuzz: Detecting Vulnerabilities in Kernel Extensions on Apple Silicon by Exploiting macOS Mitigations
2025-5-24 06:49:9 | Reads: 19
Xuanwu Lab Daily Security - www.usenix.org
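Bias Vulnerabilities and Black-Box Jailbreak Attacks in Large Language Models
The study examines the security of large language models (LLMs) and proposes DRA, a black-box attack method that uses disguise and reconstruction techniques to induce models to generate harmful content. It achieves a success rate of up to 91.1% on multiple models, including OpenAI GPT-4....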
2025-4-27 03:1:54 | Reads: 12
Xuanwu Lab Daily Security - www.usenix.org
harmful
academy
llms
security
sciences
A Friend's Eye is A Good Mirror: Synthesizing MCU Peripheral Models from Peripheral Drivers
Authors: Chongqing Lei and Zhen Ling, Southeast University; Yue Zhang, Drexel University; Yan Yang a...
2024-5-3 03:57:42 | Reads: 5
USENIX - www.usenix.org
hardware
perry
firmware
security
A Linear Reconstruction Approach for Attribute Inference Attacks against Synthetic Data
Authors: Meenatchi Sundaram Muthu Selva Annamalai, University College London; Andrea Gadotti and Luc...
2024-5-3 03:57:42 | Reads: 3
USENIX - www.usenix.org
sdg
inference
synthetic
statistical
reasonable
SDFuzz: Target States Driven Directed Fuzzing
Authors: Penghui Li, The Chinese University of Hong Kong and Zhongguancun Laboratory; Wei Meng, The...
2024-5-3 03:57:42 | Reads: 51
USENIX - www.usenix.org
sdfuzz
directed
analysis
fuzzers
"I chose to fight, be brave, and to deal with it": Threat Experiences and Security Practices of Pakistani Content Creators
Authors: Lea Gröber, CISPA Helmholtz Center for Information Security and Saarland University; Waleed...
2024-5-3 03:57:42 | Reads: 3
USENIX - www.usenix.org
creators
lahore
cispa
pakistan
sciences
FEASE: Fast and Expressive Asymmetric Searchable Encryption
Authors: Long Meng, Liqun Chen, and Yangguang Tian, University of Surrey; Mark Manulis, Universität...
2024-5-3 03:57:42 | Reads: 3
USENIX - www.usenix.org
ase
expressive
fease
encryption
abe
Critical Code Guided Directed Greybox Fuzzing for Commits
Authors: Yi Xiang, Zhejiang University NGICS Platform; Xuhong Zhang, Zhejiang University and Jianghu...
2024-5-3 03:57:42 | Reads: 3
USENIX - www.usenix.org
waflgo
zhejiang
ngics
xiao
More Simplicity for Trainers, More Opportunity for Attackers: Black-Box Attacks on Speaker Recognition Systems by Inferring Feature Extractor
Authors: Yunjie Ge, Pinji Chen, Qian Wang, Lingchen Zhao, and Ningping Mou, Wuhan University; Peipei...
2024-5-3 03:57:42 | Reads: 4
USENIX - www.usenix.org
voxcloak
asr
snr
wuhan
kong
Page-Oriented Programming: Subverting Control-Flow Integrity of Commodity Operating System Kernels with Non-Writable Code Pages
Authors: Seunghun Han, The Affiliated Institute of ETRI, Chungnam National University; Seong-Joong K...
2024-5-3 03:57:42 | Reads: 2
USENIX - www.usenix.org
remapping
kim
affiliated
chungnam
etri
A Linear Reconstruction Approach for Attribute Inference Attacks against Synthetic Data
Authors: Meenatchi Sundaram Muthu Selva Annamalai, University College London; Andrea Gadotti and Luc...
2024-5-3 03:57:42 | Reads: 2
USENIX - www.usenix.org
sdg
inference
synthetic
reasonable
SoK (or SoLK?): On the Quantitative Study of Sociodemographic Factors and Computer Security Behaviors
Authors: Miranda Wei, University of Washington; Jaron Mink, University of Illinois at Urbana-Champai...
2024-5-3 03:57:42 | Reads: 2
USENIX - www.usenix.org
behaviors
security
papers
guidelines
EaTVul: ChatGPT-based Evasion Attack Against Software Vulnerability Detection
Authors: Shigang Liu, CSIRO's Data61 and Swinburne University of Technology; Di Cao, Swinburne Unive...
2024-5-3 03:57:42 | Reads: 5
USENIX - www.usenix.org
adversarial
eatvul
swinburne
csiro
How Does a Deep Learning Model Architecture Impact Its Privacy? A Comprehensive Study of Privacy Attacks on CNNs and Transformers
Authors: Guangsheng Zhang, Bo Liu, Huan Tian, and Tianqing Zhu, University of Technology Sydney; Min...
2024-5-3 03:57:42 | Reads: 3
USENIX - www.usenix.org
cnns
inference
liu
Fast and Private Inference of Deep Neural Networks by Co-designing Activation Functions
Authors: Abdulrahman Diaa, Lucas Fenaux, Thomas Humphries, Marian Dietz, Faezeh Ebrahimianghazani, B...
2024-5-3 03:57:42 | Reads: 2
USENIX - www.usenix.org
inference
mpc
accuracy
competitive
activation