unSafe.sh - Unsafe

Can an LLM Formally Verify Your Code? A Feasibility Study May 2026 · ML/security research. When a language model tells you “this funct...
2026-5-7 00:0:0 | Views: 8

Toooold - toooold.com
theorem lean llm xs gate

The Blessing of Dimensionality: How TurboQuant Uses the JL Lemma to Compress KV Caches with Zero Bias TurboQuant uses a geometric approach to compress the KV cache to 3 or 2 bits, yielding a 5x memory saving and a 40K context length, and it addresses the bias of traditional quantization while keeping zero loss of accuracy...
2026-3-28 00:0:0 | Views: 1

Toooold - toooold.com
jl phi turboquant dimension

Mozart rolls a dice to Bach and Ramanujan The article explores how Mozart's Musical Dice Game embodies a strict autoregressive generation mechanism and connects it to large language models, Ramanujan's sums, and Bach's fugues. A mathematical analysis reveals the symmetry and periodicity underlying musical composition...
2026-3-16 00:0:0 | Views: 0

Toooold - toooold.com
mozart ramanujan bach measure grid

The Least Action Nature of AdvJudge-Zero: A Lagrangian Perspective on LLM Steering AdvJudge-Zero is an attack that flips an LLM judge's binary decision by appending specific control tokens. It uses beam search to find low-perplexity token sequences that maximize the last-layer logit gap. The method rests on Lagrangian optimization and the principle of least action, bypassing the judge's high-energy barrier while staying on the model's deterministic path...
2026-2-21 00:0:0 | Views: 0

Toooold - toooold.com
advjudge judge partial basin

The Mechanism of Logit Gap Steering: A Unified View of Prompts, Vectors, and Low-Rank Adaptation The article examines how Logit Gap Steering works and why it is effective. By treating prompt engineering as a vector search problem and exploiting the information-propagating property of the residual stream and the low-rank structure of safety alignment, the method achieves precise control over the behavior of large language models...
2026-2-9 00:0:0 | Views: 0

Toooold - toooold.com
delta mathrm steering gap residual

LLM as My Pair Researcher: Prover–Validator Collaboration and the Road to Logit-Gap Steering The article explores how collaboration between a human and an AI led to a new research direction, logit gap steering. The author worked with an AI to turn intuition into concrete research and proposes a prover-validator pattern: the AI generates candidate hypotheses, and the human validates and refines them. This approach helped explore how small perturbations can change model behavior...
2026-1-21 00:0:0 | Views: 0

Toooold - toooold.com
prover validator refusal gap tony

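The Physics of mHC: Why Deep Learning Needs Energy Conservation The article examines how the mHC method takes a physics perspective, stabilizing deep neural networks by building them as passive systems. The author argues that conventional neural networks act as “active amplifiers” whose signal energy can explode, and proposes using doubly stochastic matrices and the Sinkhorn-Knopp algorithm to achieve energy conservation and stable gradients. This keeps signal mass and energy from being arbitrarily amplified or attenuated...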
2026-1-5 00:0:0 | Views: 0

Toooold - toooold.com
ij mass sums quad

The Mathematics of Baby Shower Games: Solomonoff Inference in Action Last weekend, I found myself applying data science in an unexpected setting: a...
2024-12-13 00:0:0 | Views: 3

Toooold - toooold.com
guesses gap optimal inches territory

The missing knowledge snippets of AI It is a live blog post of some knowledge snippets of AI to bridge the gap amon...
2024-12-9 08:0:0 | Views: 2

Toooold - toooold.com
llm memory reasoning lnkd lora

Building a Lightweight Financial Agent: A Flexible Approach to Tool Use and Orchestration In the rapidly evolving field of AI agents, there’s a growing trend towards co...
2024-9-15 08:0:0 | Views: 8

Toooold - toooold.com
income financials analysis horizon company2

Agentic Systems: Snake Oil or the Future of AI? The short answer: it’s more nuanced—agentic systems have potential, but only i...
2024-8-21 08:0:0 | Views: 4

Toooold - toooold.com
agentic llms limitations llm reasoning

Those Magnificent underdogs competing ChatGPT Disclaimer: the open source community and the AI community evolve so fast. Thi...
2023-4-8 08:0:0 | Views: 10

Toooold - toooold.com
alpaca llama tuning gpt llm

Understand Twitter’s Recommendation system with a diagram using GPT-4 Disclaimer: the following blog post is mostly generated by GPT-4. The image is...
2023-3-31 08:0:0 | Views: 14

Toooold - toooold.com
tweets stroke classdef network

Collaborating with an AI Assistant to Discover a JPEG2000 Decoder Vulnerability How I teamed up with ChatGPT, an AI language model, to identify and analyze a...
2023-3-20 08:0:0 | Views: 11

Toooold - toooold.com
chatgpt cblk jpeg2000 log2 tile

Make a CatGPT out of ChatGPT Since ChatGPT is good at hallucination, why not make a CatGPT out of it? Let’s...
2023-2-2 08:0:0 | Views: 9

Toooold - toooold.com
laser feline manifold accuracy chatgpt

Guess the size of an atomic bomb and an iOS supply chain attack Disclaimer: Nothing in this blog is related to the author’s current day-to-day...
2022-8-24 08:0:0 | Views: 8

Toooold - toooold.com
xcodeghost fermi bias sampling estimate

Zero hacking problem: do we really protect the customers? Disclaimer: Nothing in this blog is related to the author’s day-to-day work. T...
2022-8-17 08:0:0 | Views: 5

Toooold - toooold.com
jason probability security coin fair

Linkedin spam: a case study of robust feature engineering Disclaimer: Nothing in this blog is related to the author’s day-to-day work. T...
2022-8-3 08:0:0 | Views: 9

Toooold - toooold.com
attackers adversarial drift

Measure the unmeasurable: botnet and German tanks Disclaimer: Nothing in this blog is related to the author’s day-to-day work. T...
2022-7-25 08:0:0 | Views: 7

Toooold - toooold.com
jason frac tanks network sampling

Nine cars, twenty-five horses and beyond TL;DR: “25 horses problem” and “9 cars problem” can have a general solution a...
2022-6-11 08:0:0 | Views: 7

Toooold - toooold.com
horses cars races tensor dimension