AI-Powered Memory Safety with the Pointer Ownership Model
嗯,用户让我帮忙总结一篇文章,控制在一百个字以内,而且不需要特定的开头。首先,我得仔细阅读文章内容,理解主要观点。 文章主要讲的是POM(Pointer Ownership Model)模型的更新,用于解决C和C++中的内存安全问题,特别是use-after-free和double-free漏洞。这些漏洞很常见,而且很难检测和修复。 POM通过将指针分类为负责任、不负责任等类别,并结合LLM和SAT求解器来自动创建和验证p-model,从而帮助开发者检测内存安全问题。这种方法提高了代码的安全性和稳定性。 现在我需要把这些要点浓缩到100字以内。要确保涵盖POM的目标、更新内容以及使用的技术手段。 可能的结构是:介绍POM的目的,说明更新包括什么,以及如何利用LLM和SAT求解器来实现自动化。 最后检查字数,确保不超过限制。 </think> 文章讨论了增强型指针所有权模型(POM)在C语言中的应用,旨在通过静态分析检测和修复内存安全问题。POM将指针分为负责任、不负责任等类别,并结合大型语言模型(LLM)和SAT求解器实现自动化的p-model生成与验证,以提高代码的安全性和稳定性。 2025-12-3 05:0:0 Author: www.sei.cmu.edu(查看原文) 阅读量:1 收藏

In October 2025, CyberPress reported a critical security vulnerability in the Redis Server, an open-source in-memory database: CVE-2025-49844 allowed authenticated attackers to achieve remote code execution through a use-after-free (CWE-416) flaw in the Lua scripting engine.

In 2024, another prominent temporal memory safety flaw was found in the Netfilter subsystem in the Linux kernel: CVE-2024-1086. This issue caused the nf_hook_slow() function to free memory twice (CWE-415: Double Free), allowing an attacker to exploit this double-free vulnerability to execute arbitrary code of their own choosing.

As the above examples illustrate, bugs related to temporal memory safety, such as use-after-free and double-free vulnerabilities, are challenging issues in C and C++ code. These bugs are difficult to detect and fix, often resulting in significant security vulnerabilities and system instability. Each year from 2006-2018, approximately 70 percent of vulnerabilities to which Microsoft assigned a CVE have been memory safety issues (temporal or spatial memory safety issues, such as buffer overflows), dropping only to around 50 percent in 2023. In its list of most dangerous software weaknesses, MITRE ranked CWE-416 seventh in 2022 and fourth in 2023. CWE-416 is also the source of more than a third of the high-severity security bugs in the Chromium codebase. (CVE is the MITRE-maintained inventory of identified vulnerabilities in systems; CWE is the MITRE-maintained inventory of patterns of common weakness in systems, such as use-after-free.)

This post, primarily adapted from the recently published technical report Design of Enhanced Pointer Ownership Model for C, highlights recent updates to the Pointer Ownership Model (POM). POM is a modeling framework designed to improve the ability of developers to statically analyze C programs for errors involving dynamic memory. To make a program comply with POM, a developer needed to identify the program’s “responsible” pointers, that is, pointers whose objects must be explicitly freed before the pointers themselves may be destroyed. Any program that complies with POM can be statically analyzed to ensure that the design is consistent and secure and that the code correctly respects the principles of pointer ownership.

POM can also be used to diagnose and eliminate many dynamic memory errors from C programs. The main drawback of the original POM (model and tooling) was the extensive manual effort required for developers to extract POM-relevant information from a particular codebase and formalize it into a corresponding pointer model (known as a p-model).

There have been two significant developments since that initial 2013 POM was released: First, Rust has grown significantly in popularity and adoption, offering the flexibility and safety provided by C and C++ but with guarantees of memory safety. Hence, there is more sensitivity to memory safety in general. Second, LLMs provide novel capabilities in various areas of software engineering, bringing with them significant potential but also significant risks to security and functionality. Hence, LLMs offer the potential to reduce the manual burden of building a p-model, making adoption and application easier. Both developments motivate our work to enhance the original POM, for improved capabilities and for fully automated p-model creation.

Our recent updates to POM include:

  • the use of large language models (LLMs) to complete a p-model,
  • an improved mechanism to prevent use-after-free errors, inspired by Rust’s borrow checker and object lifetimes,
  • improved function argument handling with a new abstraction of diligent or producer arguments, and
  • handling structs, unions, or arrays that contain pointers; and correct handling of ambiguity in assignment operations.

This post also details an approach to automatically check whether a program satisfies an associated p-model, as outlined in the technical report. Beyond the report, this post provides highlights of our team’s latest POM work that incorporates SAT solvers for automated p-model creation and/or validation.

A Two-Stage Approach to Securing Temporal Memory Safety

POM is designed to help developers avoid, identify, and fix temporal memory-safety issues in two stages:

  1. The POM builder automates generation of the p-model.
  2. The POM verifier identifies all remaining POM compliance errors.

The POM builder and verifier are designed to assume that every pointer is in exactly one of the following five categories:

  • Responsible: Responsible pointers are the subset of pointers that shepherd heap memory and ensure that the memory eventually gets freed. Addresses referenced by responsible pointers can be in the following states: GOOD, NULL, or ZOMBIE. A ZOMBIE responsible pointer address is one that points to freed memory, a GOOD pointer address points to valid memory, and a NULL pointer address contains the value NULL (or 0). Each chunk of heap memory (i.e., a heap object) may be accessed directly at most by one GOOD responsible pointer. Responsible pointers never point into the stack, into the data segment of memory, or inside a heap object except its beginning.
  • Irresponsible: Irresponsible pointers are not responsible for allocation or deallocation of the memory they point to. Addresses referenced by irresponsible pointers can be VALID, NULL, or INVALID. The main concern with irresponsible pointers is that they must respect temporal memory safety. The original POM modeled irresponsible pointers but used no tracking mechanism akin to lifetimes, so it did not prevent use-after-free errors. The current POM (model and verifier) does. An irresponsible pointer cannot be assigned the return value of a function that returns a responsible pointer (such as malloc()). Unlike a responsible pointer, an irresponsible pointer can be assigned a value resulting from pointer arithmetic or a value created by C’s address-of operator &.
  • Producer: The address is not mutable. It is used by C to mutate the pointed-to argument (perhaps allocating, freeing, or changing it).
  • Diligent: The address is not mutable. It does not escape its scope, and it is used to read or write the address’s memory without allocating or freeing it.
  • Out of Scope: The POM builder and verifier ignore pointers labeled as out of scope.

If the p-model’s identification of the pointer’s responsibility does not agree with how the pointer is used in code, that constitutes a POM violation, and the verifier should detect it. The user should investigate each violation. If the user decides that the pointer is out of scope (i.e., it is managed by some other mechanism), then the user should add this information to the p-model.

We use the term heap object to denote any single data structure whose memory is allocated with malloc(), calloc(), aligned alloc(), or realloc(). Objects not allocated using one of these functions are not heap objects. The terms responsible, irresponsible, producer, diligent, and out of scope can be treated like type qualifiers in C (e.g., const or restrict). They subtype the pointer variables, irrespective of the variables’ values. As with types, these qualifiers apply to a variable throughout its lifetime. For example, if p is considered to be a responsible pointer, it remains responsible throughout the scope of the variable and cannot cease to be responsible. These terms can apply to local pointers, pointers defined in structs or unions, pointers defined as function arguments, and the return value of a function if it is a pointer type. They can also apply to static pointers, but POM doesn’t support static pointers yet.

Additional POM Updates

Our technical report goes into greater detail, but it is worth noting here that POM also improves on its handling of function arguments that are pointers. A pointer that is passed into a function could be responsible, irresponsible, diligent, producer, or out-of-scope. While the pointer’s responsibility type remains the same during the function’s execution, the pointer’s state may change. Thus, every pointer argument has an initial set of states and a final set of states. These may be identical but need not be.

The new POM also has a design for types that contain pointers, which the old POM did not handle. We define a composite type as any C data type that can contain a pointer. Composite types consist of pointer types and structs, unions, arrays, and pointers that contain a composite type. We distinguish these from non-composite types, which include structs, unions, and arrays that do not contain any pointers. A composite object is an object of a composite type. A responsible composite object is a composite object with at least one responsible pointer, and an irresponsible composite object is a composite object with at least one irresponsible pointer. Note that a composite object can be both responsible and irresponsible, based on the pointers it contains.

A responsible composite object with exactly one responsible pointer has the same responsible states as the pointer. That is, if the responsible pointer is GOOD, the composite object’s responsible state can be inferred as GOOD. A composite object with more than one responsible pointer will also have a responsible state derived from the responsible pointers’ states. Likewise, a composite object with exactly one irresponsible pointer itself can have the same states as the pointer. That is, if the irresponsible pointer is VALID, the composite object’s irresponsible state can be inferred as VALID. A composite object with more than one irresponsible pointer will also have an irresponsible state derived from the responsible pointers’ states.

In C, many heap objects are not accessible directly via a pointer defined on the stack but can be accessed indirectly through two or more pointers. An example is the third element in a linked list. We define a C-path as a way to access any object in memory in C. It starts off with a global or local variable and then consists of a (possibly empty) sequence of array accesses (e.g., a[i]), pointer dereferences (e.g., *p), struct membership (e.g., s.a), and union membership (e.g., u.a). C-paths are a lot like file paths. Composite types are what one uses to build networks of heap objects in memory. The pointers must be VALID (for irresponsible pointers) or GOOD (for responsible pointers). A heap object that can’t be referenced by any C-path indicates a memory leak. If no memory leaks exist in a program at a point in time, then every heap object has at least one C-path to reference it. In a memory-safe program with no out-of-scope pointers, at any point during program execution, every heap object has exactly one C-path where every pointer in the path is responsible. We call this “the responsible C-path.” It is a violation of POM to free a pointer via a C-path that has at least one irresponsible pointer in it. Note that any variables before the first pointer live on the stack or global segment, and everything past the first pointer must live on the heap.

Control Flow and Responsible Pointer States

A pointer can be in multiple states at once. We always assume that the states of a pointer can be determined statically. For any two states, branching can create a pointer that could be in both states. For example, malloc() returns a responsible pointer that could be GOOD or NULL. This can be a source of trouble. In Standard C, there is no way to distinguish GOOD responsible pointers from uninitialized pointers. This (among other things) requires a developer to maintain internal discipline to make sure that only GOOD pointers are passed to most library functions. POM is designed to keep track of the states of pointers and issue warnings. For example, the POM verifier will warn if a pointer that might be uninitialized or NULL is dereferenced.

Implementing POM

Each p-model is stored in a YAML file. The figure below shows an example of C source code and its associated p-model.

Screenshot 2025-12-02 at 10.57.57 AM

Figure 1: The left shows an example of C source code and, to the right, its associated p-model.

We have now developed two methods to create a p-model file: LLM-based or SAT-solver-based. The SAT-solver method creates a p-model after verifying that the program satisfies POM constraints and cannot create a p-model for the program if it doesn’t. The LLM-based method can create a p-model for the program, regardless of a SAT-solver’s determination. If you only use an LLM to generate a p-model, you may not know if the code is compliant or if the LLM made a mistake. That is, an LLM always generates a p-model, even if the program violates POM, and the p-model is incorrect. In contrast, a SAT solver always generates a p-model if the program can comply with POM, but if there are multiple p-models, the SAT solver does not know which p-model is correct, and if there are no valid p-models, the SAT solver cannot generate one. Studying how LLMs and SAT solvers can interact to maximize their strengths and minimize their weaknesses is future work.

Input to the p-model builder includes the source code and output from the Clang compiler tool. First run Clang on the source code to generate an Abstract Syntax Tree (AST), then serialize the AST to a JavaScript Object Notation (JSON) file. Clang can also produce an intermediate representation (IR) file, which can be useful to both the POM builder and verifier. The builder can use the AST or IR to identify functions and other regions of text (such as classes or structs) to feed to the LLM, and the verifier can use the AST and IR to confirm that they comply with the p-model.

Using automated static analysis to build p-models is expensive and time-consuming to build, maintain, and debug. For example, determining the responsibility of a pointer inside a struct requires inspecting how the struct is used throughout the program. Where the original POM required manual completion of a p-model, today we use an LLM to help complete automated p-model generation. We hypothesize that an LLM may be able to correctly ascertain the responsibility of many pointers that static analysis alone may not resolve correctly and do it faster and more accurately than a human could (i.e., with a greater percentage of correct labels in the p-model). Manually creating or verifying a p-model is slow and impedes its use. For example, the specifications used by Frama-C’s library must be proofread by the user. A p-model that can be generated automatically does not have this impediment. We also hypothesize that an LLM may be better at discerning programmer intent than static analysis alone, especially if the code is defective or violates POM.

A risk of using LLMs is that they sometimes hallucinate, making wrong statements, often in confident language. However, since the verifier will assess the accuracy of a p-model, it will emit warnings on any p-model that the program does not comply with, therefore preventing any hallucinations from producing a “correct” p-model. We have started to test how successful the LLM is in filling out the p-model. Since a p-model can be completed manually, it is simple to “grade” the LLM, and the LLM’s performance is a major component of this research. As we continue to develop POM based on this design and then test it, we want to investigate which LLMs perform best and how to optimize LLM prompts to output good p-models.

The verifier’s job is to confirm that the program complies with the p-model provided to it, regardless of how the p-model was constructed. Thus, the verifier would be able to flag any LLM-generated hallucinations as non-compliance, and it would also catch human error if the p-model is generated manually.

In our initial plan, a p-model would be verified using static analysis. Once the program’s AST is serialized in JSON format, the verifier can ingest the AST to track and build an internal pointer model of the pointers in the program. A simple dictionary is used to map functions found in the AST to the function’s internal control flow, argument pointers, local variable pointers, and return type pointers. Given the tree structure of the AST JSON, each function definition will contain all necessary information to build the internal pointer model. Based on the AST node type, the internal pointer model will digest the AST node accordingly and update the associated function’s internal control flow if necessary. After the internal pointer model has been fully created from the digested AST JSON for each function, the p-model is compared to the end state of the internal pointer model after following the internal control flow. First, the p-model checks for the existence of all declared function argument pointers, local variable pointers, and the return pointer type. If there are any missing or extraneous pointers, the verifier will warn the user of the discrepancy. Afterward, the function argument pointers and return pointer type in the p-model are verified for correctness given the internal control flow. Once the function argument pointers and return pointer type in the p-model are verified, the local variable pointers are verified for correctness given the internal control flow. Any verification errors are reported back to the user as warnings. (Appendix B in our technical report provides the details of our planned implementation and includes a high-level flow diagram of verifying a p-model.)

However, our current verifier uses a SAT solver, which is cheaper and simpler. There is an older, incomplete verifier that checks if the AST complies with the POM. The SAT-solver verifier examines the LLVM IR generated from the source code. It runs our constraint generator and the SAT solver. Its input can include p-model files but does not need to.

The verifier output is designed to help developers quickly understand if the code is POM-compliant.

  • If the finding is SAT (it is POM-compliant), developers are provided with the validation details.
  • If the finding is UNSAT (it is not POM-compliant), developers are pointed to an UNSAT core (a subset of the original clauses that is sufficient to prove unsatisfiability) to help them understand the problem so they can fix it. Related output files provide traces to both the source code (with line numbers and variable names) and LLVM IR code (with line numbers and variable names).

If the constraints are satisfiable, the following files are generated:

  • solution.json: Assignment of true or false to each of the named variables appearing in constraints.txt.
  • solution.txt: Raw output from SAT solver, using numeric variable IDs.

If the constraints are unsatisfiable, the following files are generated:

  • proof.drat: A proof of unsatisfiability.
  • core.unsat: The subset of clauses from constraints.dimacs that are used in the above proof.
  • core.unsat.named: Same as core.unsat, except using descriptive variable names instead of numeric variable IDs.

Future Pointer Ownership Model Updates

Our POM formal model can be useful to prove partial temporal memory safety in C code. We are now developing code for automating p-code creation and validation as well as an automated testing framework to run experiments. Our design choice to use an LLM for p-model generation support is intended to lower manual effort and increase correctness, but it risks hallucinations. We expect that such hallucinations will cause the verifier to produce warnings about the program violating the p-model.

We are currently completing tests to help us understand the impact of using LLMs, SAT solvers, and other design choices. Our tests use temporal memory safety-relevant subsets of the Juliet C/C++ v1.3 test suite, plus on some additional open-source and project-created test code. Results of that testing will be published soon in presentations and a forthcoming technical report, plus eventually published in a conference paper (to be submitted soon). As new publications happen, we will update the POM collections webpage. POM is intended to help developers avoid, identify, and fix temporal memory-safety issues with its mental model, automated p-model generation, and automated verification. If successful at validating temporal safety, POM could improve the security and functionality of much of the huge amount of C code currently in use at a low cost (due to full automation) and with no performance reduction. Future work could investigate how out-of-scope pointers interact with responsible and irresponsible pointers in data structures such as doubly linked lists and reference-counted pointers and then possibly extend POM to include such data structures.

Using POM requires some training on what the model is and how to use the tooling we’ve developed to support it. Many software engineers are unfamiliar with SAT solvers, so in cases where the output says UNSATISFIABLE (UNSAT), without education and practice, even the subset of clauses resulting from DIMACS can be hard to understand at first. When the result is UNSAT, engineers need to then modify the code and/or p-models, and the DIMACS output helps them examine the clauses that can’t all be true and then make changes. We provide demos (with demo code to analyze and step by step instructions) and explanations in the README.mdand README.sat.solver.md in the code release, plus we enhance the DIMACS output to help users identify the associated code in the IR and source code. We would like to further improve the tooling to enable more automation and reduce the learning burden on engineers. Also, in the future we may develop a workshop and/or more training material, to help new users to quickly jump-start using and benefitting from POM.

Future work could increase C language coverage in POM by supporting the alloca() function, which would require modifying the C-path definition; supporting static pointers; and supporting tracing responsible or irresponsible pointers through integer casts or casts to any other non-pointer types. Another area of future work could extend POM’s temporal memory safety checks to also encompass spatial memory safety, including vulnerabilities such as buffer overflows and Heartbleed.

文章来源: https://www.sei.cmu.edu/blog/ai-powered-memory-safety-with-the-pointer-ownership-model/?utm_source=blog&utm_medium=rss&utm_campaign=my_site_updates
如有侵权请联系:admin#unsafe.sh