Python Wheel (Zip) Parser Differential Vulnerability v2.0
2026-1-21 23:59:46 Author: github.com

Summary

It is still possible (albeit with significantly more effort) to upload a specially crafted Wheel file (i.e. a zip) to PyPI that installs a package behaving one way when installed with PIP (or another tool based on Python's zipfile), and a package behaving another way when installed by a non-Python tool (particularly uv).

This vulnerability continues to exist because the zip specification is incredibly complex and sufficiently ambiguous that it is difficult to make any two zip parsers agree closely enough to prevent differential-based attacks.

As before, the security implications of this capability are concerning, as:
A benign payload can be delivered specifically to Python users, while uv users receive a malicious payload.
A malicious payload can be served to Python users, while security companies and analyzers built on non-Python parsers may see a benign payload.

After previous reports, both PyPI and UV (CVE-2025-54368) made large gains in protecting the Python ecosystem from this vulnerability.

However, after auditing the changes, multiple bugs and behaviour differences remain in UV, PyPI's validation and the Python zipfile library that still allow for Wheel differential attacks.

Severity

Moderate - This vulnerability can be leveraged to hide malicious payloads that evade detection.

Proof of Concept

Using PIP to install the package:

$ pip install cbwheeldiff2
Collecting cbwheeldiff2
  Downloading cbwheeldiff2-0.0.1-py2.py3-none-any.whl.metadata (148 bytes)
Downloading cbwheeldiff2-0.0.1-py2.py3-none-any.whl (1.4 kB)
Installing collected packages: cbwheeldiff2
Successfully installed cbwheeldiff2-0.0.1

$ python3
Python 3.12.3 (main, May 26 2025, 18:50:19) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cbwheeldiff2
I was installed using Python's zipfile.
>>>

Using UV to install the package:

$ uv pip install cbwheeldiff2
Using Python 3.12.3 environment at: env
Resolved 1 package in 424ms
Prepared 1 package in 201ms
Installed 1 package in 1ms
 + cbwheeldiff2==0.0.1

$ python3
Python 3.12.3 (main, May 26 2025, 18:50:19) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cbwheeldiff2
I was installed with UV. It's so fast!!
>>>

In essence, the package cbwheeldiff2 can be viewed as two interleaved zipfiles, specifically designed to pass the stricter validation introduced earlier in 2025, and to still extract cleanly with UV and PIP (and other Python libraries).

Below is an overview of how each implementation sees the zip, and how the records are interleaved.

POS | UV | Python zipfile | PyPI validation
--- | --- | --- | ---
1 | LF: "dist/METADATA" | LF: "dist/METADATA" | LF: "dist/METADATA"
2 | METADATA content | METADATA content | Skip content
3 | LF: "dist/WHEEL" | LF: "dist/WHEEL" | LF: "dist/WHEEL"
4 | WHEEL content | WHEEL content | Skip content
5 | LF: "./dist/RECORD", w/Descriptor | LF: "./dist/RECORD", w/Descriptor | LF: "./dist/RECORD", w/Descriptor
6 | RECORD content (deflated) | RECORD content (deflated) + trailing junk (uses CD entry for sizes) | Skip content based on compress_size in CD entry from zipfile.
7 | DD | - | -
8 | LF: "pkg/__init__.py" + Unknown Extras | - | -
9 | - | DD | DD
10 | - | LF: "pkg/fix/" - directory | LF: "pkg/fix/"
11 | - | LF: "dist/RECORD" | LF: "dist/RECORD"
12 | - | RECORD content (deflated) | Skip content
13 | - | LF: "pkg/__init__.py" | LF: "pkg/__init__.py"
14 | - | __init__.py content (deflated) + trailing junk | Skip content
15 | __init__.py content | - | -
16 | CD: "dist/METADATA" | CD: "dist/METADATA" | CD: "dist/METADATA"
17 | CD: "dist/WHEEL" | CD: "dist/WHEEL" | CD: "dist/WHEEL"
18 | CD: "./dist/RECORD", DD present, compress_size ignored | CD: "./dist/RECORD", compress_size = POS 6 - 8 | CD: "./dist/RECORD"
19 | CD: \0 present - skipped | CD: "dist/RECORD" | CD: "dist/RECORD" - used to validate directory entries
20 | CD: \0 present - skipped | CD: "pkg/__init__.py" + COMMENT | CD: "pkg/__init__.py" + COMMENT
21 | CD: "pkg/__init__.py" + Unknown Extras | - | -
22 | - | CD: "pkg/fix/" - directory | CD: "pkg/fix/"
23 | EOCD: offset + num entries checked | EOCD: offset read | EOCD: exists

After extraction, uv and pip will have different pkg/__init__.py and different dist/RECORD.

Further Analysis

The following are all the bugs identified. They are roughly ordered by severity, from most to least severe.

UV Bug: CD comments are not handled

Central Directory entries can include per-entry comments.

UV's implementation, while it parses the filename and extras, fails to handle the comment field.

This allows zip content to be made visible to UV while remaining hidden from Python.

The PoC above uses this technique to hide the UV specific entry for "pkg/__init__.py" from Python.
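
The effect of ignoring the per-entry comment length can be illustrated with a minimal sketch. This is not uv's code: `build_zip` and `walk_central_directory` are hypothetical helpers written for this example, and the archive is a harmless two-entry zip produced with Python's zipfile. The walker either honours the comment length field of each Central Directory entry or, like the behaviour described above, steps over only the filename and extras.

```python
import io
import struct
import zipfile

CD_SIG = b"PK\x01\x02"
CD_FIXED = struct.Struct("<4s6H3L5H2L")  # 46-byte fixed part of a CD entry
EOCD = struct.Struct("<4s4H2LH")         # 22-byte End of Central Directory


def build_zip() -> bytes:
    """Build a two-entry archive whose first entry carries a per-entry comment."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        first = zipfile.ZipInfo("a.txt")
        first.comment = b"per-entry comment"
        zf.writestr(first, b"A")
        zf.writestr("b.txt", b"B")
    return buf.getvalue()


def walk_central_directory(data: bytes, honor_comments: bool) -> list:
    """Return CD filenames, stopping if the next record lacks a CD signature."""
    # No archive comment is written above, so the EOCD is the last 22 bytes.
    _, _, _, _, total, _, cd_offset, _ = EOCD.unpack(data[-EOCD.size:])
    pos, names = cd_offset, []
    for _ in range(total):
        if pos + CD_FIXED.size > len(data):
            break
        fields = CD_FIXED.unpack_from(data, pos)
        if fields[0] != CD_SIG:
            break  # the walker is now desynchronized from the real records
        name_len, extra_len, comment_len = fields[10], fields[11], fields[12]
        start = pos + CD_FIXED.size
        names.append(data[start:start + name_len].decode("utf-8"))
        pos = start + name_len + extra_len
        if honor_comments:
            pos += comment_len  # the step a comment-unaware parser omits
    return names


archive = build_zip()
print(walk_central_directory(archive, honor_comments=True))   # ['a.txt', 'b.txt']
print(walk_central_directory(archive, honor_comments=False))  # ['a.txt']
```

In this benign archive the comment-unaware walker simply loses track of the second entry. If the comment bytes were themselves shaped like a Central Directory entry, the same walker would parse them as a real record, which is the kind of hiding place the advisory describes.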

Differential: nulls in filenames

In Python's zipfile library, filenames that contain a null byte "\0" are truncated at the first null byte.

In the uv-extract crate, if a Central Directory entry contains a null byte the processing for that entry is skipped entirely. If a Local File entry contains a null byte a warning is produced.

Finally, PyPI does not do any handling of null bytes in filenames, except comparing them to the Central Directory as parsed by the zipfile library.

The PoC above uses this technique to make UV skip the Python specific entries for "pkg/__init__.py" and "dist/RECORD".
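
The truncation on the Python side can be observed directly on `zipfile.ZipInfo`. The sketch below contrasts it with a "skip the entry" policy; `skip_if_null` is a hypothetical stand-in for the behaviour described for uv-extract, not its actual code, and the example filename is made up.

```python
import zipfile

raw_name = "pkg/__init__.py\x00.hidden"

# Python's zipfile keeps the raw name in orig_filename but exposes a
# filename that stops at the first null byte.
info = zipfile.ZipInfo(raw_name)
truncated_name = info.filename
original_name = info.orig_filename


def skip_if_null(name: str):
    """Hypothetical policy: ignore any entry whose name contains a null byte."""
    return None if "\x00" in name else name


print(repr(truncated_name))          # name as Python would extract it
print(repr(skip_if_null(raw_name)))  # None: the entry is not extracted at all
```

Two parsers applying these two policies to the same entry disagree about whether the file exists, which is all a differential needs.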

PyPI Bug: Path traversal overwrite

PyPI's validation will ensure that multiple entries with the same filename cannot exist. However, there are effectively an infinite number of ways to specify a path to the same file.

So "file.txt", "./file.txt", "././file.txt", "a/../file.txt", etc. all effectively point to the same place in the filesystem.

Since PyPI's validation still allows for these variants to exist it is possible to overwrite a previously written file with different content.

UV's implementation protects against this issue by ensuring that the bytes on disk always match the bytes being written.

The PoC above uses this technique to overwrite UV's "./dist/RECORD" with the Python version.
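
A duplicate check that compares raw names misses these aliases, while a check on normalized paths catches them. The following is a minimal sketch of that idea, assuming purely lexical normalization with `posixpath.normpath`; `find_alias_collisions` is a hypothetical helper and is not PyPI's validation code.

```python
import posixpath


def find_alias_collisions(names):
    """Return (first_seen, alias) pairs whose normalized paths are identical."""
    seen = {}
    collisions = []
    for name in names:
        key = posixpath.normpath(name)  # lexical only: collapses "./" and "a/../"
        if key in seen:
            collisions.append((seen[key], name))
        else:
            seen[key] = name
    return collisions


names = ["dist/RECORD", "./dist/RECORD", "pkg/__init__.py", "a/../pkg/__init__.py"]

# All four raw names are distinct, so a raw-name duplicate check passes.
print(len(set(names)) == len(names))
# Normalizing first reveals two pairs that land on the same file.
print(find_alias_collisions(names))
```

A validator could reasonably reject any name that changes under normalization instead of merely detecting collisions; that stricter choice is a design suggestion, not something the advisory states.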

Differential: data descriptors

In UV the data descriptor is read after the file content reaches EOF, regardless of any reported compressed size. Furthermore, if a data descriptor is present the compressed size of the Local File entry and the Central Directory are unchecked. This allows an arbitrary compressed size to be set in the Central Directory.

The PyPI validator falls back to using the compressed size reported by the Python zipfile implementation which is read from the Central Directory.

This difference in behaviour means that the processing of the Local File entries can be desynchronized.

The PoC above uses this capability to add additional Local File entries to the zip that are seen by Python, but are ignored by UV.
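
Why the size has to come from somewhere other than the Local File header can be seen with a small, benign archive. In this sketch, `WriteOnlyStream` is a hypothetical sink without seek() or tell(), which makes Python's zipfile emit a data descriptor; the Local File header then carries placeholder sizes, and the real compressed size appears only in the data descriptor and the Central Directory.

```python
import io
import struct
import zipfile

LFH = struct.Struct("<4s5H3L2H")  # 30-byte fixed part of a Local File header
FLAG_DATA_DESCRIPTOR = 0x0008     # general purpose flag bit 3


class WriteOnlyStream:
    """Minimal sink without seek()/tell(), so zipfile writes data descriptors."""

    def __init__(self):
        self.data = bytearray()

    def write(self, chunk):
        self.data += chunk
        return len(chunk)

    def flush(self):
        pass


sink = WriteOnlyStream()
with zipfile.ZipFile(sink, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("dist/RECORD", b"hello")
archive = bytes(sink.data)

# Fields: signature, version, flags, method, time, date, crc, csize, usize, ...
(_sig, _ver, flags, _method, _time, _date,
 _crc, lf_csize, _usize, _nlen, _elen) = LFH.unpack_from(archive, 0)

# zipfile itself reads sizes from the Central Directory.
cd_csize = zipfile.ZipFile(io.BytesIO(archive)).getinfo("dist/RECORD").compress_size

print(bool(flags & FLAG_DATA_DESCRIPTOR))  # data descriptor flag is set
print(lf_csize, cd_csize)                  # header placeholder vs. CD value
```

A streaming parser that inflates until end of stream and then reads the descriptor never needs the Central Directory value, while a parser that skips content by the Central Directory value trusts it completely. If the two values disagree, the two parsers resume at different offsets.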

Differential: deflate compress size

PyPI does not check the amount of data read when inflating a compressed file entry. Any content that is after the last block of the deflated file content is ignored.

UV is strict, and ensures that the reported compressed size of the file entry matches the number of bytes read during the inflation of the file content.

The PoC above uses this ability to hide content after a deflated file to ensure that when Python is reading the zip file, it ignores any data intended for UV that would otherwise confuse the parser.
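
The lenient and strict behaviours can be contrasted with `zlib` alone. This sketch assumes a raw deflate stream followed by extra bytes inside the declared compressed size; `raw_deflate` and `inflate` are hypothetical helpers, not code from PyPI or UV.

```python
import zlib


def raw_deflate(data: bytes) -> bytes:
    """Compress as a raw deflate stream (negative wbits), as stored in zip entries."""
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)
    return comp.compress(data) + comp.flush()


def inflate(blob: bytes, declared_size: int, strict: bool) -> bytes:
    """Inflate declared_size bytes; in strict mode reject unconsumed trailing bytes."""
    decomp = zlib.decompressobj(-15)
    out = decomp.decompress(blob[:declared_size]) + decomp.flush()
    consumed = declared_size - len(decomp.unused_data)
    if strict and consumed != declared_size:
        raise ValueError(
            f"deflate stream ended after {consumed} of {declared_size} bytes"
        )
    return out


payload = b"print('hello')\n"
stream = raw_deflate(payload)
blob = stream + b"TRAILING JUNK"

print(inflate(blob, len(blob), strict=False) == payload)   # lenient: junk ignored
print(inflate(stream, len(stream), strict=True) == payload)  # exact size passes
try:
    inflate(blob, len(blob), strict=True)
except ValueError as exc:
    print("strict reader rejects:", exc)
```

The bytes left in `unused_data` are exactly the region a lenient reader never looks at, so they can carry records intended for a different parser.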

Differential: Filename extra handling

UV and the Python zipfile implementation both support the "InfoZip Unicode Path Extra Field" (ID: 0x7075) when specifying the name of the file.

However, the use of this data is inconsistently applied.

The PyPI zip validation ignores this extra field almost entirely. The only time it matters is when validating the compressed size reported by the Local File entry with the Central Directory.

UV appears to exclusively use the extra field when it is present.

The Python zipfile implementation uses the original filename when reading the Local File entry, to ensure that it matches the original filename specified in the Central Directory entry.

This difference in behaviour means that the fields can be mixed about. At the simplest level, two files can have their filenames swapped.

The PoC above uses this difference in behaviour, in conjunction with the null byte behaviour above, to make UV skip the Python specific entries for "pkg/__init__.py" and "dist/RECORD".
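
For reference, the extra field in question has a simple layout: a 2-byte ID (0x7075), a 2-byte size, a 1-byte version, a 4-byte CRC32 of the header filename, and the UTF-8 name. The sketch below builds and parses one; `build_unicode_path_extra` and `unicode_path_from_extra` are hypothetical helpers, and the choice to honour the field only when the CRC matches is an assumption of this example, not a statement about what UV or PyPI do.

```python
import struct
import zlib

UNICODE_PATH_ID = 0x7075


def build_unicode_path_extra(header_name: bytes, unicode_name: str) -> bytes:
    """Encode an Info-ZIP Unicode Path extra field for the given header name."""
    body = struct.pack("<BL", 1, zlib.crc32(header_name)) + unicode_name.encode("utf-8")
    return struct.pack("<HH", UNICODE_PATH_ID, len(body)) + body


def unicode_path_from_extra(extra: bytes, header_name: bytes):
    """Return the Unicode path if present and its CRC matches header_name."""
    pos = 0
    while pos + 4 <= len(extra):
        field_id, size = struct.unpack_from("<HH", extra, pos)
        body = extra[pos + 4:pos + 4 + size]
        if field_id == UNICODE_PATH_ID and len(body) >= 5:
            version, name_crc = struct.unpack_from("<BL", body)
            if version == 1 and name_crc == zlib.crc32(header_name):
                return body[5:].decode("utf-8")
        pos += 4 + size
    return None


header_name = b"dist/RECORD"
extra = build_unicode_path_extra(header_name, "pkg/__init__.py")

# A parser that ignores the extra field uses the header name as-is.
name_ignoring_extra = header_name.decode("utf-8")
# A parser that honours the extra field uses the override instead.
name_honouring_extra = unicode_path_from_extra(extra, header_name) or name_ignoring_extra

print(name_ignoring_extra)
print(name_honouring_extra)
```

The same entry therefore has two plausible names, and which one wins depends entirely on the parser, which is what lets filenames be swapped between implementations.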

Timeline

Date reported: 10/02/2025
Date fixed:
Date disclosed: 1/22/2026


Source: https://github.com/google/security-research/security/advisories/GHSA-w97x-xxj5-gpjx