Inside CVE-2024-4367: Arbitrary JavaScript Execution via PDF.js

2026-4-2 06:56:54 Author: infosecwriteups.com

Context

PDF.js, as the name suggests, is an open-source JavaScript library. It is primarily used for parsing and rendering PDFs in web browsers and desktop applications.

I have found this vulnerability in a number of applications over the past year. Although these applications serve completely different purposes, all of them interact with PDFs. The vulnerability affects versions prior to 4.2.67 and has a high CVSS score of 8.8, which in simple terms means that it can have a massive impact and is comparatively easy to exploit.

Why is a 2024 web application vulnerability still relevant in 2026?
The number of web applications on the public internet was already huge. Then came the big AI players, and vibe-coding has become so mainstream that everyone from junior to senior developers is riding the wave, writing prompts rather than code.

Not all web applications interact with PDFs, but the subset that does isn’t small. On npm’s website we can see over 12 million weekly downloads, and it is reasonable to guess that the number of applications still running an exploitable version is not small either.

Number of weekly downloads for PDF.js (at the time of writing)

The Problem

The main reason you might find this vulnerability in many more applications is that owners of web applications already using the library are not aware that their version is vulnerable, because no one has tested it on their end.

The rise of AI has also reduced developers’ familiarity with their own codebases. Quite a lot of developers are unaware of the libraries their application uses.

Additionally, tools like GitHub Copilot were initially trained on a large dataset of publicly available source code (including public repositories from GitHub). You can never be sure that AI-generated code is a safe option.

Another interesting point is that a number of libraries for certain frameworks use PDF.js internally, which can make an application indirectly vulnerable.
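
One way to spot such indirect copies is to look through the project's lockfile. The sketch below is only an illustration: it assumes the library is installed from npm under the package name pdfjs-dist and that the project has a package-lock.json in the lockfileVersion 2 or 3 layout. The helper name findPdfjsCopies and the sample data are made up for this example.

```javascript
// Hypothetical helper: list every copy of pdfjs-dist recorded in a parsed
// package-lock.json (lockfileVersion 2 or 3), including nested copies that
// were pulled in by other packages.
function findPdfjsCopies(lockfile) {
  const copies = [];
  for (const [path, entry] of Object.entries(lockfile.packages ?? {})) {
    // Keys look like "node_modules/pdfjs-dist" or
    // "node_modules/some-wrapper/node_modules/pdfjs-dist".
    if (
      path === "node_modules/pdfjs-dist" ||
      path.endsWith("/node_modules/pdfjs-dist")
    ) {
      copies.push({ path, version: entry.version });
    }
  }
  return copies;
}

// Made-up lockfile fragment, for illustration only.
const sampleLock = {
  lockfileVersion: 3,
  packages: {
    "": { name: "demo-app" },
    "node_modules/pdfjs-dist": { version: "4.2.67" },
    "node_modules/some-wrapper/node_modules/pdfjs-dist": { version: "3.11.174" },
  },
};

console.log(findPdfjsCopies(sampleLock));
```

In this made-up sample, the top-level copy is patched while the copy nested under the hypothetical some-wrapper package is not, which is the indirect case described above.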

Vulnerability Explainer

Coming to the most fun part, let’s talk about where the vulnerability lies.

In simple terms, before any PDF is rendered (displayed), a font definition is used internally to construct the font for whatever text is present in the PDF file. The problem lies in an edge case where this font definition is not validated for structure: whatever font definition is supplied is used directly, which allows JavaScript execution.

Going deeper, there is a matrix called FontMatrix that defines things like the size, position and rotation of characters. This matrix is a 6-element array. The assumption is that FontMatrix will always contain numbers, but since there is no validation, it also accepts non-numeric values.

For example, if you open a PDF in a text editor, you may find a FontMatrix that looks something like this (the values may differ):

/FontMatrix [.00048828125 0 0 -.00048828125 0 0]
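
To make the six numbers less abstract, here is a minimal sketch of how a matrix [a b c d e f] maps a point, using the standard 2D affine convention. The helper name applyMatrix and the sample point are chosen only for illustration and are not taken from PDF.js.

```javascript
// Apply a 6-element matrix [a, b, c, d, e, f] to a point (x, y):
//   x' = a*x + c*y + e
//   y' = b*x + d*y + f
function applyMatrix([a, b, c, d, e, f], [x, y]) {
  return [a * x + c * y + e, b * x + d * y + f];
}

// The matrix from the example above: scale by 1/2048 and flip the y axis.
const fontMatrix = [0.00048828125, 0, 0, -0.00048828125, 0, 0];

console.log(applyMatrix(fontMatrix, [2048, 1024])); // [1, -0.5]
```

Every element takes part in arithmetic, which is why the code consuming FontMatrix expects plain numbers.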

However, the vulnerable version of the library also accepts something like the following, where (test_string) is a PDF string rather than a number:

/FontMatrix [1 2 3 4 5 (test_string)]

The problem is that the implementation embeds these values directly into dynamically generated JavaScript code.

Creating and then executing the JS is a 2-step process:

getPathGenerator() -> This function is responsible for building a JS string that contains the values passed in FontMatrix, unmodified. In simpler terms, it creates a string containing the transform() call, which is executed in the next step.
Dynamic JavaScript Execution -> This step executes the generated JS code, including the transform() call with the provided parameters.

So, when the above FontMatrix is processed, it is passed to the transform function as:

c.transform(1,2,3,4,5,test_string);
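
The two steps can be sketched roughly as follows. This is a simplified stand-in written for this article, not the actual PDF.js source: buildSource, compile and the mock context are hypothetical names, and the real library does considerably more work per glyph.

```javascript
// Step 1 (simplified stand-in for the code generation step): turn a list of
// drawing commands into JavaScript source text. Arguments are joined into
// the text as-is, which is the root of the problem.
function buildSource(cmds) {
  return cmds
    .map(({ cmd, args }) => `c.${cmd}(${args.join(",")});\n`)
    .join("");
}

// Step 2: compile the text into a function of the drawing context `c`.
function compile(source) {
  return new Function("c", source);
}

// Mock drawing context that records calls instead of drawing anything.
const calls = [];
const mockContext = { transform: (...args) => calls.push(args) };

// With numbers only, the generated code behaves as intended.
const numericSource = buildSource([
  { cmd: "transform", args: [1, 2, 3, 4, 5, 6] },
]);
compile(numericSource)(mockContext);
console.log(calls); // [[1, 2, 3, 4, 5, 6]]

// A string element is pasted in without quotes, so it becomes code, not data.
const stringSource = buildSource([
  { cmd: "transform", args: [1, 2, 3, 4, 5, "test_string"] },
]);
console.log(stringSource); // c.transform(1,2,3,4,5,test_string);
```

The second source string is only printed here, not executed. Running it would make JavaScript look up an identifier named test_string instead of passing a string value.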

Now, if we know JS, we can manipulate this further. A simple example is to use the following string as the last element of FontMatrix (the outer parentheses delimit the PDF string and are not part of the injected code):

(0\); alert(1); //)

(0 -> ( opens the PDF string, and 0 fills the place of the 6th element in the array, which makes sure that we don’t break the actual syntax.

\) -> \ prevents ) from being treated as the string terminator, so ) is kept as data. In the generated JS, this ) closes the transform( call. This ensures that all the JS code we inject after it is preserved rather than truncated, and avoids syntax errors.

; -> ends the current statement, which is the function call with 6 parameters (the transform function here).

alert(1); -> becomes a separate statement that executes independently. The ; here makes sure that our injected code runs fine even if more code follows.

// -> comments out anything after it on the current line.

The above code becomes:

c.transform(1,2,3,4,5,0); alert(1); // anything_here_is_commented_out
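
The transformation from the PDF string to the generated line can be traced with a small sketch. The decoder below is a deliberately simplified, hypothetical helper (decodePdfLiteral): it handles only backslash escapes of single characters and balanced parentheses, which is enough for this payload, and it is not the parser PDF.js uses. The generated source is only printed, never executed.

```javascript
// Decode a PDF literal string such as "(0\); alert(1); //)".
// Simplified: a backslash keeps the next character as data, and unescaped
// parentheses are allowed as long as they are balanced.
function decodePdfLiteral(token) {
  if (token[0] !== "(") throw new Error("not a literal string");
  let depth = 0;
  let out = "";
  for (let i = 1; i < token.length; i++) {
    const ch = token[i];
    if (ch === "\\") {
      out += token[++i]; // escaped character is kept as data
    } else if (ch === "(") {
      depth++;
      out += ch;
    } else if (ch === ")") {
      if (depth === 0) return out; // unescaped, unbalanced ")" ends the string
      depth--;
      out += ch;
    } else {
      out += ch;
    }
  }
  throw new Error("unterminated literal string");
}

// The payload, written with the closing parenthesis of the PDF string.
const sixth = decodePdfLiteral(String.raw`(0\); alert(1); //)`);
console.log(sixth); // 0); alert(1); //

// The same unquoted join that the vulnerable code generation performs.
const source = `c.transform(${[1, 2, 3, 4, 5, sixth].join(",")});\n`;
console.log(source); // c.transform(1,2,3,4,5,0); alert(1); //);
```

The trailing ); that the generator appends ends up behind the // and is ignored, which is why the injected statement parses cleanly.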

Exploitation

This works because attacker-controlled input is embedded directly into executable JavaScript without validation or proper handling.

If you create a PDF containing such a payload and upload it to a vulnerable application, the JavaScript executes when the PDF is opened/rendered.

Example:

Here, our target was a college website that was accepting application forms.

We uploaded a dummy PDF file with the following FontMatrix:

/FontMatrix [ 1 2 3 4 5 (1\); alert\('origin: '+window.origin+', pdf url: '+(window.PDFViewerApplication?window.PDFViewerApplication.url:document.URL)) ]

The above payload displays both the origin and the URL of the loaded PDF document.

When we open the uploaded PDF file, the payload executes (screenshot in the original post).

Some of the critical attack scenarios:

This can be used to access sensitive data such as authentication tokens, which can lead to session hijacking.
Certain applications may allow attackers to perform actions on behalf of the user.
Data accessible within the application can be extracted and sent to an attacker’s server.

Patching

Upgrade to a patched version: Upgrading to version 4.2.67 or later will resolve the issue.
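
A quick way to check a given version string against the patched release is a plain numeric comparison. The sketch below is a simplified illustration: isOlderThan is a hypothetical helper, it only understands dotted numeric versions (no pre-release tags), and the sample versions are just test inputs. PDF.js builds expose their version string (for example as pdfjsLib.version), which could be fed into such a check.

```javascript
// Returns true when `version` is older than `patched`.
// Simplified: expects dotted numeric versions such as "4.2.67".
function isOlderThan(version, patched = "4.2.67") {
  const a = version.split(".").map(Number);
  const b = patched.split(".").map(Number);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) return x < y; // first differing component decides
  }
  return false; // identical versions are not older
}

console.log(isOlderThan("3.11.174")); // true
console.log(isOlderThan("4.2.67")); // false
console.log(isOlderThan("4.10.38")); // false
```

Comparing components as numbers matters here: a plain string comparison would wrongly rank "4.10.38" below "4.2.67".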

In case upgrading the library introduces breaking changes:

Disable dynamic code execution: Setting isEvalSupported to false prevents PDF.js from executing dynamically generated JavaScript.
Strict input validation for FontMatrix: Ensure FontMatrix values are strictly validated as numeric types before use.
Sandbox PDF rendering: Render PDFs in an isolated or sandboxed environment, for example an iframe with restrictions.
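
The first two mitigations can be sketched as below. This is a hedged illustration rather than the library's actual patch: sanitizeFontMatrix, hardenedLoadOptions and the fallback matrix are assumptions made for this example. The isEvalSupported option is the one named above, and the resulting object is meant to be passed to pdfjsLib.getDocument().

```javascript
// Assumed fallback used when the supplied matrix is rejected
// (the common 1/1000 glyph-space scale).
const DEFAULT_FONT_MATRIX = [0.001, 0, 0, 0.001, 0, 0];

// Accept only an array of exactly six finite numbers. Anything else,
// including strings, is replaced by the default matrix.
function sanitizeFontMatrix(value) {
  const ok =
    Array.isArray(value) &&
    value.length === 6 &&
    value.every((n) => typeof n === "number" && Number.isFinite(n));
  return ok ? value.slice() : DEFAULT_FONT_MATRIX.slice();
}

// Options object for pdfjsLib.getDocument() with eval-based code
// generation turned off.
function hardenedLoadOptions(url) {
  return { url, isEvalSupported: false };
}

console.log(sanitizeFontMatrix([1, 2, 3, 4, 5, "0); alert(1); //"]));
console.log(hardenedLoadOptions("/files/example.pdf"));
```

Falling back to a default instead of throwing keeps the document renderable. Whether a silent fallback or a hard failure is preferable depends on the application.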

Source: https://infosecwriteups.com/inside-cve-2024-4367-arbitrary-javascript-execution-via-pdf-js-abb4afff4141?source=rss----7b722bfd1b8d---4