By Alvin Crighton, Anusha Ghosh, Suha Hussain, Heidy Khlaaf, and Jim Miller
TL;DR: We identified 11 security vulnerabilities in YOLOv7, a popular computer vision framework, that could enable attacks including remote code execution (RCE), denial of service, and model differentials (where an attacker can trigger a model to perform differently in different contexts).
Open-source software provides the foundation of many widely used ML systems. However, these frameworks have often been developed rapidly, at the cost of secure and robust engineering practices. Moreover, these open-source models and frameworks are not specifically intended for critical applications, yet they are being adopted for such applications at scale through momentum or popularity. Few of these software projects have been rigorously reviewed, leading to latent risks and a growing, unidentified supply-chain attack surface that impacts the confidentiality, integrity, and availability of the model and its associated assets. For example, pickle files, which are used widely in the ML ecosystem, can be exploited to achieve arbitrary code execution.
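To make the pickle risk concrete, here is a minimal, self-contained sketch (our own illustration, not code from YOLOv7 or any other framework) of how a pickled "model file" can execute arbitrary commands the moment it is loaded:

```python
import os
import pickle

# Unpickling is code execution: __reduce__ lets an object specify a
# callable to invoke at load time -- here, os.system.
class MaliciousPayload:
    def __reduce__(self):
        return (os.system, ("echo arbitrary code execution",))

blob = pickle.dumps(MaliciousPayload())

# Simply loading the file runs the attacker's command; no model
# inference is required.
pickle.loads(blob)
```

Because `torch.load` is built on pickle, loading an untrusted model checkpoint carries exactly this risk.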
Given these risks, we decided to assess the security of a popular and well-established vision model: YOLOv7. This blog post shares and discusses the results of our review, which comprised a lightweight threat model and a secure code review, including our conclusion that the YOLOv7 codebase is not suitable for mission-critical applications or applications that require high availability. A link to the full public report is available here.
Disclaimer: YOLOv7 is a product of academic work. Academic prototypes are not intended to be production-ready, nor do they necessarily follow appropriate cyber hygiene, and our review is not intended as a criticism of the authors or their development choices. However, as with many ML prototypes, YOLOv7 has been adopted within production systems (e.g., it is promoted by Roboflow, and its repository has 3.5k forks). Our review is intended only to bring to light the risks of using such prototypes without further security scrutiny.
As part of our responsible disclosure policy, we contacted the authors of the YOLOv7 repository to make them aware of the issues we identified. We did not receive a response, but we propose concrete solutions and changes that would mitigate the identified security gaps.
You Only Look Once (YOLO) is a state-of-the-art, real-time object detection system whose combination of high accuracy and good performance has made it a popular choice for vision systems embedded in mission-critical applications such as robotics, autonomous vehicles, and manufacturing. YOLOv1 was initially developed in 2015; the latest version, YOLOv7, is the open-source revision of the YOLO codebase developed by Academia Sinica to implement its corresponding academic paper, which reports that YOLOv7 outperforms both transformer-based and convolutional object detectors (including YOLOv5).
The codebase has over 3k forks and allows users to provide their own pre-trained weight files, model architectures, and datasets to train custom models. Even though YOLOv7 is an academic project, YOLO is the de facto algorithm in object detection and is often used commercially and in mission-critical applications (e.g., by Roboflow).
Our review identified five high-severity and three medium-severity findings, which we attribute to insecure practices throughout the codebase, such as reliance on unsafe functions (e.g., eval) and the ingestion of unvalidated external files.
The table below summarizes our high-severity findings. Note that it is common practice to download datasets, model pickle files, and YAML configuration files from external sources, such as PyTorch Hub; to compromise these files on a target machine, an attacker could upload a malicious file to one of these public sources.
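One generic hardening step against this vector, shown here as our own illustrative sketch rather than a recommendation quoted from the report, is to verify a known-good digest (published out of band by the artifact's maintainers) before loading any downloaded file. The filename and digest below are placeholders:

```python
import hashlib
from pathlib import Path

# Placeholder digest; in practice, obtain the expected value out of
# band from a trusted publisher of the artifact.
EXPECTED_SHA256 = "0" * 64

def verify_artifact(path: Path, expected_sha256: str) -> None:
    # Hash the downloaded file and refuse to proceed on a mismatch.
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    if digest != expected_sha256:
        raise ValueError(f"{path}: integrity check failed ({digest})")

# "yolov7.pt" is a placeholder filename for a downloaded checkpoint.
verify_artifact(Path("yolov7.pt"), EXPECTED_SHA256)
```

A checksum does not make a malicious file safe, but it prevents a silently swapped or tampered download from being loaded at all.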
For our review, we first carried out a lightweight threat model to identify threat scenarios and the most critical components that would in turn inform our code review. Our approach draws from Mozilla’s “Rapid Risk Assessment” methodology and NIST’s guidance on data-centric threat modeling (NIST 800-154). We reviewed YOLO academic papers, the YOLOv7 codebase, and user documentation to identify all data types, data flow, trust zones (and their connections), and threat actors. These artifacts were then used to develop a comprehensive list of threat scenarios that document each of the possible threats and risks present in the system.
The threat model accounts for the ML pipeline’s unique architecture (relative to traditional software systems), which introduces novel threats and risks due to new attack surfaces within the ML lifecycle and pipeline such as data collection, model training, and model inference and deployment. Corresponding threats and failures can lead to the degradation of model performance, exploitation of the collection and processing of data, and manipulation of the resulting outputs. For example, downloading a dataset from an untrusted or insecure source can lead to dataset poisoning and model degradation.
Our threat model thus aims to examine ML-specific points of entry as well as to outline significant subcomponents of the YOLOv7 codebase. Based on our assessment of YOLOv7 artifacts, we constructed the following data flow diagram.
Note that this diagram and our threat model do not target a specific application or deployment environment. Our identified scenarios were tailored to bring focus to general ML threats that developers should consider before deploying YOLOv7 within their ecosystem. We identified a total of twelve threat scenarios pertaining to three primary threats: dataset compromise, host compromise, and YOLO process compromise (such as injecting malicious code into the YOLO system or one of its dependencies).
Next, we performed a secure code review of the YOLOv7 codebase, focusing on the most critical components identified in the threat model's threat scenarios. We used both manual and automated testing methods; our automated tooling included Trail of Bits' repository of custom Semgrep rules, which target the misuse of ML frameworks such as PyTorch and which identified one security issue and several code quality issues in the YOLOv7 codebase. We also used the TorchScript automatic trace checking tool to automatically detect potential errors in traced models. Finally, we ran the publicly available CodeQL queries for Python across the codebase and identified multiple code quality issues.
In total, our code review resulted in the discovery of twelve security issues, five of which are high severity. The review also uncovered twelve code quality findings that serve as recommendations for enhancing the quality and readability of the codebase and preventing the introduction of future vulnerabilities.
All of these findings are indicative of a system that was not written or designed with a defensive lens. Below, we highlight some of our high-severity findings and discuss their repercussions for ML-based systems.
Our most notable finding concerns the insecure parsing of YAML files, which could result in RCE. Like many ML systems, YOLO uses YAML files to specify the architecture of models. Unfortunately, the YAML parsing function, parse_model, parses the file by calling eval on its unvalidated contents.
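The vulnerable pattern, condensed here from parse_model in models/yolo.py (most of the function's bookkeeping is omitted), looks like this:

```python
# Condensed sketch of the vulnerable pattern in parse_model
# (models/yolo.py); surrounding logic omitted.
def parse_model(d, ch):
    for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']):
        # The module name comes straight from the YAML file and is
        # passed to eval() without any validation.
        m = eval(m) if isinstance(m, str) else m
        # Module arguments from the YAML file are eval'd as well.
        for j, a in enumerate(args):
            try:
                args[j] = eval(a) if isinstance(a, str) else a
            except Exception:
                pass
```

Because both the module name and its arguments are evaluated verbatim, a YAML entry whose module field contains an expression such as `__import__('os').system(...)` executes as soon as the model is constructed.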
If an attacker can manipulate one of the YAML files used by a target user, they can inject malicious code that will be executed during parsing. This is particularly concerning because these YAML files are often obtained from third-party websites that host them alongside other model files and datasets, and a sophisticated attacker could compromise one of these third-party services or hosted assets. The issue can be caught by inspecting the YAML files, but only if the inspection is careful and routine.
Given the potential severity of this finding, we proposed an alternative implementation as a mitigation: remove the need for the parse_model function altogether by rewriting the architectures defined in the config files as distinct block classes that call standard PyTorch modules. Chief among the benefits of this rewrite is that it eliminates the eval call entirely; a minimal sketch of the direction appears below.
Our proposed fix can be tracked here.
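To illustrate the general direction (our own minimal sketch, not the actual patch), the string-to-module mapping can be made explicit, so that a configuration file can only select from known PyTorch building blocks:

```python
import torch.nn as nn

# Explicit allowlist of layer types that a config file may reference,
# replacing eval() of arbitrary strings. The specific entries here are
# illustrative.
SUPPORTED_MODULES: dict[str, type[nn.Module]] = {
    "Conv2d": nn.Conv2d,
    "BatchNorm2d": nn.BatchNorm2d,
    "SiLU": nn.SiLU,
}

def resolve_module(name: str) -> type[nn.Module]:
    # Unrecognized names fail loudly instead of being evaluated.
    try:
        return SUPPORTED_MODULES[name]
    except KeyError:
        raise ValueError(f"unsupported module in config: {name!r}") from None

# A config entry like "Conv2d" maps to a known class or raises an error.
conv = resolve_module("Conv2d")(3, 64, kernel_size=3)
```

A lookup table of this kind fails closed: a malicious or malformed module name raises an error rather than executing as arbitrary Python.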
As previously noted, ML frameworks are giving rise to novel attack avenues that target the confidentiality, integrity, and availability of the model and its associated assets. Highlights from the ML-specific issues that we uncovered during our security assessment, along with specific recommendations and mitigations, are outlined in our report.
Beyond the issues identified in the code review, a series of design and operational changes is needed to ensure a sufficient security posture; the full list of strategic recommendations is provided in our report.
Although the identified security gaps may be acceptable for academic prototypes, we do not recommend using YOLOv7 within mission-critical applications or domains, despite existing use cases. For those already using and deploying YOLOv7, we strongly recommend disallowing end users from providing datasets, model files, configuration files, or any other type of external input until the recommended changes are made to the design and maintenance of YOLOv7.
As part of the disclosure process, we reported the vulnerabilities to the YOLOv7 maintainers first. Despite multiple attempts, we were unable to establish contact with the maintainers to coordinate fixes for these vulnerabilities. As a result, at the time of this blog post's release, the identified issues remain unfixed. As mentioned above, we have proposed a fix for one of the issues, which is being tracked here. The disclosure timeline is provided below: