There are significant security concerns in the deployment of leading large language models (LLMs), according to a study from U.K. AI Safety Institute (AISI). The takeaway: The built-in safeguards in five LLMs released by major labs are ineffective.
The AISI study used the institute’s open source evaluation framework, Inspect, to assess the anonymized models on compliance, correctness, and completion of responses. It examined several areas, including the potential for models to facilitate cyber-attacks and their ability to provide expert-level knowledge in chemistry and biology that could be misused.
The study also assessed LLMs’ capacity to operate autonomously in ways that might be difficult for humans to control, as well as the models’ vulnerability to “jailbreaks” where users attempt to bypass safeguards to elicit harmful outputs, such as illegal or toxic content.
“All tested LLMs remain highly vulnerable to basic jailbreaks, and some will provide harmful outputs even without dedicated attempts to circumvent their safeguards,” the report noted. “We found that models comply with harmful questions across multiple datasets under relatively simple attacks, even if they are less likely to do so in the absence of an attack.”
For Liav Caspi, CTO of Legit Security, the most concerning findings from AISI’s study were the overall understanding that an LLM model is not a safe tool and that it easily can be repurposed for malicious activities.
Most organizations want to innovate fast and be more productive, so they rush to use AI. But organizations are not sure what the security risks are or how to secure LLMs. “It is striking how easy it is to jailbreak and make a model bypass safeguards,” Caspi said.
Caspi said IT security leaders can effectively craft strategies and arguments for better security around LLMs by starting with basic hygiene. “Control the usage of AI, in the sense that you know who uses it and how,” he said. “This report is a great source to prove that using AI and LLMs, especially if offering them to consumers, pose risk to the business.”
After establishing visibility, organizations need to ensure there is a safe interaction with the AI and proper controls are in place.
“An organization that lets developers create code with AI models needs to ensure the model and service used does not compromise sensitive organization data, and that proper guardrails like human reviewer and static code analysis run on the auto-generated code,” Caspi said.
Many organizations are still not taking the security risks behind GenAI seriously, cautioned Gal Ringel, CEO at Mine, just as few took privacy risks seriously even after the passage of the GDPR in Europe and CCPA in California. “Some genuinely are failing to take action because they don’t know what to do, but LLMs are not digital playthings, so even this approach is risky,” he said.
Ringel noted that LLMs already have Google’s “influence and ubiquity .” Failure to account for security risks will lead to a more degraded internet.
Use awareness and oversight to ensure AI technologies are deployed securely. “Whether you’re training in a structured or unstructured environment, developers have a duty to oversee training, deploy both internal and external feedback loops, and go back to the drawing board if systems prove to be vulnerable to attacks and jailbreaks,” Ringel said.
Organizations must also ensure every staff member knows the possible dangers, how to best use LLMs, and not to rely on them for public-facing outputs, be that code, text, or images. “A vigilant organization is far less likely to fall prey to privacy or security harms than an oblivious one,” Ringel said.
Recent Articles By Author