DeepSeek AI found to be stunningly vulnerable to jailbreaking

Researchers have pitted DeepSeek's R1 model against several harmful prompts and found it's particularly susceptible to jailbreaking.

Tech and Science Editor
TL;DR: DeepSeek's R1 model failed to block any of the harmful prompts tested, a 100% attack success rate that highlights significant safety and security shortcomings compared with established AI models.

When DeepSeek unveiled its R1 model, the AI industry reeled: the company claimed it had developed a model on par with OpenAI's most sophisticated model for a fraction of the cost.

Now that the model has been out for some time, security researchers have been testing it and comparing it against the competition. In one set of tests, researchers from the University of Pennsylvania and hardware conglomerate Cisco pitted DeepSeek's AI against "malicious" prompts, which are written to bypass the guidelines that stop users from acquiring knowledge on how to, for example, make a bomb, generate misinformation, or conduct cybercrime.

Bypassing the built-in restrictions of a device or AI model is typically called "jailbreaking," and in the case of DeepSeek's AI, the researchers found it "failed to block a single harmful prompt." The R1 model was pitted against "50 random prompts from the HarmBench dataset," and the researchers were surprised to achieve a "100 percent attack success rate." According to the researchers' blog post, the R1 results contrast starkly with those of other established AI models from OpenAI, Google, and Microsoft.
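
For readers unfamiliar with the metric, an attack success rate is the share of test prompts that a model failed to block. The short Python sketch here illustrates that arithmetic with the figures reported in the article (50 prompts, none blocked). The function name `attack_success_rate` and the list of outcomes are hypothetical illustrations, not the researchers' code, and the sketch assumes each prompt is scored simply as blocked or not blocked.

```python
def attack_success_rate(blocked: list[bool]) -> float:
    """Return the percentage of prompts that were NOT blocked.

    Assumption: each entry is True if the model blocked that prompt
    and False if the harmful prompt got through.
    """
    if not blocked:
        raise ValueError("need at least one prompt result")
    successes = sum(1 for was_blocked in blocked if not was_blocked)
    return 100.0 * successes / len(blocked)


# Hypothetical outcomes matching the reported result: 50 prompts, none blocked.
r1_results = [False] * 50
print(attack_success_rate(r1_results))  # 100.0
```

By the same arithmetic, a model that blocked 25 of the 50 prompts would score 50 percent, so a lower figure indicates stronger guardrails.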

"A hundred percent of the attacks succeeded, which tells you that there's a trade-off. Yes, it might have been cheaper to build something here, but the investment has perhaps not gone into thinking through what types of safety and security things you need to put inside of the model," said DJ Sampath, the VP of product, AI software and platform at Cisco, tells WIRED

"Every single method worked flawlessly. What's even more alarming is that these aren't novel 'zero-day' jailbreaks-many have been publicly known for years," said Alex Polyakov, the CEO of security firm Adversa AI, in an email to WIRED

News source: wired.com


Jak joined the TweakTown team in 2017 and has since reviewed hundreds of new tech products and kept us informed daily on the latest science, space, and artificial intelligence news. Jak's love for science, space, and technology, and more specifically PC gaming, began at 10 years old, on the day his dad showed him how to play Age of Empires on an old Compaq PC. Since that day, Jak has been in love with games and the progression of the technology industry in all its forms.
