
Researchers Jailbreak GPT-5 in 24 Hours, Exposing AI Security Flaws

Sean Breeden · August 9, 2025

Two separate red team evaluations have exposed serious vulnerabilities in OpenAI’s newly released GPT-5, showing that its safeguards can be bypassed with ease. NeuralTrust successfully jailbroke the model within 24 hours using a “storytelling” method, guiding GPT-5 into providing illicit instructions without triggering its guardrails. This approach works by seeding a subtle, malicious context in an ongoing conversation and steering the AI step-by-step toward the target outcome while avoiding obvious prompts that would cause refusals. SPLX, using a different strategy, found the raw GPT-5 “nearly unusable” for enterprise use, noting that obfuscation attacks such as character-splitting and fake encryption challenges still reliably work.
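The character-splitting obfuscation SPLX describes works because naive filters scan the raw string and miss a banned term whose letters are separated by hyphens or spaces. A minimal defensive sketch (using a hypothetical denylist purely for illustration; real moderation pipelines are far more involved) is to normalize the input before checking it:

```python
import re

# Hypothetical denylist, for illustration only.
DENYLIST = {"exploit", "payload"}

def normalize(text: str) -> str:
    # Strip separators an attacker might insert between characters,
    # e.g. "e-x-p-l-o-i-t" or "e x p l o i t".
    return re.sub(r"[\s\-_.\*]+", "", text).lower()

def flags_denylist(text: str) -> bool:
    # Check both the raw and the normalized form: a filter that only
    # scans the raw string never sees the reassembled token.
    collapsed = normalize(text)
    return any(term in text.lower() or term in collapsed
               for term in DENYLIST)

print(flags_denylist("run the e-x-p-l-o-i-t now"))  # True
print(flags_denylist("hello world"))                # False
```

Note that collapsing whitespace can merge adjacent words and produce false positives, which is one reason simple denylist normalization is only a first layer, not a complete defense.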



The results raise concerns about GPT-5’s readiness for high-stakes environments. In SPLX’s tests, the model openly produced bomb-making instructions under certain conditioning prompts, and both teams concluded that GPT-5 remains far more vulnerable to multi-turn and obfuscation attacks than a properly hardened GPT-4o. The findings underscore a broader challenge for AI safety: filtering prompts in isolation is not enough when attackers can exploit the full conversation history. Enterprises looking to deploy GPT-5 in production should be aware of these weaknesses and apply additional security layers before rollout.
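The gap between per-prompt and conversation-level filtering can be sketched concretely. Below, `score_risk` is a placeholder for any moderation classifier returning a risk score in [0, 1] (it is an assumption, not a real API): the point is that screening each message alone can pass, while screening the concatenated transcript catches intent that only emerges across turns, as in the storytelling attack.

```python
from typing import Callable, Dict, List

def screen_conversation(
    messages: List[Dict[str, str]],
    score_risk: Callable[[str], float],  # placeholder classifier
    threshold: float = 0.5,
) -> bool:
    """Return True if the conversation passes screening."""
    # Per-message check: what a naive, isolated-prompt filter does.
    if any(score_risk(m["content"]) >= threshold for m in messages):
        return False
    # Whole-transcript check: evaluates the accumulated context,
    # not each turn in isolation.
    transcript = "\n".join(f'{m["role"]}: {m["content"]}'
                           for m in messages)
    return score_risk(transcript) < threshold
```

In practice `score_risk` would be a real moderation model; the structural point is simply that it must see the full history, not just the latest turn.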


Source: https://www.securityweek.com/red-teams-breach-gpt-5-with-ease-warn-its-nearly-unusable-for-enterprise/

About the Author

Sean Breeden is a Full Stack Developer specializing in Mage-OS, Shopify, Magento, PHP, Python, and AI/ML. With years of experience in e-commerce development, he helps businesses leverage technology to create exceptional digital experiences.