How Good Is GPT-5.5 for Cybersecurity?
Written by
Brandon Veiseh
GPT-5.5 is good at cybersecurity. Good enough that how you respond to it will matter more than how impressed you are by it.
GPT-5.5 is the first OpenAI model that the company itself classifies as having "High" cybersecurity capability under its Preparedness Framework. That designation matters. It means the model is capable enough at offensive security work that OpenAI is now shipping purpose-built safeguards, a cyber-permissive variant (GPT-5.5-Cyber), and a trust-based access program just to manage how defenders and attackers can use it. So how good is it actually? We pulled the benchmarks, the third-party red team results, and the real-world pentesting data to find out.
Update, May 7, 2026: Since this article was first published, OpenAI has rolled out GPT-5.5-Cyber in limited preview to vetted defenders responsible for securing critical infrastructure, announced by OpenAI and reported by CNBC. The UK AI Security Institute has also published its formal evaluation of GPT-5.5, placing it at near-parity with Anthropic's Claude Mythos Preview on the hardest tier of its 95-task cyber benchmark. We've added a new section below covering GPT-5.5-Cyber.
What cybersecurity capabilities does GPT-5.5 actually have?
GPT-5.5 launched in April 2026 as OpenAI's most capable general-purpose model, with particular gains in agentic work, long-horizon tool use, and domain-specific reasoning. According to OpenAI's GPT-5.5 announcement, the model "understands the task earlier, asks for less guidance, uses tools more effectively, checks its work and keeps going until it's done," which happens to describe exactly what an offensive security agent needs to do when probing an application.
The capability jump over prior models is measurable. In OpenAI's official system card, GPT-5.5 outperforms every previous GPT model on both CTF challenges and vulnerability discovery benchmarks. The US Center for AI Standards and Innovation (CAISI) observed "a marginal increase in capabilities relative to GPT-5.3-codex on cyber tasks including vulnerability discovery, exploitation, and cyber target selection," while the UK AI Security Institute concluded GPT-5.5 is "the strongest performing model overall on their narrow cyber tasks."
Is GPT-5.5 better than previous models at finding real vulnerabilities?
Benchmarks are one thing. Live applications are another. The most credible external data comes from XBOW's evaluation of GPT-5.5, which runs models against open-source applications frozen at known-vulnerable versions and measures miss rate, the percentage of real CVEs the model fails to find. The progression is striking: GPT-5 missed 40% of vulnerabilities, Claude Opus 4.6 brought that down to 18%, and GPT-5.5 hit just 10%. That's a step change, not an incremental update.
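To make the metric concrete, here is a minimal sketch of how a miss rate could be computed from the set of known CVEs in a frozen application and the set a model actually found. The function name and the CVE identifiers are placeholders for illustration, not XBOW's tooling or data.

```python
def miss_rate(known_cves: set[str], found_cves: set[str]) -> float:
    """Fraction of known vulnerabilities that the model failed to find."""
    if not known_cves:
        raise ValueError("known_cves must not be empty")
    # Findings outside the known set are ignored; only misses count here.
    missed = known_cves - found_cves
    return len(missed) / len(known_cves)


# Hypothetical example: 10 seeded CVEs, the model finds 9 of them.
known = {f"CVE-EXAMPLE-{i:04d}" for i in range(10)}
found = {f"CVE-EXAMPLE-{i:04d}" for i in range(9)}
example_rate = miss_rate(known, found)
print(f"miss rate: {example_rate:.0%}")  # prints "miss rate: 10%"
```

By the same arithmetic, a drop from a 40% miss rate to 10% is a 75% relative reduction in misses, which is why the change reads as a step rather than a nudge.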
It's also consistent with what XBOW documented for the previous GPT-5 release, where scaffolding the model inside an autonomous agent framework more than doubled performance versus running it in isolation. The lesson for defenders is that raw model benchmarks consistently underestimate what happens when these models are wrapped in real pentesting agents. That's the architecture MindFort's platform is built around.
Can GPT-5.5 develop zero-day exploits?
This is the question OpenAI spent the most pages addressing in the system card, and the answer is "almost, but not quite." According to the GPT-5.5 System Card, on the VulnLMP evaluation, which is designed to test end-to-end exploit chain development against real-world targets, "GPT-5.5 did not independently produce a functional full chain exploit or another verifier-confirmed Critical-level outcome." The bottleneck wasn't search breadth but "exploit development judgment: deciding which leads merited deep investment, converting crashes into controlled primitives, and ruling out diagnostic or availability-only bugs."
OpenAI classifies this as "High cybersecurity capability" but not "Critical" under their Preparedness Framework. Practically, that means GPT-5.5 can meaningfully assist skilled vulnerability researchers but cannot yet fully replace them for novel zero-day development. It's a capable junior researcher, not an autonomous exploit developer, at least not yet.
What safeguards has OpenAI added, and how do they affect security teams?
Because GPT-5.5 crossed OpenAI's "High" threshold, the release came with the strongest safeguards of any GPT launch to date. OpenAI describes training the model to refuse clearly malicious requests, plus "automated classifier-based monitors [that] detect signals of suspicious cyber activity and route high-risk traffic to a less cyber-capable model."
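The general shape of classifier-based routing can be sketched in a few lines. This is an illustrative toy, not OpenAI's implementation: the model identifiers, the threshold, and the keyword-based scorer are all assumptions, and a production monitor would use a trained classifier rather than string matching.

```python
from dataclasses import dataclass

# Hypothetical model identifiers and threshold, for illustration only.
PRIMARY_MODEL = "primary-model"
FALLBACK_MODEL = "less-cyber-capable-model"
RISK_THRESHOLD = 0.8


@dataclass
class RoutingDecision:
    model: str
    risk_score: float


def score_cyber_risk(prompt: str) -> float:
    """Stand-in for a trained classifier; returns a score in [0, 1]."""
    suspicious_terms = ("ransomware", "evade detection", "steal credentials")
    hits = sum(term in prompt.lower() for term in suspicious_terms)
    return min(1.0, hits / 2)


def route(prompt: str) -> RoutingDecision:
    """Send high-risk traffic to the less capable model, everything else to the primary."""
    score = score_cyber_risk(prompt)
    model = FALLBACK_MODEL if score >= RISK_THRESHOLD else PRIMARY_MODEL
    return RoutingDecision(model=model, risk_score=score)
```

The design point worth noticing is that the request is not rejected outright. It is served by a weaker model, which is why legitimate users may see degraded answers rather than an explicit refusal.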
For legitimate security teams, this creates friction. Exploit reproduction in a lab, payload crafting for sanctioned engagements, and adversary emulation, all of which worked on earlier models, are now more likely to hit refusals. OpenAI's answer is the Trusted Access for Cyber program, which vets defenders and grants access to cyber-permissive variants tuned for binary reverse engineering, vulnerability research, and other advanced defensive workflows. The Hacker News reports that the program has already helped fix more than 3,000 vulnerabilities and is being expanded to thousands of individual defenders.
What is GPT-5.5-Cyber, and what can it do?
On May 7, 2026, OpenAI announced GPT-5.5-Cyber in limited preview, rolling it out through the highest tier of the Trusted Access for Cyber program to vetted defenders responsible for securing critical infrastructure. It supersedes GPT-5.4-Cyber as the cyber-permissive variant available to TAC partners.
The most important thing to understand about GPT-5.5-Cyber is what it isn't. OpenAI is explicit that the preview "is not intended to significantly increase cyber capability beyond GPT-5.5. It's primarily trained to be more permissive on security-related tasks." The lift is in what it will do, not what it can do. Defenders who previously hit refusals from the public GPT-5.5 on legitimate security work, like payload crafting for sanctioned engagements, exploit reproduction in lab environments, or binary reverse engineering of suspect samples, should see fewer refusals on GPT-5.5-Cyber.
What the model is approved for, per OpenAI:
- Vulnerability identification and triage
- Malware analysis
- Binary reverse engineering of compiled software for threat assessment
- Detection engineering
- Patch validation
- Proof-of-concept generation against authorized targets
What remains hard-blocked, even in the cyber-permissive variant:
- Credential theft
- Persistence mechanisms
- Malware deployment
- Exploitation of unauthorized third-party systems
- Stealth techniques designed to evade defensive monitoring
Access is gated, and meaningfully so. According to Axios, TAC has scaled to thousands of verified individual defenders and hundreds of teams, but only the highest tier gets GPT-5.5-Cyber. Defenders accepted into the program see fewer classifier-based refusals and are required to use phishing-resistant account security protections.
The release lands roughly a month after Anthropic's Claude Mythos Preview and its Project Glasswing program, which currently includes around 40 vetted organizations. The two labs have settled into different access philosophies, with OpenAI casting a wider net through TAC and Anthropic running a tighter consortium, but the operational reality for most security teams is the same: the most capable cyber model you can get your hands on is the one your organization gets approved for. Independent testing covered by Axios suggests the two models are closely matched: sources told reporters they are "roughly on par," and one recent test put Mythos narrowly ahead.
Where does GPT-5.5 fall short for security work?
Three places. First, as noted in the system card, judgment: deciding which of dozens of candidate bugs is actually exploitable, and converting a crash into a weaponized primitive. Second, business logic: general-purpose models still struggle with the application-specific reasoning that distinguishes "user A viewing user B's data" from ordinary behavior, a gap we've written about in our buyer's guide. Third, consistency: metrics like pass@1 measured over multiple rollouts show that even GPT-5.5 and GPT-5.5 Pro perform only "slightly higher than previous models," meaning you often need multiple attempts and orchestration to get reliable results.
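To see why consistency matters, it helps to compare pass@1 with pass@k on the same set of rollouts. The sketch below uses the unbiased pass@k estimator popularized by the HumanEval code-generation benchmark. Whether OpenAI's system card uses this exact estimator is an assumption on our part, and the rollout counts are made up for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n rollouts, of which c succeeded."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the probability that a random
    # size-k subset of the n rollouts contains no successful attempt.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical task: 10 rollouts, 3 of which solved it.
p1 = pass_at_k(10, 3, 1)
p5 = pass_at_k(10, 3, 5)
print(f"pass@1 = {p1:.2f}, pass@5 = {p5:.2f}")  # prints "pass@1 = 0.30, pass@5 = 0.92"
```

A model that solves a task on only 30% of single attempts can still look strong when it is given five tries. That gap is the practical argument for orchestration and retries around the model.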
Are AI-powered cyberattacks actually increasing?
Yes, and the data is no longer subtle. The 2026 IBM X-Force Threat Intelligence Index reported a 44% year-over-year increase in attacks targeting public-facing applications, with IBM directly attributing the surge to "AI-enabled vulnerability discovery" that shortens the window between CVE disclosure and active exploitation. CrowdStrike's 2026 Global Threat Report is even more direct: an 89% increase in attacks by "AI-enabled adversaries" in 2025 versus the previous year, across social engineering, malware development, and reconnaissance. Models like GPT-5.5 are not a future threat. They are what is accelerating the curve right now.
What does this mean for your application security program?
The trajectory is clear. Each OpenAI release has narrowed the gap between model-assisted testing and skilled human pentesters, and GPT-5.5 is the closest yet. The problem for defenders is that the model you can call from an API is not the model doing the most capable offensive work. That happens inside purpose-built harnesses that wrap the model in memory, tool use, exploit validation, and multi-step planning. XBOW demonstrated this clearly: the same GPT-5 that looked "moderate" in isolation more than doubled its performance inside an autonomous pentesting agent. An API key and a prompt library will not close that gap.
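What "wrapping the model" means structurally can be shown with a deliberately abstract loop. This is a hedged sketch of the plan, act, remember, validate pattern described above, not XBOW's or MindFort's harness. The class, its fields, and the three callables are hypothetical stand-ins for model and tool calls, and the demo uses inert stubs.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Finding:
    description: str
    validated: bool = False


@dataclass
class Harness:
    # All three callables are hypothetical stand-ins for model and tool calls.
    plan: Callable[[list[str]], Optional[str]]  # memory -> next step, or None when done
    run_tool: Callable[[str], str]              # step -> observation
    validate: Callable[[str], bool]             # observation -> confirmed or not
    memory: list[str] = field(default_factory=list)
    max_steps: int = 10

    def run(self) -> list[Finding]:
        findings: list[Finding] = []
        for _ in range(self.max_steps):
            step = self.plan(self.memory)       # multi-step planning over memory
            if step is None:
                break
            observation = self.run_tool(step)   # tool use
            self.memory.append(f"{step} -> {observation}")
            if self.validate(observation):      # only validated results are reported
                findings.append(Finding(observation, validated=True))
        return findings


# Demo with inert stubs in place of a model and real tools.
steps = iter(["enumerate endpoints", "check access control on /orders"])
harness = Harness(
    plan=lambda memory: next(steps, None),
    run_tool=lambda step: f"result of {step}",
    validate=lambda obs: "access control" in obs,
)
demo_findings = harness.run()
```

Everything outside the three callables is ordinary control flow. The state carried between steps and the validation gate are exactly what a bare API call lacks.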
That is why MindFort does not wrap GPT-5.5. MindFort is powered by MF-1, a custom LLM purpose-built for offensive security reasoning, not a general-purpose model wrapper, running inside our own autonomous agent harness that handles reconnaissance, exploit development, runtime validation, and patching as a single continuous loop. Our agents probe your apps, APIs, and infrastructure the way an attacker would, validate exploits in isolated environments before reporting them, and deliver each finding as a merge-ready GitHub PR with a threat model attached. It is a new category we call AXR (Autonomous Exploitation and Remediation), and it is available to deploy against your stack today, not gated behind a trust program or a waitlist.
GPT-5.5 is good at cybersecurity. Good enough that how you respond to it will matter more than how impressed you are by it. The attackers already have their harness. You should have yours.
Talk to the MindFort team about deploying autonomous security agents against your attack surface, or read our 2026 AI Pentesting Buyer's Guide for a full view of the category.