OpenAI’s GPT-5.5 Instant matches frontier models for health queries with 52.5% fewer hallucinations

The new default ChatGPT model cuts hallucinated claims on high-stakes prompts by more than half while scoring within range of OpenAI’s most powerful reasoning systems on health benchmarks

OpenAI just made its most medically literate AI model the default for every ChatGPT user on the planet. GPT-5.5 Instant, which launched on May 5 as a replacement for GPT-5.3 Instant, now matches the company’s frontier reasoning models on health-related queries, a category where getting things wrong carries real consequences.

The numbers behind the health benchmark gains

On HealthBench, OpenAI’s evaluation suite for medical accuracy, GPT-5.5 Instant scored between 49.6 and 51.4 across variants. That is a 1.8-point improvement in overall score over its predecessor, with a larger 5.5-point gain on professional-grade medical queries.

The hallucination reduction is the headline stat, though. OpenAI recorded a 52.5% decrease in hallucinated claims on high-stakes prompts spanning medical, legal, and financial topics.

User-flagged factual errors also dropped by 37.3%, suggesting the improvements aren’t just visible in controlled benchmarks. Real people using the product are encountering fewer moments where the model confidently states something that simply isn’t true.
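To make those percentages concrete, here is a minimal arithmetic sketch. The baseline of 1,000 is purely hypothetical, chosen for round numbers, and is not a figure OpenAI has reported; only the 52.5% and 37.3% reductions come from the article.

```python
def remaining_after_reduction(baseline_count: float, reduction_pct: float) -> float:
    """Return how many items remain after a relative reduction of reduction_pct percent."""
    return baseline_count * (1 - reduction_pct / 100)


# Hypothetical baseline (an assumption for illustration, not reported data).
BASELINE = 1000

# Reductions quoted in the article.
hallucinated_left = remaining_after_reduction(BASELINE, 52.5)
flagged_left = remaining_after_reduction(BASELINE, 37.3)

print(round(hallucinated_left), round(flagged_left))
```

In other words, a relative reduction of just over half still leaves a little under half of the original errors in place, so the figures describe fewer mistakes, not an absence of them.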

What changed under the hood

The broader GPT-5.5 family debuted on April 23, designed around advanced reasoning capabilities. GPT-5.5 Instant is the lightweight, fast-response variant built for everyday use, not the full-fat reasoning model meant for complex multi-step problems.

Context window sizes tell a story about OpenAI’s tiering strategy. Free users get 16K tokens of context. Premium subscribers can access up to 128K tokens, which is enough to process lengthy medical records, research papers, or multi-page legal documents in a single conversation. Paid users also get temporary access to the older GPT-5.3 model during the transition period.
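For a rough sense of what those limits mean in practice, the sketch below estimates whether a document would fit in each tier. It rests on two assumptions that are not from OpenAI: a rule of thumb of about four characters per token for English text, and reading "16K" and "128K" as 16,000 and 128,000 tokens. The function names are hypothetical, and a real tokenizer would give different counts.

```python
# Assumption: roughly 4 characters per token for English text (a heuristic, not a tokenizer).
CHARS_PER_TOKEN = 4

# Assumption: "16K" and "128K" interpreted as round thousands of tokens.
CONTEXT_LIMITS = {"free": 16_000, "paid": 128_000}


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(text: str, tier: str) -> bool:
    """Check whether the estimated token count is within the tier's limit.

    This ignores space needed for the rest of the conversation and the reply,
    so a real document needs some headroom below the limit.
    """
    return estimate_tokens(text) <= CONTEXT_LIMITS[tier]


# Hypothetical 200,000-character document, e.g. a long set of records.
document = "x" * 200_000

print(estimate_tokens(document))
print(fits_in_context(document, "free"), fits_in_context(document, "paid"))
```

Under these assumptions, a 200,000-character document comes to about 50,000 tokens, which overflows the free tier but fits comfortably in the paid one.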

Enhanced personalization features are rolling out alongside the model swap, though OpenAI appears to be staggering these additions rather than shipping everything at once. The core upgrade, the model itself, is available immediately across both free and paid tiers.

Disclosure: This article was edited by Editorial Team. For more information on how we create and review content, see our Editorial Policy.
