AI cracked a 15-year physics problem — but only OpenAI’s friends can use it

AI solved a 15-year physics puzzle in 12 hours. But the AI that did it? You can’t use it.

OpenAI’s GPT-5.2 spent half a workday deriving a formula for gluon scattering amplitudes that theoretical physicists assumed were always zero—until February 13, 2026, when a preprint dropped proving them wrong.

The result: a “strikingly simple” closed-form equation, a one-line product of a handful of factors, replacing Feynman diagram calculations whose size grows superexponentially with the number of particles. Nima Arkani-Hamed, who had pondered these processes for 15 years, called it a breakthrough. Researchers at Harvard, Cambridge, the Institute for Advanced Study, and Vanderbilt verified it within days using Berends-Giele recursion and soft theorem checks.
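The new formula itself isn’t reproduced here, but the classic Parke-Taylor formula for maximally-helicity-violating (MHV) gluon amplitudes shows what a “strikingly simple” closed form looks like in this field: a single ratio of spinor products standing in for hundreds of Feynman diagrams (tree level, color-ordered; the angle brackets are spinor-helicity products):

```latex
A_n^{\text{MHV}}\!\left(1^+,\dots,i^-,\dots,j^-,\dots,n^+\right)
  \;=\;
  \frac{\langle i\,j\rangle^{4}}
       {\langle 1\,2\rangle\,\langle 2\,3\rangle\cdots\langle n\,1\rangle}
```

Formulas of this shape are why physicists prize closed forms: the expression stays one line long no matter how large n gets.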

The catch: the AI version that spent 12 hours reasoning through the proof—an internal scaffolded system OpenAI calls its “Super Chat” model—isn’t available to researchers outside the company’s approved collaborator list. Public GPT-5.2 Pro made the initial conjecture. The locked-down version proved it.

The formula physicists ignored for 15 years took AI half a workday

Before February 2026, hand calculations maxed out at n=6 gluons—the particles that glue quarks together inside protons. Push beyond that, and the math explodes into dozens of terms per amplitude. Physicists assumed single-minus helicity amplitudes (one gluon spinning negative, the rest positive) were zero at tree level for generic momenta, so they ignored them. They were wrong.
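To make “explodes” concrete: the number of tree-level Feynman diagrams contributing to an n-gluon amplitude grows superexponentially. The counts below are the standard figures quoted in the amplitudes literature, hardcoded for illustration rather than computed:

```python
# Tree-level Feynman diagram counts for n-gluon scattering,
# as tabulated in the standard amplitudes literature.
DIAGRAM_COUNTS = {4: 4, 5: 25, 6: 220, 7: 2485, 8: 34300, 9: 559405, 10: 10525900}

def growth_ratios(counts):
    """Ratio of successive diagram counts: if this ratio itself keeps
    growing, the total is worse than exponential."""
    ns = sorted(counts)
    return {n: counts[n] / counts[n - 1] for n in ns[1:]}

if __name__ == "__main__":
    for n, c in sorted(DIAGRAM_COUNTS.items()):
        print(f"n={n:2d}: {c:>12,} diagrams")
    for n, r in sorted(growth_ratios(DIAGRAM_COUNTS).items()):
        print(f"n={n-1} -> n={n}: x{r:.1f}")
```

Each added gluon multiplies the diagram count by an ever-larger factor, which is why hand calculations stalled at n=6.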

In the half-collinear regime—a special momentum alignment where particles nearly overlap—those “zero” amplitudes are nonzero. GPT-5.2 spotted the pattern humans missed, simplifying n=6 expressions from 32-term monsters into compact products. Then it conjectured a general formula valid for all n. The scaffolded version spent 12 hours proving it.

When the preprint dropped, physicists swung from skepticism to awe: UC Santa Barbara professor Nathaniel Craig called it “clearly journal-level research” within hours. Arkani-Hamed, initially curious whether AI could handle theoretical physics at all, admitted the formulas were simpler than anything he’d derived by hand. Andrew Strominger at Harvard, who doubted AI’s utility for fundamental research, now says it will “empower us to do more.”

Only OpenAI’s friends got to use the AI that actually worked

The two-tier system is the real story. Public GPT-5.2 Pro made the conjecture. The internal scaffolded version, what OpenAI’s collaborators call “Super Chat,” spent 12 hours proving it. Who got access? Strominger, Alexandru Lupsasca, David Skinner, Alfredo Guevara, and Arkani-Hamed. All named authors on the preprint. All at elite institutions.

Everyone else? Locked out.

OpenAI hasn’t disclosed access criteria as of March 2026. No pricing. No institutional requirements. No waitlist numbers. The breakthrough relies on proprietary black-box models unavailable to independent researchers, locking out anyone without an invitation to OpenAI’s private Slack.

This isn’t the first attempt. ChatGPT reportedly fumbled a similar problem in the spring of 2025. Something changed between the public model and the scaffolded version, but OpenAI won’t say what. The collaborators got the upgraded tool. The rest of the physics community got a blog post.

The verification problem nobody wants to talk about

Humans verified the result. That’s not the same as replicating the discovery process.

Berends-Giele recursion and soft theorem checks confirmed GPT-5.2’s formula matches known physics. But independent researchers can’t reproduce the 12-hour reasoning session because the scaffolded version doesn’t exist outside OpenAI’s walls. We can check the answer. We can’t reproduce the method.

Journals require reproducibility. This isn’t reproducible. Craig’s “journal-level research” assessment highlights the irony—the result meets publication standards, but the tool that generated it violates the scientific method’s core principle: anyone should be able to repeat the experiment. The same verification problems plaguing AI security research now apply to theoretical physics.

Lupsasca plans quantum gravity applications “by the end of the year.” But who gets to run those experiments? The researchers with OpenAI invitations, or the ones with the best ideas?

Strominger says AI will “empower us to do more.” The unnamed reality: empowerment requires invitation.

Alex Morgan
I write about artificial intelligence as it shows up in real life — not in demos or press releases. I focus on how AI changes work, habits, and decision-making once it’s actually used inside tools, teams, and everyday workflows. Most of my reporting looks at second-order effects: what people stop doing, what gets automated quietly, and how responsibility shifts when software starts making decisions for us.