OpenAI doubled its per-token pricing this week, released a model that scores 82.7% on Terminal Bench and tops every major intelligence composite, then watched its CEO call his main competitor a bomb salesman on a podcast. Anthropic, meanwhile, confirmed that unauthorized users accessed Mythos, the model it spent weeks marketing as too dangerous to release. Four things actually changed this week. Three of them were worth the hype.

GPT 5.5 Stopped Asking Me for Instructions

GPT 5.5 is live inside ChatGPT and Codex for Plus, Pro, Business, and Enterprise users. The benchmark story is straightforward: 82.7% on Terminal Bench, edging Mythos at 82% and clearing GPT 5.4 at 75% and Claude Opus at 69.4%. On the Artificial Analysis Intelligence Index, a composite of ten benchmarks across reasoning, coding, science, and agents, 5.5 is now the standalone leader, breaking a three-way tie between Opus 4.7, Gemini 3.1 Pro, and GPT 5.4.

The benchmark I care about is harder to quantify. Before this week, getting useful output on a complex multi-step task required laying out every step in advance, anticipating edge cases, and checking the model’s work at each turn. With GPT 5.5, I handed it a research brief and trusted it to do the heavy lifting without my micromanaging every step. That freed me up to focus on bigger-picture strategy instead of the process. Nothing in a Terminal Bench score captures that, but it is what you notice within the first hour.

Pricing doubled from GPT 5.4: $5 per million input tokens and $30 per million output tokens, up from $2.50 and $15. OpenAI claims the model completes the same tasks with significantly fewer tokens. The API itself is not available as of this week, so when it opens, verify that math on your own workloads before assuming the upgrade is cost-neutral. For now the model runs only in ChatGPT and Codex. It is not in Cursor or any third-party coding tool yet.
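To make the cost-neutral math concrete, here is a minimal sketch using only the per-token prices quoted above. The model labels, the function names, and the sample token counts are placeholders I made up for illustration; they are not API identifiers or measured usage.

```python
# Prices in USD per million tokens, taken from the figures quoted in this post.
# The dictionary keys are illustrative labels, not real API model names.
PRICES = {
    "gpt-5.4": (2.50, 15.00),  # (input, output)
    "gpt-5.5": (5.00, 30.00),
}

def task_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one task at the quoted per-million-token prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

def break_even_ratio(old: str, new: str, input_tokens: int, output_tokens: int) -> float:
    """Fraction of the old token usage the new model may consume
    (scaling input and output equally) before it costs more than the old one."""
    return task_cost(old, input_tokens, output_tokens) / task_cost(new, input_tokens, output_tokens)

# Hypothetical workload: 100k input tokens and 20k output tokens per task.
old_cost = task_cost("gpt-5.4", 100_000, 20_000)
new_cost_same_tokens = task_cost("gpt-5.5", 100_000, 20_000)
ratio = break_even_ratio("gpt-5.4", "gpt-5.5", 100_000, 20_000)

print(f"GPT 5.4: ${old_cost:.2f} per task")
print(f"GPT 5.5 at the same token count: ${new_cost_same_tokens:.2f} per task")
print(f"Break-even: GPT 5.5 must use at most {ratio:.0%} of the tokens")
```

Because both the input and output prices exactly doubled, the break-even point is half the tokens no matter how a task splits between input and output. "Significantly fewer" only makes the upgrade cost-neutral if it means 50% fewer.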

ChatGPT Images 2.0: Text Works Now

The image model behind the Studio Ghibli trend got a major update. ChatGPT Images 2.0 renders accurate, readable text inside generated images consistently, which the previous version could not do.

On LM Arena, where users rank outputs in blind head-to-head comparisons without knowing which model produced them, GPT Image 2 scored 1500. Nano Banana (Gemini Flash Image) is at 1271. Every other model clusters in the 1100s to 1200s. That gap is wide enough to mean something.
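For a rough sense of what a gap that size implies, here is a sketch that assumes the arena scores behave like standard Elo ratings on the usual 400-point, base-10 scale. That is my assumption about how to read the numbers, not something confirmed above, and the function name is my own.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for A against B (400-point, base-10 scale).

    Assumes the leaderboard scores can be read as ordinary Elo ratings.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

gpt_image_2 = 1500   # score quoted in this post
nano_banana = 1271   # score quoted in this post

p = expected_win_rate(gpt_image_2, nano_banana)
print(f"Expected head-to-head preference for GPT Image 2: {p:.1%}")
```

Under that assumption, a 229-point lead works out to voters preferring GPT Image 2 in roughly four of every five blind matchups against the runner-up.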

What it means in practice: no more soul-crushing rounds of manual cleanup on images with text. Working with this model now feels like a precise digital canvas where what I describe comes back as what I described. Independent testers this week documented it generating barcodes that physically scan to the correct product when photographed, 10-by-10 data grids with accurate content in every cell, and dense multi-column magazine layouts with fully readable text. These are testable, specific results.

It runs directly inside ChatGPT now. No separate tool needed.

Claude Design: What You Can Actually Click Through

Anthropic released Claude Design inside claude.ai this week. It runs on Opus 4.7 and is available on Pro, Max, Team, and Enterprise plans.

What it actually does: you describe an abstract idea and it returns something interactive. You can click through it, navigate between sections, and test whether the structure makes sense before writing a line of code or opening a design tool. I went from describing a concept to clicking through a working prototype in the same session. That is a different experience from every other AI design tool I have used.

The animation capability surprised me. It generates motion graphics from plain language prompts: bar charts that animate in, line graphs that build, text that appears on cue. Basic compared to After Effects, but output that used to take an hour now takes a few minutes of prompting. For a quick animated chart or title sequence, it covers the work.

One real limitation: the aesthetic it defaults to is recognizable. Dark mode, clean lines, ticker bars. Multiple people using this tool end up with outputs that look related to each other. For early iteration and prototyping, that is fine. For a distinct visual identity, you need to push it explicitly and repeatedly.

The Altman Quote and Why I Am Tired of This Conversation

Anthropic spent weeks announcing that Mythos was too dangerous to release. Unauthorized users accessed it anyway. On the Core Memory podcast, Sam Altman responded without naming the company directly:

“It is clearly incredible marketing to say, we have built a bomb. We are about to drop it on your head. We will sell you a bomb shelter for $100 million.”

He was not subtle.

The honest read: Altman’s framing is accurate, and it is also exactly what his own company does from the other direction. Anthropic is building a brand narrative around caution. OpenAI is building a brand narrative around productivity and speed. Both narratives serve business interests. Neither company acknowledges that openly. Altman cast Anthropic’s version as a grift without pausing to note that “we are the responsible accelerationists” is the same move with different language.

If your model genuinely concerns you enough to withhold it, announcing it publicly and spending weeks describing it as the most dangerous thing ever built is not a safety decision. It is a marketing decision. The fact that unauthorized users accessed it does not change that calculus. It confirms it.

What it leaves you with is this: the models are real, the capability gains are real, and the people building them are running a business like every other business. Holding all three of those things at once is the only way to follow this industry without getting lost in the theater of it.

Written by Mario Martinez Jr. (ku5e / Gary7) | TryHackMe Profile | ku5e.com/blog