This Week in AI: Anthropic Goes to War, OpenAI Builds for Agents, and Open Source Eats the World

Something shifted this week. Not in one place — everywhere at once. Anthropic's Pentagon standoff stopped being a negotiation and became a lawsuit, a leaked memo, and somehow still an ongoing conversation with the people who just blacklisted them. OpenAI shipped its most consequential model of the year and watched ChatGPT uninstalls jump 295% over the same weekend. And while both companies were managing crises of their own making, open source had one of its best weeks in months — video with native audio, two new image editing champions, a model that runs on your phone without internet, and a speed multiplier that makes everything you already use 3.5 times faster. Any one of those would have led a quieter week.
The Anthropic Saga: War, Leaks, Lawyers, and a U-Turn
Last week we left Anthropic at the Friday deadline, unmoved. This week it got worse before it got better, and it hasn't finished getting better yet.
The Designation Lands
The Pentagon made it official. Anthropic is designated a supply chain risk — the first US company in history to receive a classification previously reserved for foreign adversaries. Any company with government ties is prohibited from using Anthropic's products in Pentagon-related work, by executive order, effective immediately. Anthropic announced it would challenge the designation in court. Four days later they were quietly back in talks with the same people. Make of that what you will.
The Memo Nobody Was Supposed to See
While the legal challenge was being filed, someone leaked Anthropic's internal communications from the week the standoff peaked. Dario called it a message written in the heat of the moment. It was also the most honest thing anyone published about this situation all week:
"The real reasons the Department of War and the Trump administration do not like us: we haven't donated to Trump while OpenAI's Greg Brockman has donated a lot. We haven't given dictator-style praise to Trump while Sam has. We have supported AI regulation which is against their agenda. We've told the truth about AI policy issues like job displacement. And we've actually held our red lines with integrity rather than colluding with them to produce safety theater."
Dario walked it back within 24 hours. The memo said what it said. The red lines were real. The politics around them were messier than the public statements ever acknowledged. Both things have been true the whole time.
The Part Where Anthropic Won Anyway
ChatGPT uninstalls jumped 295% over the weekend the designation went public. Claude went from outside the top ten to number one in the App Store. Anthropic announced it is approaching a $20 billion annual revenue run rate, more than doubling from last year. A Ramp graph tracking business AI spend showed OpenAI dominant throughout 2025, Anthropic barely visible — flipped entirely by February.
Anthropic got blacklisted, filed a lawsuit, leaked an embarrassing internal memo, and ended the week as the fastest-growing AI company by revenue. There is no tidy narrative that contains all of those facts. That is what actually happened.
They also made it frictionless to leave ChatGPT without ever saying OpenAI's name. Direct import of ChatGPT memories into Claude. Claude memory free on the basic plan. TechCrunch published a step-by-step switching guide unprompted. Anthropic said nothing publicly. They didn't need to.
The OpenAI Problem
On the day the designation dropped, OpenAI signed the Pentagon contract — announcing the same red lines Anthropic had drawn, plus a third. Same principles, better negotiators. That was the read.
Three days later Bloomberg reported that Altman had told OpenAI staff the Pentagon "does not want the company to express opinions about whether certain military actions were good or bad ideas." That is not a red line. That is the absence of one.
Sophisticated positioning and straightforward contradiction are not mutually exclusive. This week's evidence suggests OpenAI is managing both simultaneously, and finding it expensive.
GPT 5.4: Built for Agents, Not for You
OpenAI shipped two models this week. GPT 5.3 Instant came first — a behaviour update, not a capability upgrade. Fewer unnecessary caveats, less moralising, more direct answers. OpenAI described the goal as "getting less cringe." The bar they are clearing is one they set themselves.
Two days later, GPT 5.4. Real changes — but pointed in one direction.
What Changed and Who It's For
Native computer use is built directly into the model — no separate system required. Tool search now fetches definitions on demand rather than loading every tool upfront; for anyone running agents via API, that means meaningfully fewer tokens and faster calls on every request. Context window extends to one million tokens. Coding improves on the previous version. WebArena benchmark hits 67.3%.
For everyday ChatGPT users this will feel like a marginal improvement — slightly smarter web research, slightly better documents. For developers running production agent workflows, the tool search feature changes the cost structure of every call. Those are two genuinely different products sharing a name. Know which one you are evaluating.
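Why that changes the cost structure is easy to see with back-of-envelope numbers. A minimal sketch — the per-tool token counts and the small search overhead below are illustrative assumptions, not OpenAI's published figures:

```python
# Back-of-envelope comparison: loading every tool definition upfront
# versus fetching definitions on demand. All numbers are illustrative.

def upfront_cost(num_tools: int, tokens_per_def: int, prompt_tokens: int) -> int:
    """Every registered tool schema rides along with every request."""
    return prompt_tokens + num_tools * tokens_per_def

def on_demand_cost(tools_used: int, tokens_per_def: int, prompt_tokens: int,
                   search_overhead: int = 50) -> int:
    """Only the definitions the model actually requests are loaded."""
    return prompt_tokens + search_overhead + tools_used * tokens_per_def

# An agent with 40 registered tools that typically invokes 3 per call:
before = upfront_cost(num_tools=40, tokens_per_def=300, prompt_tokens=1_000)
after = on_demand_cost(tools_used=3, tokens_per_def=300, prompt_tokens=1_000)

print(before, after)  # 13000 1950
print(f"{before / after:.1f}x fewer input tokens per call")
```

The gap scales with how many tools an agent registers but rarely uses, which is why the saving compounds fastest in large production workflows rather than in chat.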
Who This Was Actually Built For
OpenAI hired Peter Steinberger — creator of OpenClaw — the same week GPT 5.4 shipped. The launch demos are all agent outputs: a theme park simulation from a single prompt, a turn-based strategy game, a 3D flyover simulation. Not chat outputs. OpenAI is building infrastructure for autonomous agent systems. ChatGPT's subscriber base is funding the compute. The marginal improvement everyday users notice is not the point of the model, and never was.
Available on Plus, Team, and Pro. Replaces GPT 5.2 as the default thinking model.
Open Source's Best Week in Months
While OpenAI and Anthropic managed institutional fires, open source shipped across every major content category in the same seven days.
LTX 2.3: The Production Blocker Is Gone
Lightricks releases LTX 2.3. The headline is native audio — not added in post, not a separate pipeline, generated alongside the video in the same model. Dialogue is clean. Up to 20 seconds at 1080p, 4K at 50fps, portrait and vertical formats, camera motion controls, end-frame interpolation. Eight steps. Free on HuggingFace.
Every serious evaluation of open-source video over the past year ended in the same place: impressive, but no audio. That qualification no longer applies. If you have been waiting for a production-ready open-source video model, you have run out of reasons to wait.
Two New Image Editing Champions
Fire Red ImageEdit 1.1 takes the open-source image editing benchmark crown — beating Qwen ImageEdit and LongCat ImageEdit across the board, matching Nano Banana Pro on several tests. Face consistency across full outfit and pose changes, complex multi-reference image merging, font and text style transfer from reference photos — all handled cleanly. At 60GB it needs quantised versions before most setups can run it locally. Those are coming.
HY Woo from Tencent takes a different approach. It generates a custom fine-tune on the fly from your reference images and injects it into the generation at the moment of creation. The result is clothes swapping and style transfer that beats every open-source alternative and most paid tools except Nano Banana 2 and Pro. Current hardware requirements are steep but a lighter version is confirmed incoming. The approach itself — instant custom fine-tune at inference time — is worth understanding regardless of which model you eventually run it on.
Qwen 3.5: Runs on Your Phone. No Internet Required.
Alibaba releases Qwen 3.5 in four sizes down to 0.8 billion parameters — a 2 gigabyte model that runs on a phone, offline, without a cloud connection. The 9 billion parameter version benchmarks on par with GPT 5 Nano and Gemini 2.5 Flash-Lite on instruction following, graduate-level science, multilingual knowledge, and visual reasoning.
A free, offline, pocket-sized model that matches the fast-tier cloud offerings from OpenAI and Google. Every enterprise argument for mandatory cloud AI inference just got harder to make.
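The headline figure checks out arithmetically. A quick sketch of the weights-only memory footprint at common precisions (bytes-per-parameter values are standard for these formats; KV cache and runtime overhead come on top):

```python
# Rough on-device memory footprint for a model's weights alone,
# excluding KV cache and runtime overhead.

def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Weights footprint in GB for a given parameter count and precision."""
    return params_billion * 1e9 * bytes_per_param / 1e9

for label, bpp in [("fp16/bf16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    print(f"0.8B @ {label}: {weights_gb(0.8, bpp):.1f} GB")

# fp16 weights for a 0.8B model land around 1.6 GB -- consistent with
# the "2 gigabyte model" figure once runtime overhead is included.
```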
Spectrum: 3.5x Faster, Everything You Already Use
Spectrum from ByteDance is not a new model. It is a speed multiplier that sits on top of image and video generators you already use — Flux, SDXL, Hunyuan Video, Wan 2.1. Apply it and your generations run 3.5 times faster with no loss in quality. Competing speed tools at equivalent gains visibly degrade output. Spectrum doesn't. Code released, image and video both supported. Implement this on what you already have before evaluating anything new.
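What a 3.5x multiplier means for throughput is worth making concrete. A minimal sketch with illustrative timings — the 70-second baseline is an assumption, not a measured Spectrum benchmark:

```python
# Illustrative throughput impact of a 3.5x sampling speed-up.

def throughput_per_hour(seconds_per_generation: float) -> float:
    """Generations completed per hour at a given per-generation time."""
    return 3600 / seconds_per_generation

baseline = 70.0               # e.g. a 70-second video generation
accelerated = baseline / 3.5  # 20 seconds with the speed-up applied

print(f"{throughput_per_hour(baseline):.0f} generations/hour before")
print(f"{throughput_per_hour(accelerated):.0f} generations/hour after")
```

Same hardware, same model weights, same queue — the multiplier shows up directly in the compute bill, which is why applying it before evaluating anything new is the right order of operations.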
The Tools That Also Shipped
NotebookLM Cinematic Overviews — Google integrates its video generation and image models into NotebookLM to produce animated video summaries with genuine motion graphics — not slides, not voiceover on static images, actual animation. Upload a document, get a produced video. Currently locked to the $250/month Ultra plan. The quality is what you would previously have opened After Effects for. Watch when this reaches lower pricing tiers.
KiwiEdit — Open-source video editor with style transfer, background replacement, and object add/remove via reference images. Beats every open-source video editing alternative available. Still behind Kling. Best free option by a clear margin. Code on GitHub now.
CUDA Agent (ByteDance) — AI that writes, tests, and optimises the low-level code that makes AI models run efficiently on GPUs. Outperforms the leading frontier models at this specific task. If compute cost is a meaningful line item in your operation, this is worth an afternoon.
What Agencies Do Next
- Ship LTX 2.3 into a production video workflow this week. Native audio was the blocker. It is gone. Free weights, fast pipeline, vertical format support out of the box. There is no remaining argument for waiting.
- Apply Spectrum to existing Flux and Hunyuan workflows before evaluating anything else. 3.5x speed on infrastructure you already have, no quality cost, nothing new to learn. Do this first.
- Test GPT 5.4 tool search on one live agent workflow. The token reduction per call is real and compounds at scale. Measure it on something in production before drawing conclusions from benchmarks.
- Know your Anthropic contingency. The supply chain designation is official. Talks are resuming and resolution is the likely outcome — but map which workflows depend on Claude and identify which models could cover them. Do it before you need the answer under pressure.
Bangkok8 AI: We'll tell you which open-source models are ready for production — before you find out the hard way they aren't.