China vs. U.S. AI: The Gap Just Disappeared

01 Enterprise AI, in trouble

AI deployments are tearing companies apart.

A new Writer report surveyed 2,400 global leaders. 79% of enterprises describe material AI-adoption challenges. 54% of C-suite executives say their internal AI initiatives are tearing the company apart. 97% say they deployed AI agents in the past year. Only 52% of employees report actually using them.

79%: Enterprises facing material adoption challenges
54%: C-suite says AI is tearing the company apart
97%: Say they deployed AI agents
52%: Employees who actually use them

Five holdbacks explain the mess. Strategy without substance — 75% admit their AI strategy is more show than guidance. A two-tiered workplace — 92% are cultivating an AI elite; 60% plan to lay off workers who can’t or won’t adopt. Trust-resistance cycles. Siloed development. And no business-team ownership of the workflows.

The ROI on AI deployment is still mostly vibes. Companies are laying people off based on gains that don’t yet exist.

02 A fun aside

Tell Claude to speak in caveman English. Cut your tokens 75%.

A viral Reddit post this week — 10,000 upvotes — showed that asking Claude to respond in caveman English cut output tokens by roughly 75% with negligible quality loss. Output tokens are usually the smaller part of the bill, but the savings appear to be real. Claude no speak caveman soon. But we can hope the big players tighten up.
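
How far a 75% cut in output tokens moves the total bill depends on the mix of input and output. Here is a rough sketch of that arithmetic. The token counts and per-million-token prices are made-up placeholders, not any provider's real rates, and the function names are purely illustrative.

```python
def bill(input_tokens, output_tokens, input_price, output_price):
    """Cost in dollars; prices are dollars per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def savings_from_output_cut(input_tokens, output_tokens,
                            input_price, output_price, output_cut=0.75):
    """Fraction of the total bill saved when output tokens shrink by output_cut."""
    before = bill(input_tokens, output_tokens, input_price, output_price)
    after = bill(input_tokens, output_tokens * (1 - output_cut),
                 input_price, output_price)
    return (before - after) / before


# Hypothetical workload (all four numbers are assumptions):
# 10M input tokens, 1M output tokens, $3 and $15 per million tokens.
share = savings_from_output_cut(10_000_000, 1_000_000, 3.0, 15.0)
print(f"Total bill drops by {share:.0%}")
```

With these placeholder numbers, output is a third of the bill, so a 75% output cut takes 25% off the total. A workload that is heavier on input would see less.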

03 The headline

China caught the US frontier.

Stanford HAI’s 2026 AI Index landed this week. The headline: China has effectively closed the capability gap with US frontier models. Generative AI hit 53% population adoption in just three years. Every index cycle produces the same ritual — China panic, compute panic, VCs screenshotting whichever chart fits their thesis. This year the charts are actually worth reading.

04 The four levers

How they got there without the chips.

China doesn’t have the best Nvidia chips — export controls saw to most of that. No OpenAI, no Anthropic, no Google. Smaller training budgets. Four things got them there anyway.

Efficiency & Open Weights

Trained smaller and leaner. Shipped open.

Efficiency: Chinese labs learned to train strong models on less compute because they had to. DeepSeek is the famous case.
Open weights: While US labs lock the best models behind APIs, China ships Qwen, DeepSeek, and GLM as open-weight releases any developer can use.

State & Talent

Backed by industrial policy and a bigger pipeline.

State: Training runs, power, and data access coordinated with industrial policy. No US equivalent.
Talent: Larger engineering pool, less regulatory friction, faster public iteration.

One interesting twist. Early reports from the same index suggest DeepSeek’s next flagship, V4, is stalling. The efficiency playbook that got them this far may be running into diminishing returns. The next frontier may finally require what they haven’t had — compute.

05 Where agents fail

Humans still win where it matters most.

A Nature study published this week benchmarked leading AI agents against expert scientists on open-ended research tasks, the kind that require experimental design, literature synthesis, and iterative reasoning. Humans won across most categories.

The agents failed particularly hard on tasks where the right move was to abandon the hypothesis. They’d keep going until they found a solution — even if no solution was there. No amount of compute fixes sunk-cost blindness. You need a different objective function.
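
One way to picture a "different objective function" is a loop where giving up is a legitimate result rather than a failure. The sketch below is illustrative only and is not the study's method. The `step` callback, the `Outcome` type, and the `patience` and `min_gain` thresholds are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Outcome:
    status: str              # "solved" or "abandoned"
    answer: Optional[str]
    steps_used: int


def investigate(step: Callable[[int], Tuple[float, Optional[str]]],
                max_steps: int = 20,
                patience: int = 3,
                min_gain: float = 0.01) -> Outcome:
    """Test a hypothesis, but stop early if the evidence stops improving.

    step(i) is a hypothetical callback returning (evidence_score, answer_or_None).
    The loop abandons once the score has failed to beat the best seen so far
    by min_gain for `patience` consecutive steps.
    """
    best = float("-inf")
    stalled = 0
    for i in range(max_steps):
        score, answer = step(i)
        if answer is not None:
            return Outcome("solved", answer, i + 1)
        if score > best + min_gain:
            best, stalled = score, 0
        else:
            stalled += 1
        if stalled >= patience:
            return Outcome("abandoned", None, i + 1)
    return Outcome("abandoned", None, max_steps)


# A dead-end hypothesis: evidence never improves and no answer appears.
dead_end = investigate(lambda i: (0.1, None))
print(dead_end.status, dead_end.steps_used)
```

The design choice is that "abandoned" is returned as an ordinary outcome with its own stopping rule, so the loop is not rewarded only for producing an answer.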

Jim’s take

The headlines will run this as a feel-good “humans still matter” story. The actual finding is more useful. Labs claiming autonomous-scientist capabilities within twelve months are underestimating how load-bearing that gap is. I’d bet the 2027 index reports only marginal progress on this specific axis.

Bigger lesson for any company deploying agents: the failure mode that will burn you is not when the agent doesn’t know the answer. It’s when the agent is sure of the wrong answer — and won’t back off.

06 Pricing

OpenAI launched a $100 tier. Same brackets as Claude.

We predicted OpenAI would respond to Anthropic’s pricing before the summer. It happened a lot faster. OpenAI slotted a new $100-a-month ChatGPT tier between the $20 and $200 plans. If that lineup sounds familiar, it should: Claude Max already has $100 and $200 tiers. OpenAI isn’t inventing a bracket. They’re matching Claude’s exact structure so developers can compare apples to apples.

What the pricing says

OpenAI has quietly conceded that Claude is setting the pricing benchmark for serious coding.

They’re the ones responding now.

What the devs say

Surveys now show roughly 70% of professional engineers prefer Claude for coding.

On model quality alone, Codex can’t pull them back. The competitive axis is narrowing to capacity and price.

07 The category settles

Two years ago, a new coding tool every week. Now: three names.

Cursor. Windsurf. Claude Code. Throw Codex in there. Windsurf’s new SWE 1.5 model matches Claude Sonnet on accuracy while running 14× faster; the plan costs $20/month, with a $200 max tier. Cursor still positions around precision and developer control. Claude Code added better multi-repo handling.

Jim’s take

If you run engineering, the main cost is no longer the tool. It’s the decision fatigue of letting every developer pick their own. Pick one. Standardize. Move on.

If you’re a developer, the skill worth practicing isn’t which tool is best. It’s how to review AI output fast enough that it’s actually a time-saver. The people who look most productive this year aren’t the ones using the fanciest agent. They’re the ones with the sharpest review reflex.

08 What it adds up to

The gap closes. The hard parts don’t.

01 Enterprise AI is mostly vibes ROI, and the layoffs are happening anyway.
02 The frontier capability gap between US and China is effectively closed.
03 China did it on efficiency, open weights, state policy, and talent, not chips.
04 Agents still fail on tasks that require abandoning a hypothesis. That gap is load-bearing.
05 Claude is setting the pricing benchmark. OpenAI is responding.
06 Coding tools have consolidated. Your review reflex matters more than your tool choice.

The failure mode that will burn you is not when the agent doesn’t know the answer. It’s when the agent is sure of the wrong answer and won’t back off.
