
NVIDIA Reports Inference Now Exceeds Training in Datacenter Revenue

NVIDIA disclosed that inference workloads surpassed training in Q1 2026 datacenter revenue for the first time, with inference now representing roughly 55% of GPU compute hours sold. The shift reflects production AI deployments at scale across enterprise customers, with reasoning models driving particularly heavy inference demand.

This crossover is the clearest signal yet that AI has moved from R&D to operations, and it changes the competitive landscape. Inference is more price-sensitive and latency-sensitive than training, and more amenable to specialized chips from AMD, Groq, Cerebras, and the hyperscalers' in-house designs. NVIDIA's moat narrows on inference. For investors, this is the moment to question the "NVIDIA wins everything" thesis and look hard at the second tier. For buyers, it means inference price per token will keep falling fast.