Compute Shortage, Overbooked Components: What’s Really Going On?
SOXX, the semiconductor index, closed on Friday, April 26, 2026 at 461. That is an increase of about 50% since the index bottomed on March 30, 2026. We are talking about a 50% move in less than a month.
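To put the size of that move in perspective, here is a minimal back-of-the-envelope sketch. It uses only the two figures above (a close of 461 and a gain of about 50%); the function names are my own, and because the 50% is rounded, the implied bottom is approximate.

```python
def implied_start(close: float, gain_pct: float) -> float:
    """Level the index must have started from to reach `close` after a `gain_pct` percent rise."""
    return close / (1 + gain_pct / 100)


def drop_to_reverse(gain_pct: float) -> float:
    """Percent decline from the new level that would give the entire gain back."""
    return 100 * (1 - 1 / (1 + gain_pct / 100))


soxx_close = 461.0  # closing level quoted in the article
gain = 50.0         # "about 50%", as quoted in the article (rounded)

print(round(implied_start(soxx_close, gain), 1))  # 307.3
print(round(drop_to_reverse(gain), 1))            # 33.3
```

In other words, a 50% rally only needs a decline of about a third to be fully erased.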
We have not seen such a surge in a semiconductor ETF since the dot-com era, when a similarly irrational buying panic swept semiconductor stocks just a few months before the indices peaked in March 2000.
To show how aggressively the market is buying anything semiconductor-related, or any company that may be connected to semiconductors, here are the moves
from March 30, 2026 to April 26, 2026:
Intel gained about 100%
AMD gained about 85%
Nvidia gained about 27%
Broadcom gained about 45%
Marvell Technology gained about 90%
SanDisk gained about 78%
SK Hynix gained about 50%
KOSPI, which is largely driven by memory stocks, gained over 28%, and the traditional TAIEX rose more than 22%, which may sound modest but stunned the average Taiwanese investor.
And so on.
There is one simple driver behind this. Everyone wants a stake in a company that creates a component needed for compute. It does not matter which component. GPU, CPU, memory, or storage, as long as it is required to build compute for AI. Investors are hitting the buy button without hesitation.
They saw memory stocks deliver triple- and even quadruple-digit gains over the past year and do not want to miss the next train. But why the sudden FOMO now?
In recent months, we have been hearing from AI startups that they are struggling to secure compute. GPU pricing has been rising and pushing toward all time highs. There are waiting lists just to get access to compute. AI labs are increasing API pricing due to strong demand and limited supply of compute.
This combination, fear of missing out after the memory frenzy and signals of compute shortage, is what is driving investors into anything tied to semiconductors.
Every morning at 9:30am, investors rush to click the buy button on these stocks to be part of this demand train.
But is it sustainable?
In this article, I analyze what is going on behind the scenes and the reasons driving all of the above.
The Incredible Delivery Speed of Anthropic
In recent months, we have been witnessing an incredible delivery speed from Anthropic:
October 2025: Computer Use as an API capability.
November 2025: Claude Code.
January 2026: Claude Cowork.
February 2026: Sonnet 4.6.
March 2026: Claude Dispatch, Claude for Excel and PowerPoint, and the full Computer Use agent in Cowork and Claude Code for Pro/Max subscribers.
April 2026: Claude Mythos, Opus 4.7, Claude Design, Claude for Word.
I have probably missed some, but that’s not the point. The point is to show the incredible pace of delivery at Anthropic. Why do I mention it? Because it creates several effects across the industry:
First, it created real FOMO, a “code red” emergency mode across the labs and the industry, to push new tools, add-ons, and LLM versions into production to compete with Anthropic. Seeing Anthropic eat your market share is painful, and no AI lab wants to give up a single piece of growth. Growth is the core story for investors: since none of the labs is currently profitable, and profitability is nowhere in sight, they need to sustain growth at all costs.
In addition, they have to show they are relevant. It doesn’t matter if what they ship will be discontinued after a few months (Sora 2, anyone?), but at least it creates an effect for a time and calms down investors. The valuations of all labs are already stretched and they need to deliver. The speed of Anthropic’s delivery is creating FOMO and stress.
As a result, all AI labs are rushing products, which leads to even more compute-hungry models or new features that increase demand across the industry. But there is one small issue: Datacenters take years to build. These AI labs need compute now, right now. Anthropic is shipping, so everyone else needs to ship, and they all need compute.
They are fighting over existing capacity, which increases compute pricing and GPU rental costs. You hear across the industry that even Nvidia A100s are popular again, which some use as an opportunity to mock those who expressed concerns about GPU depreciation.
So while the narrative is “short on compute” and demand is insane, in fact there is no shortage of physical components. Lots and lots of GPUs are currently collecting dust, waiting to be deployed when their assigned data centers are ready. We all hope they will be ready on time, successfully complete their financing rounds, gain quick access to power and specialized cooling, and that the hyperscaler or neocloud will succeed in calming the local community and proving that the data center is beneficial.
The Efficiency Paradox and Token Hunger
Second, each new model and product consumes even more tokens than the last. Anthropic’s delivery speed comes at a cost to efficiency. In fact, there is no real talk about efficiency. For example, it is being reported that Opus 4.7 is significantly more “token hungry” than 4.6, often consuming 20%–35% more tokens. Even if the numbers vary, you have likely experienced this in the newer models of whichever LLM you use: they are much more token-hungry.
Technically, there is a solution to the compute shortage: making models more efficient. Having them perform the same tasks using fewer tokens. However, as the improvements with each model become less and less significant, few are prioritizing efficiency. Improving model efficiency? You have to be kidding me.
Delivery is what matters. Showing newer models and new products is what matters to investors and the narrative. With IPO plans on the horizon, you need the buzz. Efficiency is not in the vocabulary. The same applies to OpenAI, xAI, and others. They are all in a laser-focused mode, needing to keep up with Anthropic at any price, even if the new models use more tokens than the previous ones for the same task.
This FOMO isn’t just about rushing to deliver new models and secure compute for upcoming hungry products; it also creates secondary effects. xAI reportedly wants to overpay for Cursor at a $60B valuation just to keep up with Codex and Claude Code. OpenAI acquired Hiro Finance, OpenClaw, Astral, and Promptoo all in the last three months. They are rushing to buy any tool or company that may help show delivery, no matter what the price tag might be.
Growing User Bases and the Compute Bottleneck
The cycle is self-sustaining:
More AI tools and add-ons are released.
The average user increases their tool usage.
Newer models use more tokens per task.
Result: A massive surge in token demand.
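The reason this cycle produces a surge rather than a gradual rise is that the steps multiply instead of adding up. Here is a minimal sketch of that arithmetic. Only the 20%–35% tokens-per-task range comes from the reports mentioned earlier; the user growth and usage growth figures are purely hypothetical placeholders, and the function name is my own.

```python
def token_demand_multiplier(user_growth: float,
                            usage_growth: float,
                            tokens_per_task_growth: float) -> float:
    """Combined multiplier on total token demand when all three factors grow at once.

    Each argument is a fractional increase (0.20 means +20%).
    """
    return (1 + user_growth) * (1 + usage_growth) * (1 + tokens_per_task_growth)


# Hypothetical assumptions, NOT figures from the article:
user_growth = 0.30    # 30% more users
usage_growth = 0.25   # each user runs 25% more tasks

# Reported range for tokens per task (Opus 4.7 vs 4.6): +20% to +35%
low = token_demand_multiplier(user_growth, usage_growth, 0.20)
high = token_demand_multiplier(user_growth, usage_growth, 0.35)

print(round(low, 2), round(high, 2))  # 1.95 2.19
```

Under these placeholder assumptions, three increases that each look moderate on their own roughly double total token demand, while the pool of live capacity stays fixed.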
Demand for active compute is skyrocketing, but since datacenters take an average of three years to build, the supply of live capacity cannot respond. More players are competing for the same fixed pool of live capacity. This leads to:
Increased GPU hourly pricing.
Increased API costs.
The removal of heavily subsidized plans (like the removal of Claude Code from $20 tiers).
Consumption-based (per-token) pricing for enterprise clients and third-party tool users.
Teasing of subscription price hikes.
Companies are stockpiling components for the future because no one wants to be caught short. The “prevailing wisdom” is that it's better to end up with oversupply than to be caught in a shortage. It doesn’t help that Jensen didn’t exactly broadcast that he would create a memory supply bottleneck, as next-generation Nvidia GPUs require significantly more memory. Now, everyone is terrified of the next shortage and is overbooking everything: Memory, Storage, Cooling, and CPUs. Even “Agentic” workflows have proven to work great on CPUs, which has fueled the rally for Intel and AMD.
Nvidia’s “Beat and Raise” Machine
Nvidia must sustain its “Beat and Raise” momentum. The current estimates are massive even without the “Raise”:
2026: $370B
2027: $485B
2028: $562B
Nvidia’s last 10-K revealed over $100B in supply commitments. More recently, on Dwarkesh Patel’s podcast with Jensen, we were teased with $250B in supply commitments. This is necessary for Jensen to book the capacity needed to support roughly $1.4T in projected revenue across those three years.
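For readers who want to check the scale, here is a quick sketch that adds up the three yearly estimates listed above and compares them with the $250B commitment figure. All inputs are the numbers quoted in this section (in billions of dollars); the variable names are my own, and the ratio is only a rough comparison, since commitments and revenue are not measured over the same period.

```python
# Yearly estimates quoted above, in billions of dollars
estimates_bn = {2026: 370, 2027: 485, 2028: 562}
supply_commitments_bn = 250  # figure teased on the podcast, per the article

total_bn = sum(estimates_bn.values())
print(total_bn)  # 1417

# Implied year-over-year growth between consecutive estimates
years = sorted(estimates_bn)
for prev, cur in zip(years, years[1:]):
    growth_pct = 100 * (estimates_bn[cur] / estimates_bn[prev] - 1)
    print(cur, round(growth_pct, 1))  # 2027 31.1, then 2028 15.9

# Commitments as a share of the three-year total
print(round(100 * supply_commitments_bn / total_bn, 1))  # 17.6
```

The three years sum to about $1.42T, and the estimates already assume growth slowing from roughly 31% to roughly 16% before any “Raise” is added on top.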
Jensen is effectively cannibalizing the supply market for his GPUs and systems. By booking hundreds of billions in commitments, he leaves little for other players. While he ensures Nvidia has the hardware to sustain the “Beat and Raise” cycle, other players have had to bid any price for the remaining supply in a state of FOMO.
With all of that supply, hundreds of billions of dollars in components for compute and massive volumes of GPUs, it is unclear whether hyperscalers and neoclouds will ever build enough datacenters to accommodate them. In fact, even counting the ones currently under construction or planned, there is likely not enough capacity for all the GPUs Jensen is already flooding the market with, let alone the ones he plans to ship in the future.
The Risk of the Great Write-Off
Let me be clear: GPUs are not in short supply. The GPUs that are currently deployed and functional are in short supply. The rest are waiting in warehouses. Those who booked supply components out of FOMO are sitting on mountains of inventory that is currently “collecting dust.”
Eventually, AI labs may realize that because datacenters take so long to build, they must think about model efficiency. If models become more efficient, many of the GPUs Jensen is planning to flood the market with may become redundant.
Every player in this space is currently risking a massive future write-off. We are looking at a potential dramatic decrease in the price of CPUs, Memory, and GPUs once this frenzy calms down. This will accelerate the depreciation of these assets. Currently, hyperscalers and neoclouds are sitting on hundreds of thousands of GPUs that are essentially depreciating assets waiting for a home, not including all the depreciating assets they are stockpiling now and will continue to stockpile in the near future.
I could discuss each of these points in much greater depth, but the objective here is to look past the hype and see the structural risks forming behind the scenes.
K.


"If you go around popping a lot of balloons, you're not going to be the most popular guy in the room"
- Charlie Munger