

This article is not financial advice. For financial advice, consult a financial professional.
AI’s outlook as an affordable, labor-saving API connection looks tenuous at best. A set of factors has converged on the industry, and businesses and end users will feel their effects more acutely as the first quarter of 2026 ramps up. We propose that businesses take control of their AI costs as soon as possible: the AI race will be won by those who can run AI on existing, rather than new, hardware.
There are three major contributors to the compute crunch.
First, the insatiable demand for hardware to capture first- and second-mover advantage has left suppliers reeling, with end consumers seeing massive price increases for components like DRAM (the memory used in desktop and laptop computers). These crucial components are built by only three major suppliers, one of which has abandoned its consumer line altogether in an attempt to satisfy this appetite. Things have now reached the point that rumors are circulating about a potential line of memory-less GPUs, which would require board partners or end users to purchase their own VRAM (the dedicated memory GPUs use to train and run AI models) at current market prices.
Second, the U.S. Federal Government has green-lit the sale of NVIDIA H200 chips to China, a sale previously embargoed in an attempt to safeguard American dominance in the market. In addition to opening a huge new market for AI chips, this will further juice demand on an already-strained supply chain.
Finally, there is political instability in the Taiwan Strait, which separates mainland China from the island of Taiwan and has recently been the site of intense military drills by the Chinese government. Taiwan Semiconductor Manufacturing Company (TSMC), which holds roughly two thirds of the global semiconductor (read: computer chip) foundry market, is located there. Taiwan’s independence is actively disputed by the government of China, which considers the island a rogue province.
While the scaling demand for components like DRAM is unlikely to end any time soon, it could be exacerbated either by a spike in foreign GPU demand (from China or otherwise) or by any commercial disruption arising from the dispute over Taiwan.
Those who believe they will be insulated from these problems by leveraging cloud hyperscalers such as AWS, Azure, or GCP are likely in for a shock. The aforementioned DRAM shortage has already impacted them, with orders for memory modules going only partially fulfilled. Complicating matters is the pushback that these companies (and their competitors) have received from local communities as anti-data-center sentiment continues to spread. This pushback is no longer isolated; it is becoming endemic across the U.S.
The following paragraphs contain inferences and educated hypotheses based on the observations above.
These pressures will have to cash out somewhere. The first wave will likely hit spot instances (which use “spare” capacity in the cloud and are thus subject to frequent price changes), followed by on-demand users (the traditional “pay-as-you-go” model), before finally trickling up to larger businesses when their Enterprise contracts come up for renewal. This inference rests simply on the ease with which hyperscalers can alter each of these prices, and while the exact timing depends on many factors, the scenario seems increasingly likely to come true.
At the time of writing, Amazon Web Services has raised prices on its GPU-equipped EC2 instances by as much as 15%.
We previously recommended that businesses with many AI users switch to a pay-per-token model for long-term cost-effectiveness. The likely market changes affect this recommendation substantially: while the reasoning behind it remains sound, and market instability only strengthens the incentive to switch, it now seems unlikely that even per-token pricing will remain unaffected.
A straightforward flowchart for AI adoption may no longer be viable by the end of the quarter. At minimum, we recommend that organizations examine their exposure to the price shocks that may be incoming and look for ways to control it.
For firms with substantial seat subscriptions, one likely hedge against these risks is to move from per-seat to per-token pricing at the end of the current contract, or as soon as possible if licensed per-seat per-month.
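To see where that switch pays off, it helps to find the break-even point between the two pricing models. The sketch below uses purely hypothetical numbers (the $30 seat price and $3 per million tokens are assumptions for illustration, not vendor quotes); substitute the figures from your own contract.

```python
# Hypothetical break-even sketch: per-seat vs. per-token pricing.
# All prices below are illustrative assumptions, not quotes from any vendor.

SEAT_PRICE_PER_MONTH = 30.00      # assumed per-seat subscription cost (USD)
TOKEN_PRICE_PER_MILLION = 3.00    # assumed blended cost per 1M tokens (USD)

def monthly_cost_per_seat(seats: int) -> float:
    """Flat subscription cost: every licensed user costs the same."""
    return seats * SEAT_PRICE_PER_MONTH

def monthly_cost_per_token(total_tokens: int) -> float:
    """Usage-based cost: you pay only for tokens actually consumed."""
    return total_tokens / 1_000_000 * TOKEN_PRICE_PER_MILLION

def breakeven_tokens_per_user() -> float:
    """Monthly tokens per user above which per-seat becomes the better deal."""
    return SEAT_PRICE_PER_MONTH / TOKEN_PRICE_PER_MILLION * 1_000_000

if __name__ == "__main__":
    seats = 500
    avg_tokens_per_user = 2_000_000   # assumed average monthly usage
    print(f"Per-seat:   ${monthly_cost_per_seat(seats):,.2f}/month")
    print(f"Per-token:  ${monthly_cost_per_token(seats * avg_tokens_per_user):,.2f}/month")
    print(f"Break-even: {breakeven_tokens_per_user():,.0f} tokens/user/month")
```

Under these assumed prices, per-token pricing wins for any user averaging under roughly 10 million tokens per month; very heavy users may still be cheaper on a seat, which is why the calculation is worth running against your actual usage data.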
For companies with significant cloud contracts or large amounts of provisioned capacity, we recommend a contingency plan for handling sharp spikes in compute (and/or token) costs from your cloud provider, as providers continue to see increased demand with limited ability to expand capacity.
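A contingency plan starts with knowing what a spike would actually do to your bill. The sketch below stress-tests a monthly cloud budget against hypothetical spike scenarios; every spend figure and multiplier is an assumption for illustration, so replace them with your own invoice line items.

```python
# Hypothetical stress test: how a sudden compute price spike hits a monthly budget.
# Spend figures, multipliers, and the budget ceiling are illustrative assumptions.

baseline_monthly_spend = {
    "gpu_instances": 40_000.0,     # assumed on-demand GPU instance spend (USD)
    "inference_tokens": 15_000.0,  # assumed pay-per-token API spend (USD)
    "general_compute": 25_000.0,   # assumed CPU/storage/network spend (USD)
}

# Scenario multipliers: the ~15% EC2 GPU increase noted above,
# plus a harsher what-if worth planning for.
scenarios = {
    "observed +15% GPU": {"gpu_instances": 1.15},
    "severe +50% GPU, +25% tokens": {"gpu_instances": 1.50,
                                     "inference_tokens": 1.25},
}

monthly_budget = 90_000.0  # assumed budget ceiling (USD)

for name, multipliers in scenarios.items():
    total = sum(spend * multipliers.get(category, 1.0)
                for category, spend in baseline_monthly_spend.items())
    overage = total - monthly_budget
    flag = "OVER BUDGET" if overage > 0 else "within budget"
    print(f"{name}: ${total:,.0f}/month ({flag}, delta ${overage:+,.0f})")
```

Even this crude model makes the conversation concrete: the observed 15% GPU increase stays within the assumed budget, while the severe scenario blows through it, telling you how much headroom (or pre-negotiated protection) your contract actually needs.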
Finally, we recommend that enterprises begin experimenting immediately with open-source or open-weight models (working with the appropriate security and compliance teams) for workloads currently routed to proprietary models, potentially moving these workloads on-premise if necessary to insulate themselves from these price shocks. Since these price changes will likely only be negotiable for larger enterprises, mid-size firms will find this most urgent.
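Experimentation here can be low-friction. Many self-hosted inference servers (vLLM and Ollama, for example) expose an OpenAI-compatible endpoint, so existing client code often needs only a configuration change. The sketch below assumes such a server is already running locally; the base_url, port, and model name are placeholders for your own deployment.

```python
# Minimal sketch: routing a workload to a locally hosted open-weight model
# behind an OpenAI-compatible endpoint (e.g., one served by vLLM or Ollama).
# The base_url and model name below are placeholders, not a real deployment.

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local inference server
    api_key="not-needed-for-local",       # local servers typically ignore this
)

response = client.chat.completions.create(
    model="your-open-weight-model",       # hypothetical model identifier
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize this invoice dispute in two sentences."},
    ],
    temperature=0.2,
)

print(response.choices[0].message.content)
```

Because the interface matches the proprietary API, routing a workload back and forth between hosted and local models becomes a configuration change, which makes side-by-side cost and quality comparisons cheap to run before committing to a migration.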
As the compute crunch hits, efficiency and efficacy become the name of the game for companies without bottomless venture capital budgets. Are you looking to build AI solutions that don’t break the bank and show ROI? Reach out to Concord to get started, or add me on LinkedIn to start the conversation!
Not sure on your next step? We'd love to hear about your business challenges. No pitch. No strings attached.