We are publishing this article over the weekend because what has happened in the last few hours is not just another piece of Artificial Intelligence news. It is an indication of where the sector is heading right now. And unfortunately, it is not good news.
The United States government has restricted foreigners' access to Anthropic's Fable 5 and Mythos 5 models, citing national security reasons. According to Anthropic itself and to other sources such as Reuters and The Verge, this has led to access to these models being suspended abruptly, affecting foreign users and employees.
We are not going to judge whether it was the right decision or not. What matters is what it shows: Artificial Intelligence in cloud environments does not depend only on the technical quality of the model, the price per token or the company that provides the service. It also depends on political, geostrategic and regulatory decisions.
And that changes a lot of things.
For years, it has been said that the Cloud was the path for everything: more scalable, more convenient and simpler. In some cases it still is. But when a company depends on models that are outside its control, it also depends on their conditions, prices, limits, policy changes and, as seen recently, government decisions that can alter access from one day to the next.
At the same time, another movement has been running in parallel: local and open-weight AI models have been improving at a rapid pace. This is especially true of the Chinese models published in recent months, such as DeepSeek, Qwen, Kimi or GLM, which have been proving that a great deal of work can be done with local AI, without an internet connection and without sending internal data to external platforms.
According to Stanford HAI's AI Index 2026, the performance gap between the best US models and the Chinese models has practically closed on several metrics, with only slight differences in 2026. Epoch AI estimates that since 2023 the best Chinese models have been, on average, seven months behind the US models, with a range of four to fourteen months. In a separate analysis, Epoch also notes that models that can run on consumer hardware lag behind by six to twelve months.
In other words: we are not saying that local models can replace the best Cloud models in the world in every task. What we are saying is that, for most everyday work, local models are already good enough.
Summarising documents. Translating texts. Classifying emails. Reading internal documentation. Helping customer support. Analysing system logs. Writing drafts. Creating internal assistants. Answering questions about procedures. Preparing reports. Reviewing text. Querying information inside a private database. And building AI agents for businesses: agents that can consult documentation, call tools, help employees, prepare responses, review incidents, interpret logs or help with internal processes, using data that does not need to leave the organization.
For most of these tasks, the question is no longer "Can I only do it in the Cloud?". The question is starting to be: "Is it reasonable to send this data outside the company and pay for each query, if I can do it locally with total privacy, more control and lower cost?"
On top of this comes another factor: the real cost of AI in the cloud. Even though some providers have reduced per-token prices for some models, a company's total cost can rise rapidly when AI goes from being an occasional chat tool to being integrated into real processes. An agent does not just ask a single question: it reads context, calls tools, reviews documents, generates answers, corrects, summarises and tries again. Each step consumes tokens.
Frontier models still carry considerable prices. Claude Fable 5, for example, is listed at 10 dollars per million input tokens and 50 dollars per million output tokens on Anthropic's official page. OpenAI has also published GPT-5.5 with API prices of 5 dollars per million input tokens and 30 dollars per million output tokens, and a GPT-5.5 Pro version at 30 dollars per million input tokens and 180 dollars per million output tokens, according to its official announcement. And Microsoft has announced price updates for Microsoft 365 related to new AI, security and management capabilities, effective from July 1st 2026, according to its official announcement.
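The per-token prices quoted above can be turned into a rough per-run estimate. The following is a minimal sketch, not a billing tool: the prices are the ones cited in this article, while the three agent steps and their token counts are hypothetical values chosen only for illustration.

```python
# Prices in USD per million tokens (input, output), as quoted in this article.
PRICES = {
    "Claude Fable 5": (10.0, 50.0),
    "GPT-5.5": (5.0, 30.0),
    "GPT-5.5 Pro": (30.0, 180.0),
}

# Hypothetical agent run: (input_tokens, output_tokens) for each step.
# These counts are assumptions for illustration, not measurements.
AGENT_STEPS = [
    (20_000, 1_000),  # read context
    (30_000, 2_000),  # call tools and review documents
    (35_000, 3_000),  # generate, correct and summarise
]


def step_cost(model, input_tokens, output_tokens):
    """Cost in USD of one request with the given token counts."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def agent_cost(model, steps=AGENT_STEPS):
    """Total cost in USD of a multi-step agent run."""
    return sum(step_cost(model, i, o) for i, o in steps)


if __name__ == "__main__":
    for model in PRICES:
        print(f"{model}: {agent_cost(model):.3f} USD per run")
```

With these made-up token counts, a single run costs roughly 1.15 dollars at the Claude Fable 5 prices. Multiplying by the number of runs per day gives an idea of how quickly an integrated agent adds up.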
For this reason, a lot of companies are starting to look at local hardware differently. Not just as an upfront expense, but as infrastructure. Just as having local servers, backups, a NAS, firewalls or powerful workstations was once a strategic decision, running AI locally is starting to become one too.
But here is the catch: everyone is reaching the same conclusion at the same time.
AI needs memory. Lots of memory. RAM, VRAM, fast SSDs, bandwidth and processing power. And that demand does not only affect the big data centers: it also affects the consumer market for laptops, desktops, workstations, mini PCs, small servers and business devices.
TrendForce forecasts that contract prices for conventional DRAM will increase by between 58% and 63% in Q2 of 2026, while NAND Flash, the memory used in SSDs, could increase by between 70% and 75%. The main reason is that manufacturers are redirecting their memory production towards HBM, servers, enterprise-grade SSDs and AI-related applications.
This means that end users will see price increases in RAM modules, NVMe SSDs and configurations with more memory. And we are not talking about an isolated increase. In its May 2026 update, the report DRAM Contract Price May 2026, TrendForce also indicates that supplier inventory remains low and that the upward trend in PC DRAM prices will extend into Q3 and Q4 of 2026.
Indicative memory price index
Base Q4 2025 = 100. Q1 and Q2 use the midpoints of the price increases published by TrendForce. Q3 and Q4 are a projection of a stressed scenario to visualise the trend.

Note: the chart does not represent official market prices. It is only an indicative visualisation based on the data published by TrendForce for Q1 and Q2 of 2026 and on the upward trend for Q3 and Q4.
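The midpoint approach described for the index can be sketched in a few lines. This is only an illustration of the arithmetic, assuming each quarter's increase is applied on top of the previous level: the Q2 ranges are the TrendForce figures cited in this article, while the starting level of 100 is a placeholder, since the Q1 increase is not stated here.

```python
# Q2 2026 contract price increase ranges (percent), as cited in this article.
Q2_RANGES = {
    "DRAM": (58, 63),
    "NAND Flash": (70, 75),
}


def midpoint(low_pct, high_pct):
    """Midpoint of a forecast range, in percent."""
    return (low_pct + high_pct) / 2


def apply_increase(index_level, low_pct, high_pct):
    """Index level after one quarter, using the midpoint of the range."""
    return index_level * (1 + midpoint(low_pct, high_pct) / 100)


if __name__ == "__main__":
    start_level = 100.0  # placeholder starting level, not a published figure
    for memory_type, (low, high) in Q2_RANGES.items():
        level = apply_increase(start_level, low, high)
        print(f"{memory_type}: {start_level:.1f} -> {level:.1f}")
```

Chaining `apply_increase` quarter after quarter gives a compounded index. Any values used for Q3 and Q4 would be scenario assumptions, not published data.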
The SSD market has also shown worrying signs. Biwin, a manufacturer of SSDs and memory modules, signed a contract worth 1,860 million dollars to secure NAND supply over two years. When a manufacturer commits that much money to secure memory supply, it does not do so on a whim: it does so because of supply pressure, high prices and the difficulty of obtaining components on the open market, as Tom's Hardware reports.
Also, a few days ago AMD seemed to point in the same direction. David McAfee, a VP at AMD, says that the large amount of manufacturing under way gives reason for optimism, but that DDR5 prices should normalise in 2028. In other words: even AMD is talking about a price normalisation that is not imminent and that seems to depend on a large increase in memory manufacturing over the coming years. You can read more about it at Overclock3D.
And in the graphics card sector, the situation does not invite waiting either. The memory shortage is already affecting the manufacturing cost and the price of GPUs. Some high-end NVIDIA models, such as the RTX 5070 Ti, RTX 5080 or RTX 5090, have suffered price increases and low availability. The RTX 5090 in particular has disappeared from some listings or appeared at prices higher than MSRP, according to PC Gamer's price monitoring.
This is especially relevant for people who want to use local AI. In gaming, the deciding factor when buying a graphics card is FPS. In local AI, VRAM is what matters most. Running small models, internal assistants or lightweight tasks is not the same as working with bigger models, large contexts, vision-capable models, advanced RAG or several users at the same time.
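As a rough rule of thumb, the memory needed just to hold a model's weights is the number of parameters multiplied by the bytes used per parameter. The sketch below applies that arithmetic. The model sizes are generic examples, and the 20% overhead factor for context and runtime buffers is an assumption for illustration; real usage depends on context length, number of users and the inference software.

```python
def weights_gb(params_billion, bits_per_weight):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes).

    params_billion * 1e9 parameters * (bits / 8) bytes each, divided by 1e9.
    """
    return params_billion * bits_per_weight / 8


def estimated_vram_gb(params_billion, bits_per_weight, overhead_factor=1.2):
    """Weights plus a hypothetical overhead for context and buffers.

    overhead_factor is an assumed value for illustration, not a measurement.
    """
    return weights_gb(params_billion, bits_per_weight) * overhead_factor


if __name__ == "__main__":
    # Example model sizes (billions of parameters) and weight precisions.
    for params in (7, 32, 70):
        for bits in (16, 8, 4):
            print(
                f"{params}B at {bits}-bit: "
                f"weights {weights_gb(params, bits):.1f} GB, "
                f"estimate {estimated_vram_gb(params, bits):.1f} GB"
            )
```

The table this prints makes the point in the text concrete: precision and model size move the VRAM requirement by an order of magnitude, long before large contexts or multiple users are taken into account.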
For this reason, at Slimbook we think that, for a lot of companies and professionals, waiting to buy hardware could be a mistake.
This does not mean that you should buy without thinking. It means that planning is key. If a company knows that in the coming months it will need devices with more RAM, bigger SSDs, workstations with GPUs or local AI servers, waiting for prices to drop might not be a good idea.
The Cloud is here to stay. Frontier models will still be important. But not everything needs to depend on an external infrastructure, an external API or a provider that could change its prices, limits or access conditions.
Local AI is not just a technical matter. It provides security, sovereignty, predictable costs and continuity.
And now it is also about hardware availability.
So, if you are still waiting to buy hardware in the hope that prices will drop, you might be making a mistake. Local AI, open models becoming more capable every day, business demand, memory shortages, SSD prices and high graphics card prices all point in the other direction.
At Slimbook we have always defended devices that are ready to work with Linux, with full privacy and control. Today that philosophy applies here too: having your own hardware is not looking to the past. It is getting ready for the future.
Looking ahead: our reading for the coming months is cautious but clear. RAM and SSDs will remain in short supply, graphics cards with lots of VRAM will continue to have high prices and irregular availability, and the market is not expected to normalise until 2028, when the new manufacturing capacity comes online.
If you need a device for local AI, you can see our catalog of desktops and workstations at Slimbook. And if you are looking for something more customised than just an RTX 5090, an eGPU or a DGX Spark, we also sell and configure AI servers. You can write to us at [email protected] and we will study your case.