ONE: the Mini-PC for local AI with Ollama, llama.cpp or NPU
Comparison tables of CPUs, memory and models for setting up your own Artificial Intelligence and agents

Local artificial intelligence is no longer just for big servers or desktops with huge graphics cards. Today, a compact device like the SLIMBOOK ONE can run AI models locally to create private assistants, analyse documents, summarise texts, help with programming, translate, query knowledge bases or work with data without constantly sending it to the Cloud.

But the question is: which models can the SLIMBOOK ONE really run?

Which models you can run, depending on RAM

| Configuration | Recommended models | Recommended use cases | Comments |
|---|---|---|---|
| 16 GB RAM | Llama 3.2 1B / 3B, Qwen2.5 3B, Gemma 2 2B, Phi-3 Mini / Phi-3.5 Mini, DeepSeek-Coder 1.3B, some quantised 6B / 7B models | Basic chatbot, summarising short texts, classifying information, lightweight translation, simple personal assistant | A valid configuration for getting started with local AI or for lightweight tasks. It is not the best choice if you regularly want to work with more capable models. |
| 32 GB RAM | Llama 3.1 8B, Qwen2.5 7B, Mistral 7B, Gemma 2 9B, DeepSeek-Coder 6.7B, embeddings such as Nomic Embed, BGE or MiniLM | Local office assistant for a small team, small private chatbot, document queries, lightweight RAG over documentation, text generation, basic help with programming | For most users, 32 GB is already enough to experience local AI properly, especially with quantised models and simple tools like Ollama. |
| 64 GB RAM | Qwen2.5 14B, Qwen2.5-Coder 14B, Mistral Nemo 12B, Llama 3.1 8B with large context, DeepSeek-Coder 14B / 16B quantised, some quantised 20B models | More capable internal assistants, help with programming, log analysis, technical support, business documentation, technical writing, local agents with more context | For us, 64 GB of RAM is the sweet spot if you want to use the SLIMBOOK ONE as a serious local AI device. It is the most balanced configuration in terms of size, power consumption, price and real capabilities. |
| 96 GB / 128 GB RAM | Qwen 32B quantised, DeepSeek 32B quantised, Llama 3.1 70B quantised, big models in GGUF format | Agents with more context, document bases, local AI lab, tests with big models, research and development, privacy, heavy documentation analysis, scenarios where local control matters more than maximum speed | Lets you load big models, but it should not be confused with a high-end workstation with a dedicated GPU. Being able to load a big model does not mean it will run fast. |
| ONE + eGPU | Depends on the VRAM of the external graphics card | Mid-range accelerated inference, moderate computer vision, heavy models, image generation, multimedia processing, dedicated GPU workflows | The most powerful option if you need GPU acceleration. The ONE can stay compact for normal day-to-day use and grow with an eGPU when needed. |
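As a rough way to read the table above, the memory a quantised model needs can be approximated from its parameter count and the number of bits stored per weight. The sketch below is only a back-of-the-envelope estimate: the function name, the 4.5 bits per weight (roughly a 4-bit quantisation including its scaling data) and the 1.5 GB allowance for context and runtime buffers are illustrative assumptions, not figures from Slimbook or from any specific tool.

```python
def estimate_model_memory_gb(params_billions, bits_per_weight=4.5, overhead_gb=1.5):
    """Rough memory needed to load a quantised model, in GB (10**9 bytes).

    bits_per_weight and overhead_gb are assumed values for illustration only.
    """
    # 1e9 parameters * (bits / 8) bytes each, expressed in GB
    weights_gb = params_billions * bits_per_weight / 8
    return weights_gb + overhead_gb


# Model sizes taken from the table; the results are estimates, not measurements
for name, size in [("3B", 3), ("8B", 8), ("14B", 14), ("32B", 32), ("70B", 70)]:
    print(f"{name}: ~{estimate_model_memory_gb(size):.1f} GB")
```

The operating system, other applications and longer contexts also need RAM, so it is safer to treat the result as a lower bound and leave generous headroom.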


AMD Ryzen 7 H 255 and Ryzen AI 9 HX 370: CPU vs CPU, and CPU vs NPU

The SLIMBOOK ONE can be configured with different AMD processors. For local AI, the two worth comparing are the AMD Ryzen 7 H 255 and the AMD Ryzen AI 9 HX 370.

The Ryzen 7 H 255 is a very capable CPU for a mini-PC. It has 8 cores and 16 threads plus Radeon 780M integrated graphics, and it runs local models on the CPU and RAM and, when the software supports it, with graphics acceleration. It is an attractive option for anyone who wants a powerful, compact and balanced device.

The Ryzen AI 9 HX 370 goes one step further. It has 12 cores and 24 threads, Radeon 890M integrated graphics and, on top of that, a dedicated NPU for artificial intelligence: up to 50 TOPS on the NPU and 80 TOPS in total when all the processing blocks are combined.

With the right software, the NPU speeds up AI workloads, because it is designed specifically for the kind of operations that AI algorithms rely on.

This matters because we are not just talking about "more CPU". We are talking about an architecture designed for the new generations of AI software.

In other words: 

| Model | CPU | Integrated GPU | NPU | Recommended use |
|---|---|---|---|---|
| ONE Ryzen 7 H 255 | 8 cores / 16 threads | Radeon 780M | No NPU | Local AI with CPU and RAM, Ollama, llama.cpp, quantised models and general use |
| ONE Ryzen AI 9 HX 370 | 12 cores / 24 threads | Radeon 890M | 50 TOPS NPU / 80 TOPS total | Higher-performance local AI: better CPU and iGPU, plus the NPU with the proper software |


So, is the Ryzen AI 9 HX 370 the better choice for AI?

Yes. If we are talking about potential for local artificial intelligence, the Ryzen AI 9 HX 370 is clearly the most interesting choice.

It has more CPU cores, more powerful integrated graphics and a dedicated NPU. The NPU does not replace a high-end dedicated graphics card, but it lets you run some AI workflows more efficiently, with lower power consumption and without tying up the CPU or GPU.

The key is the software. Today, most local AI tools still rely on the CPU, RAM, integrated GPU or a dedicated GPU, but more and more tools are adding support for the NPUs in modern processors.

For this reason, the Ryzen 7 H 255 can be enough for a user who just wants to get started with local AI. For those who want a more future-proof device for efficient AI, our recommendation is the SLIMBOOK ONE AI9 with the Ryzen AI 9 HX 370.

Ollama and llama.cpp: the simplest way to start

For most users, the easiest way to start with local AI is Ollama.

Ollama lets you download and run models such as Llama, Qwen, Mistral, Gemma, Phi or DeepSeek with simple commands. For example:

ollama run llama3.2

Or also:

ollama run qwen2.5

Ollama builds on technologies such as llama.cpp, one of the most important projects in the local AI ecosystem. llama.cpp runs models in optimised formats such as GGUF and uses quantisation, which lets many language models run on personal devices without depending on the Cloud.

In other words:

  • llama.cpp is one of the most important technical foundations for running optimised models locally.

  • Ollama lets you run these models with ease.

  • GGUF and quantised models let you run big models using less memory.

For this reason, Ollama and llama.cpp are a recommended combination for getting started with local AI on the SLIMBOOK ONE.
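To give an intuition for why quantised models use less memory, the sketch below stores a small block of weights as 4-bit integer codes plus a single scale factor, then reconstructs approximate values from them. It is a deliberately simplified illustration and does not reproduce the actual GGUF formats used by llama.cpp; the function names and the sample values are made up for the example.

```python
def quantise_block(weights, bits=4):
    """Symmetric quantisation: one float scale plus small signed integer codes."""
    qmax = 2 ** (bits - 1) - 1              # 7 for 4-bit codes
    peak = max(abs(w) for w in weights)
    scale = peak / qmax if peak else 1.0    # avoid dividing by zero for an all-zero block
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return scale, codes


def dequantise_block(scale, codes):
    """Recover approximate weights from the stored scale and codes."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only for illustration
block = [0.12, -0.50, 0.33, 0.07, -0.21, 0.44, -0.02, 0.29]
scale, codes = quantise_block(block)
restored = dequantise_block(scale, codes)
max_error = max(abs(a - b) for a, b in zip(block, restored))

print("codes:", codes)
print(f"largest rounding error: {max_error:.4f}")
```

Each weight now takes 4 bits instead of 16 or 32, at the cost of a small rounding error, which is the trade-off that lets larger models fit in the RAM of a compact device.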

Ollama is ideal for:

  • Users that want to start fast

  • Businesses that want to try local assistants

  • Internal chatbots

  • RAG over documentation

  • Automation

  • Tests with different models

  • Prototype development

  • Models like Llama, Qwen, Mistral, Gemma, Phi or DeepSeek
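For automation, internal chatbots or prototypes, Ollama can also be called from code through the HTTP API it serves on the local machine. The following is a minimal sketch using only the Python standard library. It assumes Ollama is running on its default port (11434) and that the `llama3.2` model has already been downloaded; the helper names `build_request` and `ask` are made up for this example.

```python
import json
import urllib.request

# Default address of a locally running Ollama service (assumed unchanged)
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3.2"):
    """Build the POST request for a single, non-streamed completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="llama3.2", timeout=120):
    """Send the prompt to the local Ollama service and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


# Example call (requires the Ollama service to be running):
# print(ask("Summarise in one sentence why local AI helps with privacy."))
```

Because the request never leaves `localhost`, the prompt and the answer stay on the device. Swap `model` for any model you have installed, such as `qwen2.5`.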

FastFlowLM and the NPU: efficient AI on Ryzen AI

On models with AMD Ryzen AI, such as the SLIMBOOK ONE AI9, another element comes into play: the NPU.

As mentioned earlier, the NPU is not a substitute for a high-end dedicated graphics card. Its strength is running some AI workflows more efficiently, with lower power consumption and without tying up the CPU or GPU.

Solutions like FastFlowLM fit well here: they are designed to use the NPU of Ryzen AI processors to run language models.

While Ollama stands out for its ease of use and model variety, FastFlowLM aims to run local AI models more efficiently on the NPU.

This can be interesting for:

  • Local assistants that are always active

  • Lightweight inference

  • Compact devices

  • Low power consumption

  • Scenarios where we want to reserve the usage of CPU and GPU for other tasks

  • Tests with the integrated NPU of Ryzen AI

In other words:

Ollama is the easiest option to start.
llama.cpp is the key technical foundation for running optimised models locally.
FastFlowLM is an advanced option for those who want to experiment with the NPU and the efficiency of Ryzen AI.

Which is better: Ollama, llama.cpp or FastFlowLM?

They are not the same kind of tool, and they do not cover the same ground.

| Tool | Excels at | Use case |
|---|---|---|
| Ollama | Ease of use, lots of models, simple commands | Users, businesses and developers who want to start fast |
| llama.cpp | Technical base, GGUF models, quantisation, advanced control | Technical users, integrators and developers |
| FastFlowLM | Using the NPU of Ryzen AI, efficiency, low power consumption | Advanced users who want to experiment with the NPU |


Our recommendation would be: 

Ollama to start
llama.cpp for those who want more technical control
FastFlowLM to use the NPU of Ryzen AI 

What about the Integrated GPU?

The SLIMBOOK ONE AI9 has Radeon 890M integrated graphics, while the Ryzen 7 H 255 model has the Radeon 780M. These iGPUs keep getting more powerful and can help with some workloads, depending on the operating system, drivers, backend and software compatibility.

In any case, when it comes to big models, heavy workloads or maximum inference speed, a dedicated GPU with its own VRAM will always be ahead of an iGPU.

For this reason, it is important to know the different levels:

  1. CPU + RAM
    It is the most universal base. Lets you run several quantised models.

  2. llama.cpp / Ollama
    They make running local models easier and are ideal for getting started.

  3. NPU Ryzen AI
    Interesting for efficiency, low power consumption and new tools like FastFlowLM.

  4. Dedicated GPU or eGPU
    Recommended when you want speed, heavier models or vision/generation tasks.
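One reason a dedicated GPU makes such a difference is that text generation is usually limited by how fast the model weights can be read from memory, so faster memory tends to mean more tokens per second. The sketch below turns that rule of thumb into a rough upper bound. It is a simplification that ignores compute limits, caching and software overhead, and the bandwidth and model-size figures are illustrative assumptions, not measurements of the SLIMBOOK ONE or of any particular graphics card.

```python
def max_tokens_per_second(model_size_gb, memory_bandwidth_gbs):
    """Upper bound on generation speed if every token reads all weights once."""
    return memory_bandwidth_gbs / model_size_gb


# (model size in GB, memory bandwidth in GB/s): assumed values for illustration
scenarios = {
    "8B 4-bit model in system RAM": (4.5, 90),
    "70B 4-bit model in system RAM": (39.4, 90),
    "8B 4-bit model in dedicated VRAM": (4.5, 500),
}

for label, (size_gb, bandwidth) in scenarios.items():
    limit = max_tokens_per_second(size_gb, bandwidth)
    print(f"{label}: at most ~{limit:.0f} tokens/s")
```

Real-world speeds will be lower than these bounds, but the ratios show why a big model that fits in RAM can still feel slow, and why VRAM on a dedicated GPU or eGPU helps.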

SLIMBOOK ONE + eGPU: when you need more power

One of the advantages of the SLIMBOOK ONE is that you can expand it with an eGPU over connections such as OCuLink or USB-C, depending on your configuration and needs.

This lets you connect an external graphics card and turns the ONE into a more powerful machine for AI workloads.

An eGPU can be interesting for:

  • Big models

  • Accelerated inference

  • Computer vision

  • Image generation

  • Multimedia processing

  • CUDA, ROCm or accelerated graphics workflows

  • Users who want a compact machine for day-to-day use, with extra power when needed

The concept is simple: the SLIMBOOK ONE can be an elegant and efficient mini-PC for daily use, and when a project needs more power, you can connect an eGPU to multiply your graphics capabilities.

Local AI: privacy, independence and control

Running AI locally is not just about performance. It is also about privacy and control.

When we use external artificial intelligence services, we are usually sending texts, documents, code, emails, reports or company information to third-party infrastructure. For personal use that might not be a problem, but for businesses, professionals, public administrations or sensitive projects it can be a serious issue.

With local AI you can:

  • Analyse documents without uploading them to the Cloud

  • Create private assistants

  • Query internal company information

  • Work with logs, code or sensitive data

  • Reduce your dependency on external providers

  • Avoid the variable costs of external providers

  • Keep more control over your tools

The Cloud is here to stay. There are huge models and specialised services that make sense to run outside your local device. But most day-to-day tasks can be handled perfectly well with local models.

Our recommendation

If you want to buy a SLIMBOOK ONE for local artificial intelligence, our recommendation would be:

To start

SLIMBOOK ONE with 32 GB of RAM

A good choice for 7B or 8B models, lightweight assistants, tests with Ollama and moderate personal or professional use.

Second level of local AI

SLIMBOOK ONE AI9 with 64GB of RAM

The most balanced option. It runs 7B, 8B, 12B and 14B models comfortably and also offers a better CPU, a better iGPU and an integrated NPU.

For bigger LLMs and more data

SLIMBOOK ONE AI9 with 128GB of RAM

Recommended for those who want to experiment with big models, more context, complex agents or document bases.

For more speed

SLIMBOOK ONE + eGPU

The most powerful option when you need acceleration from a dedicated GPU for heavy models, computer vision or demanding AI workflows.

Conclusion

The SLIMBOOK ONE proves that local AI does not need to take up a lot of space or always depend on the Cloud. In a compact, elegant and efficient form factor, you can run models like Llama, Qwen, Mistral, Gemma, Phi or DeepSeek, create private assistants, analyse documents, automate tasks and work with artificial intelligence under your control.

To start, Ollama is the easiest and recommended option. For those who want more technical control, llama.cpp is one of the key pieces of the local AI ecosystem. For those who want to go one step further, FastFlowLM lets you take advantage of the NPU of the Ryzen AI processors. And for heavy workloads, the SLIMBOOK ONE ecosystem can grow with the eGPU.

This is not about promising wonders. It is about choosing your configuration, model and tools wisely.

And that is where the SLIMBOOK ONE shines: a small computer powerful enough for the next generation of private and efficient local artificial intelligence.

Choose your SLIMBOOK ONE for local AI

If you want to start with local artificial intelligence, you can configure your SLIMBOOK ONE to suit your needs.

SLIMBOOK ONE AMD Ryzen 7 H 255 

SLIMBOOK ONE AMD Ryzen AI 9 HX 370

If you need more power for AI, you can expand your SLIMBOOK ONE with our Dock eGPU USB4 & OCuLink 800W, designed for heavy workloads such as accelerated inference, computer vision, image generation, multimedia processing or heavy models.

Buy Dock eGPU USB4 & OCuLink 800W

Important: the eGPU dock does not include a dedicated graphics card, which is sold separately. Ask us, or see some of the cards available here.

And as a larger and more powerful alternative, be sure to check out our desktop computers: the Nexus range and the AI Workstations.

Alejandro López Slimbook
16 June, 2026