The Energy Conundrum of Home AI
To really achieve the level of “Ubiquitous AI” many dream of, current models are unsustainable. Society, governments and businesses need prioritise efficiency gains.
Yes, they need to prioritise efficiency gains, smaller models and consider electrification demands. Why?
The arrival of a new competitor from China shocked many in late January 2025. It was the fastest downloaded app in Apple’s App Store, but it was not trained on $5,57M as uninformed media spun for days. DeepSeek not only performed on par with existing models, it had been trained with new efficiencies that I analysed in that article. There was little surprise for those of us developing AI solutions and keeping track of publications. Since the release of ChatGPT3.5 just weeks before Christmas 2022, most novel ideas in the field had Chinese names and surnames as authors in academic papers. When not, Chinese names and surnames populated the co-author positions. OpenAI published its ChatGPT3.5 paper and took a strategic turn to become “ClosedAI”. Meta took the challenge to become the “open-source champion”, and generously provided Llama to the community. Yet training these models was well beyond the capabilities and pocket of most enterprises. The most the rest of us could aspire to was to choose which AI model out of the 6-7 in the world we’d use in the future.
Just as restrictive commerce laws by the Bourbons only encouraged massive amounts of “alternative commerce” (that is, smuggling), hidden local wealth and eventually the creation of independence movements in the Spanish Americas, a ban on Graphic Processing Units (GPU) by the US on China only encouraged alternative engineering solutions in a nation bent on technology independence. (The Bourbons also tried to make everyone pay taxes both sides of the Atlantic, which is hard to palate after more than 200 years of not doing so).
Chinese research focused on an efficiency revolution in AI training. An example of this is GaLore (Gradient Low-Rank Projection) from 2024. GaLore was remarkedly unnoticed as a paper in 2024. It describes a training strategy that allows full-parameter learning but is more memory-efficient than usual low-rank adaptation methods, such as LoRA. You may wonder why we would need to train our own AI model in a home GPU. I’ll explain more later, as well as the power implications for society as a whole. For now, let’s consider that young, feverishly nationalistic scientists at DeepSeek operate in a society obsessed with self-sufficiency—one where optimising every step, eliminating inefficiencies, and minimizing dependencies are ingrained in their thought processes. This contrasts sharply with certain US corporate cultures, where the pursuit of 'bigger and more colossal' solutions, regardless of cost, is often the default mindset.
GaLore reduces optimizer state memory usage by up to 65.5%, enabling 7B-parameter model training on consumer GPUs (e.g., NVIDIA RTX 4090), injecting 19,4B tokens. A quick online search shows these gaming-grade cards average just over $3,000, though some can be found for as low as $1,600. While I’m not claiming DeepSeek’s scientists used GaLore specifically, their ethos aligns closely with its principles. Rumours suggest they’ve bypassed hardware limitations—either through hardware modifications (like removing pins to parallelise chips) or by developing training methods independent of Nvidia’s proprietary compilation tools (as explained in their paper “DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning”). This could theoretically allow any GPU to be used, which might explain Nvidia’s recent stock decline. Whatever the case, the outcome is clear: entrenched monopolies are eroding, and access to advanced AI is becoming democratised. Anything’s possible. Even other hardware manufacturers producing cheaper GPUs or GPU alternatives. Huawei is my favourite to shock the market soon.
The Rise of Home AI: Sovereign Computing Takes Shape for Ubiquitous AI
The vision of “sovereign AI”—where households or businesses run localised, private models—is now feasible. A home server with a 7B-parameter model, trained using GaLore, could operate on a single 24GB GPU, consuming minimal power compared to industrial-scale data centres. This would mean adding the equivalent of a high-end home PC to your energy bill. You might wonder why I insist so much on private models. Let me tell you about two conversations I had with two scientists, one from my company, Pangeanic, the other from a UK government department.
The first conversation stems from a family discussion and the need of this young family to multitask and juggle work, raising two young kids, and manage education and information. The idea of a home server not connected to the internet with a small language model that easily classifies data input, finds and stores important bills, copies of passports and IDs, tax returns, children’s homework, family pictures, updates your calendar on household and family duties and saves all sort of information made perfect sense to me from a family point of view and a business proposal. I wish I’d had one 20 years ago. The second conversation also pivoted on small language models that were not connected to the internet, yet provided all sort of deep detection, classification, translation and transcription duties.
Ubiquitous AI will help us in ways unkown
Both trajectories align perfectly with a recent Mozilla evaluation that demonstrates that even models small enough to run on consumer hardware can perform complex domain-specific tasks like clinical summarization with high accuracy. And this creates a compelling case for what I’ve began to describe as “Home AI” systems that:
- Preserve data sovereignty to keep sensitive family information (passports, photos, tax documents) on premises;
- Operate with reasonable, modest hardware requirements utilising distilled, quantized models like DeepSeek-R1-Distill-Llama-8B-Q4_K_M;
- Maintain high-quality results and achieve performance that's only marginally lower than state-of-the-art cloud models
The Mozilla team's evaluation using Llamafile versions of these models is particularly relevant. Llamafile packages allow models to be distributed and run as single files without complex infrastructure, making them ideal for home deployment. (I know developers always want the perfect work of art, but remember the old saying: “Don’t let the great be the enemy of the good”.)
Mozilla’s evaluation proves small, distilled models are good for local deployment
This breakthrough allows individuals and small businesses to train or fine-tune models locally—such as a “family AI” system for managing private data (tax records, family archives, etc.)—without relying on energy-intensive cloud infrastructure, and add to this advancements in Federated Learning where it is the model that comes to the data, it is not the data that is sent to the model, for privacy-preserving architectures. I touched upon this in my first talk in London at Big Data & AI World and it was my key recommendation to an audience increasingly concerned with data being their prime asset (pharmaceuticals, government, consulting, media). Your data is knowledge, so why would you give it away for free to closed AI systems in exchange for some chats or summaries?
Part of my presentation in London on how to use Federated Learning for privacy
Startups may soon offer pre-configured ‘AI-in-a-box’ solutions, combining efficient training frameworks with user-friendly interfaces. While this decentralisation reduces reliance on Big Tech (my second point today), it raises questions about energy consumption at scale. Point three is ‘Ubiquitous AI’: as compact systems operate everywhere—even in devices like ‘AI Armchairs’ (see image above)—the projected increase in electricity demand might prove far more modest than we imagine in 2025. Today, we’re still in the infancy of AI systems, using them primarily as convenient tools: a knowledge repository (despite hallucinations), a conversational partner, a writing assistant, a translator, a coder, or a math tutor. Yet their proliferation into everyday objects could redefine efficiency expectations altogether.
Do We Need a Massive Model for Every Task?
That recent research from Mozilla's evaluation of DeepSeek R1 models for clinical summarisation provides a perfect case study in right-sizing AI applications. Their findings directly challenge the assumption that bigger models are always better. Pangeanic has deployed a DeepSeek 14B, conveniently hacked so it answers questions the official model does not and our own tests point to a worthy model even for basic translation tasks, and very good for general language technology tasks like Named-Entity Recognition. To summarise Mozilla’s findings:
- A 14B parameter model (DeepSeek-R1-Distill-Qwen-14B) achieves nearly identical performance to the full 671B parameter DeepSeek R1 on clinical summarization tasks
- Several smaller, quantized models delivered surprisingly strong results even with significantly reduced computational requirements
- The results varied by only 1% between the 14B, 32B, 70B, and full 671B models on G-Eval metrics
As Nathan Brake, Davide Eynard, Dimitris Poulopoulos, and Irina Vidal Migallón concluded in their analysis: “The question becomes not 'Which model is the best?', but rather, 'What's the smallest model that will get the job done?'”
And The Consequences of The Rise of Home AI and Sovereign Computing
It doesn’t take much to think that if “Home AI” and “Sovereign AI” are strongly in the cards (which explains the shivers that the release of DeepSeek sent down so many spines), the next logical step is on-device AI—something that even Silicon Valley companies were considering before January 25. Consider these points:
1. Personal sovereignty over data: Family photos, financial records, passport information, and tax documents would remain within the physical boundaries of the home. Family and friend networks that push users away from social media just for the sake of sharing pictures. (If your aim is to get “likes” from people you don’t know across the globe, that is another matter).
2. Continuous learning from corporate / organisation / family context: The system could build a knowledge graph specific to your circumstances (business, family, understanding relationships, preferences, and needs). This would not be shared for 3rd party training.
3. Subscription models might be affected: I’m not sure how deep this could run, but there would be a disruption in some “freemium” and subscription models, where users are the actual product providing tones of data to the company with their comments, posting, pictures, etc. Rather than paying monthly fees to cloud AI providers, a one-time investment in hardware could provide ongoing AI services. There may be collateral consequences, just as self-driving cars or even a subscription model to buy/share a car would affect the market of car parks in cities, and even re-shape some of our houses (designed to park a car that families and individuals may not use).
This approach to “Sovereign AI” could indeed become a significant business opportunity. We're already witnessing the early indicators of a market shift from centralised cloud AI to edge and home-based AI solutions. The parallel with personal computing is apt—what was once the domain of large institutions is becoming accessible to individuals. Loosing $700,000 a day in data centre operations does not sound like a compelling business case even when your revenue grows more than 1300%. You may end up having to charge $200 or $20,000 for your models and scare the people you set up to serve. Worse still, you may call your own government to intervene and ban the competition.
As reported by TechCrunch
Finally, The Energy Implications
It is clear that an increase in the proliferation of home AI systems, alongside increasing EV adoption and home electrification, would translate into growing power demands. However, the picture is more complex than it might initially appear. I welcome comments, as this is still an informed speculation.
1. Efficiency gains change the equation: Technologies like GaLore and further and further experiments like Pangeanic’s and Mozilla’s evaluations demonstrate that small, self-hosted AI systems are increasingly efficient, requiring less computational power to achieve similar results to the ones we get now with large LLMs. The comparison here would be a gasoline-hungry early 20th century model to current automobiles (and a complete change of paradigm with EVs).
2. Distributed load vs. centralised demand: Home-based AI distributes power consumption across the grid rather than concentrating it in data centres. These systems could well be powered by solar panels, wind power or local energy sources, just like some EVs are recharged at home and there’s power left to charge batteries for home power. I’ve know of companies that install solar panels that power small servers that mine cryptocurrency.
From a government point of view, a balanced approach to energy infrastructure seems most prudent. Nuclear power provides reliable baseload generation with minimal carbon emissions, while renewable energy offers increasingly cost-effective and rapidly deployable capacity. The ideal mix would likely include strategic nuclear plants for stable baseload power, expanded solar and wind capacity for peak demands, grid-scale storage to balance intermittency, modern grid infrastructure that can handle bidirectional power flows
But in any case, the current rush for ultra, mega power plants for AI seems too short-sighted if it is assuming the current technology will scale.





Love this!