I have always envisioned an AI server being part of a family's major purchases: when they buy a house, an appliance, etc., they also buy an 'AI system'.
Machine hardware evolution is slowing down; pretty soon you'll be able to buy one big-ass server that will last potentially decades, since it would be purpose-built for AI.
Things like 'context-based home security'? That's just automatic, free, part of the AI system.
Everyone will talk to the AI through their phones, and it'll be connected to the house. It'll hold the family's lineage info that may be passed down through generations, and it'll all be 100% owned, offline, for the family; a forever assistant that's just there.
- 6× faster CPU/GPU performance
- 6× faster AI performance
- 7.7× faster AI video processing
- 6.8× faster 3D rendering
- 2.6× faster gaming performance
- 2.1× faster code compiling
Over the span of 5 years.
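Taken at face value, those multiples work out to fairly modest compound annual rates. A quick sketch (`annual_rate` is just an illustrative helper, not something from the benchmark):

```python
# Hypothetical helper: convert an N-year speedup multiple into an
# implied compound annual improvement rate.
def annual_rate(multiple: float, years: float) -> float:
    return multiple ** (1.0 / years) - 1.0

# 6x over 5 years ~= 43% per year; 2.1x over 5 years ~= 16% per year.
print(f"{annual_rate(6.0, 5):.0%}")  # ~43%
print(f"{annual_rate(2.1, 5):.0%}")  # ~16%
```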
Plus, realistically what makes an "ai" server different from a computer? This "lineage info of the family may be passed down through generations" sounds nice but do you know anyone passing down a Commodore 64 or Apple II that remains in daily use? I fail to see how "ai" would protect something from obsolescence.
I have a good analogy. 10 years ago, I was convinced that a 24-inch 1080p monitor at arm's length was perfection. There could never be any reason to improve over it. I could do everything I ever wanted to, to a standard I would never need to improve upon.
Yet here we are. The simplest and most obvious improvement is a 24" 4k monitor at 200% scaling. Basically, better in every way.
There's a discussion to be had about whether you need the better setup, which I think is your point, but there's no denying you'd want it (all other variables the same).
At some point specs don’t matter. I don’t wonder about the processor in my thermostat either. I don’t know how many horsepower my XC90 has. I don’t know the rated power of my chainsaw.
All I care about is: do they work, are they ‘safe’, are they comfortable, etc.
Today, not much differentiates them. But as time passes, our only option for realistic gains will be to further specialize the hardware; at some point, a 'purpose-built analog computer' kind of thing might get so useful that it resembles the 'Standard Template Constructs' concept from Warhammer 30k. So what if you can make a faster AI? The current one can already 'teach everyone basically anything'.
It depends on what/how you're comparing. Core to core, according to CPU benchmark, the M1 is at 3600 vs the M5 at 5800, so we're still not quite to 2x.
Overall system performance is better, at about a 2x improvement, thanks to extra cores and other changes. I could see more specialized benchmarks improving more thanks to power/size/architecture improvements in other components (GPU/NPU/etc.).
If you bought a big-ass server for your home 10 years ago, it probably wouldn't even have had a GPU/AI accelerator at all. If it did, it would have been something with wimpy compute and VRAM, because you needed the video encoder/decoder for security cameras or the like.
I'm not sure that really gives confidence that hardware has slowed down enough to invest in it for decades. Single-core CPU performance has, but that's not really what new things are using.
It really just depends on if the hardware is "good enough" for whatever its purpose is. If the hardware today can locally run whatever models for your security cameras, it's likely they will still be "good enough" in 10 years.
Of course, similar to a 10 year old car or appliance, you will be missing any new features or bells and whistles that have become available in the meantime.
I agree; it's important to recognize that there are lots of use cases where computers have long since reached "good enough" and aren't really going obsolete anymore for those use cases.
My NAS is about 13 years old, the network switches it connects through are even older, and while 2.5GbE now exists, I have no need to throw out my "good enough" equipment to replace it with something marginally faster or more power efficient. I don't even really need to expand the storage of that NAS anytime soon: my music collection could never come close to filling it, my movie/TV collection isn't growing much anymore due to the shift to streaming, and the volume of other stuff that I need to back up from my other computers just isn't growing much over the years.
Decades is a long time for hardware, but "years" seems like a reasonable horizon soon. The commercial models are "good enough" for a lot of things now, so if that performance makes its way into the on-device space at "home appliance"-level cost (<$5k at the start, basically), I'd expect a lot of stuff to start popping up there. In offices too.
Like the PC in the 80s starting to eat up "get a mainframe" or "rent time on a mainframe" uses.
You're kind of undermining your own point. Ten years later, the only thing you'd need to upgrade for your home server might be the GPU, because a new use case emerged. Okay? Spend $500-$1000 on an eGPU. Problem solved. Will that eGPU setup last another ten years? If all it's doing is processing security video and routing claw-like tasks, then yes.
Not sure I follow: the fact that a server from 10 years ago would be completely unfit for purpose now shouldn't imply that the one you buy today would be the right hardware 10 years from now. Unless you can somehow guarantee that we've reached, in just these last few years, the final set of new requirements we will ever have, the GPUs you buy today will probably be just as irrelevant to the new requirements a decade from now.
Of course one can always upgrade components piecewise as requirements change, but I don't see why you need to invest in a big-ass server to do that. It'd be cheaper to go the route everyone has gone for decades at this point: upgrade with normal-sized stuff as needed, and don't try to make it an up-front, multi-decade home investment.
On the flip side, if you intentionally plan to lock in the capabilities to the kinds of things one can run today, and know you'll therefore never need to upgrade, then you can get whatever sized system makes sense for today's needs. You just need to be really sure you won't be interested in "the next big thing" when it comes, too.
Yeah, but how long do mainframes last? Think of the COBOL systems used in government. No reason to update them; they worked forever. Their job is discrete, and they performed it well enough that intense updating wasn't a requirement.
I don’t think there’s anything different between what you’re suggesting and a homelab. Most people do not have a homelab and are happy to offload services like photo storage or security to remote providers.
Based on our current trajectory, it seems more likely everyone will upload everything to the cloud and pay perpetual royalties to access their own data.
This is your reminder we're in a bubble inside of a bubble...
Most people don't even think about running network cables or mesh wifi when building a house; no one will buy a server to run AI in their physical home.
This is a very flashy page that's glossing over some pretty boring things.
- This is a benchmark for "home security" workflows. I.e., extremely simple tasks that even open weight models from a year ago could handle.
- They're only comparing recent Qwen models to SOTA. Recent Qwen models are actually significantly slower than older Qwen models, and other open weight model families.
- Specific tasks do better with specific models. Are you doing VL? There's lots of tiny VL models now that will be faster and more accurate than small Qwen models. Are you doing multiple languages? Qwen supports many languages but none of them well. Need deep knowledge? Any really big model today will do, or you can use RAG. Need reasoning? Qwen (and some others) love to reason, often too much. They mention Qwen taking 435ms to first token, which is slow compared to some other models.
Yes, Qwen 3.5 is very capable. But there will never be one model that does everything the best. You get better results by picking specific models for specific tasks, designing good prompts, and using a good harness.
And you definitely do not need an M5 mac for all of this. Even a capable PC laptop from 2 years ago can do all this. Everyone's really excited for the latest toys, and that's fine, but please don't let people trick you into thinking you need the latest toys. Even a smartphone can do a lot of these tasks with local AI.
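The "specific models for specific tasks" point above can be sketched as a tiny router. All model names here are invented placeholders, not recommendations:

```python
# Illustrative sketch of per-task model selection: dispatch each request
# to a task-specific local model instead of one do-everything model.
ROUTES = {
    "vision":    "tiny-vl-model",      # VL tasks: small vision-language model
    "reasoning": "reasoning-model",    # tasks that benefit from chain-of-thought
    "general":   "small-chat-model",   # everything else
}

def pick_model(task_type: str) -> str:
    """Return the model assigned to a task type, falling back to general."""
    return ROUTES.get(task_type, ROUTES["general"])

print(pick_model("vision"))   # tiny-vl-model
print(pick_model("unknown"))  # small-chat-model
```

A real harness would also carry per-model prompts and sampling settings, but the routing decision itself is this simple.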
The M5 Pro just dropped, so here's a real AI workload instead of another Geekbench score. We run Qwen3.5 as the brain of a fully local home security system and benchmarked it against OpenAI cloud models on a custom 96-test suite. The Qwen3.5-9B scores 93.8% — within 4 points of GPT-5.4 — while running entirely on the M5 Pro at 25 tok/s, 765ms TTFT, using only 13.8 GB of unified memory. The 35B MoE variant hits 42 tok/s with a 435ms TTFT — faster first-token than any OpenAI cloud endpoint we tested. Zero API costs, full data privacy, all local. Full results: https://www.sharpai.org/benchmark/
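For anyone wanting to sanity-check the tok/s and TTFT numbers, the measurement loop is roughly this shape. This is a minimal sketch, not the actual harness; the iterator here stands in for a real streaming endpoint:

```python
import time

# Measure time-to-first-token (TTFT) and decode throughput over any
# iterator that yields tokens; in a real run this would wrap the model
# server's streaming API.
def measure(stream):
    start = time.perf_counter()
    first = None
    count = 0
    for _tok in stream:
        if first is None:
            first = time.perf_counter() - start  # time to first token
        count += 1
    total = time.perf_counter() - start
    return {"ttft_s": first, "tok_per_s": count / total if total else 0.0}

stats = measure(iter(["hello"] * 100))
print(stats)
```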
Currently the barrier to entry for local models is about $2,500. The funny thing is, $2,500 is about what my parents paid for a 166 MHz machine in 1995.
This is fantastic, but IMO it misses the most important part of a home security system from a business PoV - the ability to issue an alarm certificate. These are required for insurance discounts, as well as for making certain claims in the event of loss.
This is the classic issue in tech right now: it's becoming easier to build the systems, but the compliance/legal hurdles are still real, slow, and human. Even if the monitoring is best in class (which I'd argue it likely is; this is a fantastic application of AI), if the compliance isn't there it won't be a real product.
Can someone share how this stacks up against Frigate? What I'm struggling with is how it sits in the security stack. Is it recording things of interest on motion, or is it only a layer on top of an existing NVR?
Neat, but why would you want a clumsy LLM to know what happened with your security system? Things happened or they didn't, and that's what dashboards are for.
Seems like trying to manufacture a need from the tools. My security system's front page shows me every event that happened at my house; I don't have to interrogate it about every happenstance, and I don't see what the value of that would be.
I'd like to recreate this benchmark using Qwopus on my M5 Max. I am curious if the theoretically improved reasoning capabilities from distillation improve its scoring. Adding this one to my to-do list for some point in the next few weeks.
Why would you run this on your M5 instead of a dedicated machine for it? A Jetson Orin would be faster at prefill and decode, as well as cheaper for home installation.
This seems like an inevitable idea: a security system with full context. So you don't get alerts about your friend's car plates or your kid coming home late.
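The "full context" idea boils down to filtering alerts against what the system already knows about the household. A toy sketch (all data here is invented for illustration):

```python
# Suppress alerts for entities the system already recognizes
# (household members, friends' license plates, etc.).
KNOWN_PLATES = {"ABC123"}          # e.g. a friend's car
KNOWN_PEOPLE = {"kid", "partner"}  # household members

def should_alert(event: dict) -> bool:
    """Alert only on events involving unknown plates or people."""
    if event.get("plate") in KNOWN_PLATES:
        return False
    if event.get("person") in KNOWN_PEOPLE:
        return False
    return True

print(should_alert({"plate": "ABC123"}))     # False: friend's car
print(should_alert({"person": "stranger"}))  # True: unknown person
```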
That being said, I feel like we're gonna get to that point for most other stuff way sooner than AI (and we already have for many pieces of software).
The GPUs have become much larger, so 6.8x is believable there, as is the inclusion of a matmul unit boosting AI.
The 2.x numbers are the most realistic, especially because they represent actual workloads.
i.e., something like this fake future Apple device page: https://speculate-mai.pages.dev/
> pretty soon you can buy one big ass server that will last potentially decades as it would be purpose built for ai.
This feels like a very, very weak prediction (though certainly possible).
> I have always envisioned a ai server being part of a family's major purchases
and an Oxide rack
https://github.com/SharpAI/DeepCamera/blob/c7e9ddda012ad3f8e...
> Local-first AI home security
The analysis is very suspicious: "GPT-5 mini had API failures due to wrong temp setting"? WTF?
Whatever you used to slop together your benchmark didn't even take the time to set the temperature to 1 (which the docs say is required).
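For reference, the fix being described is just pinning the sampling temperature in the request instead of relying on a harness default. A sketch of the kind of payload involved (model name and message are placeholders):

```python
# OpenAI-style chat request with temperature set explicitly; per the
# commenter, the docs require temperature=1 for this model family.
payload = {
    "model": "gpt-5-mini",
    "temperature": 1,
    "messages": [{"role": "user", "content": "classify this camera event"}],
}
print(payload["temperature"])  # 1
```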