Measuring progress toward AGI: A cognitive framework (blog.google)

by surprisetalk 214 comments 151 points

[−] pocketarc 59d ago
When people imagined AI/AGI, they imagined something that can reason like we can, except at the speed of a computer, which we always envisioned would lead to the singularity. In a short period of time, AI would be so far ahead of us and our existing ideas that the world would become unrecognizable.

That's not what's happening here, and it's worth remembering: A caveman from 200K years ago would have been just as intelligent as any of us here today, despite not having language or technology, or any knowledge.

In Carolyn Porco's words: "These beings, with soaring imagination, eventually flung themselves and their machines into interplanetary space."

When you think of it that way, it should be obvious that LLMs are not AGI. And that's OK! They're a remarkable piece of technology anyway! It turns out that LLMs are actually good enough for a lot of use cases that would otherwise have required human intelligence.

And I echo ArekDymalski's sentiment that it's good to have benchmarks to structure the discussions around the "intelligence level" of LLMs. That _is_ useful, and the more progress we make, the better. But we're not on the way to AGI.

[−] onlyrealcuzzo 59d ago
The amount of things LLMs can do is insane.

It's interesting to me how much effort the AI companies (and bloggers) put into claiming they can do things they can't, when there's almost an unlimited list of things they actually can do.

[−] imtringued 59d ago
This reminds me of "Devin". You know, the first "AI software engineer", which got its day of hype but turned out to be a huge flop.

They had ridiculous demos of Devin e.g. working as a freelancer and supposedly earning money from it.

[−] roncesvalles 59d ago
We're waaay past the era when getting funded meant your idea had any promise at all.
[−] mlmonkey 59d ago
It looks like the company (Cognition) is actively hiring (20+ job openings last I checked). That doesn't sound like a "flop" to me...
[−] skeeter2020 59d ago
Think about it: why would they be hiring actual human beings if Devin actually works? Seems like the purest example of "dogfooding"...
[−] jorvi 59d ago
This just keeps being the "the Emperor has no clothes" moment for all these AI bull companies.

Microsoft just replaced their native Windows Copilot application with an Electron one. Highly ironic.

Obviously the native version would run much faster and use less memory. If Copilot (via either GPT or Claude) is so godlike at either agentic or guided coding, why didn't they just improve or rewrite the native Copilot application to be blazing fast, with all known bugs fixed?

[−] notTooFarGone 59d ago
When you think about it, every job opening is a flop in that sense.
[−] paxys 58d ago
WeWork had 12,500 employees at its peak.
[−] beeflet 59d ago
And many of them so unexpected, given the unusual nature of their intelligence emerging from language prediction. They excel wherever you need to digest or produce massive amounts of text. They can synthesize some pretty impressive solutions from pre-existing stuff. Hell, I use it like a thesaurus to suss out words or phrases that are new or on the tip of my tongue. They have a great hold on the general corpus of information, much better than any search engine (even before the internet was cluttered with their output). It's much easier to find concrete words for what you're looking for through an indirect search via an LLM. The fact that, say, a 32GB model seemingly holds approximate knowledge of everything implies some unexplored relationship between intelligence and compression.
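The compression point has a classic information-theoretic reading: a model that predicts data well is, in principle, a compressor of that data. A toy demo with the stdlib `zlib` (a crude stand-in for a learned model, not an LLM) makes the intuition concrete: structured, predictable text compresses dramatically, while random bytes barely compress at all.

```python
import os
import zlib

# Predictable, structured text vs. incompressible random bytes.
structured = b"the cat sat on the mat. " * 100
random_bytes = os.urandom(len(structured))

# zlib's dictionary matching is a crude stand-in for prediction:
# the more predictable the data, the shorter its encoding.
print(len(structured), "->", len(zlib.compress(structured)))      # shrinks dramatically
print(len(random_bytes), "->", len(zlib.compress(random_bytes)))  # barely shrinks, if at all
```

A model weights file that "holds approximate knowledge of everything" in 32GB is, in this view, an extremely lossy compressed encoding of its training corpus.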

What can't they do? Pretty much anything reliably or unsupervised. But then again, who can?

They also tend to fail creatively, given that they synthesize existing ideas. And with things involving physical intuition. And tasks involving meta-knowledge of their tokens (like asking them how long a given word is). And they tend to yap too much for my liking (perhaps this could be fixed with an additional thinking stage to increase terseness before reporting to the user).
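The token meta-knowledge failure has a mechanical explanation: the model operates on subword tokens, not characters, so character-level questions are asked about a representation that hides the letters. A toy greedy segmenter sketches the idea (the vocabulary is assumed for illustration; real tokenizers use BPE merges learned from data):

```python
# Assumed toy subword vocabulary, for illustration only.
toy_vocab = ["straw", "berry", "str", "aw", "ber", "ry"]

def greedy_tokenize(word, vocab):
    """Greedy longest-match segmentation: a crude stand-in for BPE."""
    pieces = []
    i = 0
    while i < len(word):
        for piece in sorted(vocab, key=len, reverse=True):
            if word.startswith(piece, i):
                pieces.append(piece)
                i += len(piece)
                break
        else:
            pieces.append(word[i])  # unknown: fall back to one character
            i += 1
    return pieces

print(greedy_tokenize("strawberry", toy_vocab))  # ['straw', 'berry']
```

Under this segmentation, "strawberry" is two opaque units; nothing in the token sequence directly exposes that it contains ten letters or three r's.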

[−] gtowey 59d ago
Only because they have compressed and encoded the entire sum of human knowledge at their disposal. There are models for everything in there, but they can only do what has been done before.

What's more amazing to me is that the average human, only able to hold a relatively small body of knowledge in their mind, can generate things that are completely novel.

[−] SecretDreams 59d ago
The hype has gotta keep going or the money will dry up. And hype can be quantified by velocity and acceleration, rather than distance. They need to keep the innovation accelerating, or the money stops. This is of course completely unreasonable, but also why the odd claims keep happening.
[−] lich_king 59d ago
Because most of these things are not multi-trillion-dollar ideas. "We found a way to make illustrators, copyeditors, paralegals, and several dozen other professions somewhat obsolete" in no way justifies the valuations of OpenAI or Nvidia.
[−] NooneAtAll3 59d ago
for example?
[−] SkyPuncher 59d ago
I've been pushing Opus pretty hard on my personal projects. While repeatability is very hard to achieve, I'm seeing glimpses of Opus being well beyond human capabilities.

I'm increasingly convinced that the core mechanism of AGI is already here. We just need to figure out how to tie it together.

[−] imetatroll 59d ago
This is a bit of an anti-evolutionary perspective. At some point in our past, we were something much less intelligent than we are now. Our intelligence didn't spring out of thin air. Whether or not AI can evolve is yet to be seen I think.
[−] nurettin 59d ago

> caveman from 200K years ago would have been just as intelligent as any of us here today, despite not having language

There is evidence to the contrary. Not having language puts your mental faculties at a significant disadvantage. Specifically, left-brain atrophy. See the critical period hypothesis. Perhaps you mean lacking spoken language rather than having none at all?

https://linguistics.ucla.edu/people/curtiss/1974%20-%20The%2...

[−] mhl47 59d ago
How do you arrive at the statement that a caveman would have the same intelligence as a human today? Intelligence is surely not usually defined as the cognitive potential at birth but as the current capability. And the knowledge an average human has today through education surely factors into that.
[−] Traubenfuchs 59d ago

> A caveman from 200K years ago would have been just as intelligent as any of us here today, despite not having language or technology, or any knowledge.

Doubt. If we teleported caveman babies right out of the womb to our time, I don't think they'd turn into high-IQ individuals. People knowledgeable on human history / human evolution might know the correct answer.

[−] rl3 59d ago

> In a short period of time, AI would be so far ahead of us and our existing ideas that the world would become unrecognizable.

> That's not what's happening here ...

On the contrary, it very much is.

I'd argue AGI has already been achieved via LLMs today, provided they have excellent external cognitive infrastructure supporting them.

However, the gap from AGI to ASI is perhaps wider than anticipated, such that we're not seeing a hard takeoff immediately after arriving at the first.

Just, you know—potential mass unemployment on a scale never seen before. When you frame it that way, whether LLMs qualify as AGI is largely semantics.

That said, I really hope you're right and I'm wrong.

[−] tyleo 59d ago
It still seems like something is missing from all these frameworks.

I feel like an average human wouldn't pass some of these metrics, yet they are "generally intelligent". On the other hand, they also wouldn't pass a lot of the expert questions that AI is good at.

We're measuring something, and I think optimizing it is useful; I'd even say it is "intelligent" in some ways, but it doesn't seem "intelligent" in the same way that humans are.

[−] orangebread 59d ago
As an engineer who is also spiritual at the core, I think the missing piece is obvious: consciousness.

Hear me out.

I love AI and have been using it since ChatGPT 3.5. The obvious question when I first used it was "does this qualify as sentience?" The answer is less obvious. Over the next 3 years we saw EXPONENTIAL intelligence gains where intelligence has now become a commodity, yet we are still unable to determine what qualifies as "AGI".

My thoughts: As humans, we possess our own internal drive and our own perspective. Think of humans as distilled intelligence, we each have our own specialty and motivations. Einstein was a genius physicist but you wouldn't ask him for his expertise on medicine.

What people are describing as AGI is essentially a godlike human. What would make more sense is if the AGI spawned a "distilled" version with a focused agenda/motivation to behave autonomously. But even then, there are limitations. What is the solution? A trillion tokens of system prompt to act as the "soul"/consciousness of this AI agent?

This goes back to my original statement: what is missing is a level of consciousness. Unless this AGI can power itself and somehow the universe recognizes its complexity and existence and bestows it with consciousness, I don't think this is physically attainable.

[−] yellow_lead 59d ago
It's kind of funny that Google's idea of evaluating AGI is outsourcing the work to a Kaggle competition.