When people imagined AI/AGI, they imagined something that could reason like we can, except at the speed of a computer, which we always envisioned would lead to the singularity. In a short period of time, AI would be so far ahead of us and our existing ideas that the world would become unrecognizable.
That's not what's happening here, and it's worth remembering: A caveman from 200K years ago would have been just as intelligent as any of us here today, despite not having language or technology, or any knowledge.
In Carolyn Porco's words: "These beings, with soaring imagination, eventually flung themselves and their machines into interplanetary space."
When you think of it that way, it should be obvious that LLMs are not AGI. And that's OK! They're a remarkable piece of technology anyway! It turns out that LLMs are actually good enough for a lot of use cases that would otherwise have required human intelligence.
And I echo ArekDymalski's sentiment that it's good to have benchmarks to structure the discussions around the "intelligence level" of LLMs. That _is_ useful, and the more progress we make, the better. But we're not on the way to AGI.
It's interesting to me how much effort the AI companies (and bloggers) put into claiming their models can do things they can't, when there's an almost unlimited list of things they actually can do.
And many of them are so unexpected, given the unusual nature of their intelligence emerging from language prediction. They excel wherever you need to digest or produce massive amounts of text. They can synthesize some pretty impressive solutions from pre-existing stuff. Hell, I use them like a thesaurus to suss out words or phrases that are new or on the tip of my tongue. They have a great hold on the general corpus of information, much better than any search engine (even before the internet was cluttered with their output). It's much easier to find concrete words for what you're looking for through an indirect search via an LLM. The fact that, say, a 32GB model seemingly holds approximate knowledge of everything implies some unexplored relationship between intelligence and compression.
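That intelligence-compression link can be made concrete with the classic prediction-equals-compression argument. A minimal sketch (the unigram character model below is a toy stand-in for an LLM, not anything from a real system): a model's cross-entropy on text is the bits-per-character an arithmetic coder driven by that model would need, so better predictors are better compressors.

    import math
    from collections import Counter

    # Toy stand-in for an LLM: a unigram character model. Lower
    # cross-entropy (better prediction) means fewer bits per character,
    # which is the sense in which prediction and compression coincide.
    train = "the quick brown fox jumps over the lazy dog " * 100
    test = "the lazy fox jumps over the quick brown dog"

    counts = Counter(train)
    vocab = set(train) | set(test)
    total = sum(counts.values()) + len(vocab)  # Laplace smoothing
    prob = {c: (counts.get(c, 0) + 1) / total for c in vocab}

    # Bits an arithmetic coder using this model would spend on `test`.
    bits = -sum(math.log2(prob[c]) for c in test)
    print(f"{bits / len(test):.2f} bits/char vs 8 bits/char raw")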
What can't they do? Pretty much anything reliably or unsupervised. But then again, who can?
They also tend to fail creatively, given that they synthesize existing ideas. And with things involving physical intuition. And tasks involving meta-knowledge of their own tokens (like asking them how long a given word is). And they tend to yap too much for my liking (perhaps this could be fixed with an additional thinking stage to increase terseness before reporting to the user).
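That token meta-knowledge failure has a mechanical explanation: the model consumes BPE token IDs, not characters, so letter-level facts have to be recalled rather than counted. A quick way to see the mismatch, assuming the tiktoken package is installed (the encoding name is one of OpenAI's published ones):

    import tiktoken  # pip install tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    for word in ["strawberry", "antidisestablishmentarianism"]:
        ids = enc.encode(word)
        pieces = [enc.decode([i]) for i in ids]
        # The model sees the token IDs, not letters, so "how many
        # letters are in this word?" is recall, not counting.
        print(f"{word!r}: {len(word)} chars -> {len(ids)} tokens {pieces}")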
This is a bit of an anti-evolutionary perspective. At some point in our past, we were something much less intelligent than we are now. Our intelligence didn't spring out of thin air. Whether or not AI can evolve is yet to be seen, I think.
> caveman from 200K years ago would have been just as intelligent as any of us here today, despite not having language
There is evidence to the contrary. Not having language puts your mental faculties at a significant disadvantage; specifically, left-brain atrophy. See the critical period hypothesis. Perhaps you mean lacking spoken language rather than having none at all?
https://linguistics.ucla.edu/people/curtiss/1974%20-%20The%2...
How do you arrive at the statement that a caveman would have the same intelligence as a human today? Intelligence is surely not usually defined as the cognitive potential at birth but as the current capability. And the knowledge an average human has today through education surely factors into that.
> A caveman from 200K years ago would have been just as intelligent as any of us here today, despite not having language or technology, or any knowledge.
Doubt. If we teleported caveman babies right out of the womb to our times, I don't think they'd turn into high-IQ individuals. People knowledgeable about human history / human evolution might know the correct answer.
> A caveman from 200K years ago would have been just as intelligent as any of us here today, despite not having language or technology, or any knowledge.
Source? This does not sound like it could possibly be true (by any common way we might measure intelligence).
I posted my own comment but I agree with you. Our modern society likes to claim we are somehow "more intelligent" than our predecessors/ancestors. I couldn't disagree more. We have not changed in terms of intelligence for thousands of years. This is a matter that's beyond just engineering, it's also a matter of philosophy and perspective.
Humans, like all animals, have not stopped evolving. A random caveman from 200K years ago would have had very different genetics from those of a typical HN reader, and even more so from the best of the HN readers.
Around 3,200 years ago there was a notable uptick in alleles associated with intelligence.
It still seems like something is missing from all these frameworks.
I feel like an average human wouldn't pass some of these metrics yet they are "generally intelligent". On the other hand they also wouldn't pass a lot of the expert questions that AI is good at.
We're measuring something, and I think optimizing it is useful, I'd even say it is "intelligent" in some ways, but it doesn't seem "intelligent" in the same way that humans are.
As an engineer who is also spiritual at the core, the missing piece seems obvious to me: consciousness.
Hear me out.
I love AI and have been using it since ChatGPT 3.5. The obvious question when I first used it was "does this qualify as sentience?" The answer is less obvious. Over the next 3 years we saw EXPONENTIAL intelligence gains, to the point where intelligence has become a commodity, yet we are still unable to determine what qualifies as "AGI".
My thoughts:
As humans, we possess our own internal drive and our own perspective. Think of humans as distilled intelligence: we each have our own specialty and motivations. Einstein was a genius physicist, but you wouldn't ask him for his expertise on medicine.
What people are describing as AGI is essentially a godlike human. What would make more sense is if the AGI spawned a "distilled" version with a focused agenda/motivation to behave autonomously. But even then, there are limitations. What is the solution? A trillion tokens of system prompt to act as the "soul"/consciousness of this AI agent?
This goes back to my original statement: what is missing is a level of consciousness. Unless this AGI can power itself, and somehow the universe recognizes its complexity and existence and bestows it with consciousness, I don't think this is physically attainable.
It's good to have some kind of benchmark at least to structure the ongoing, fruitless discussion around "are we there already?".
However, I must admit that including the last point, which partially hints at emotional or rather social intelligence, surprised me. It makes this list go beyond the usual understanding of AGI and moves it toward something like AGI-we-actually-want. But for that purpose this last point is too narrow, too specific. And so is the whole list.
To be actually useful, the AGI-we-actually-want benchmark should include not only positive indicators but also a list of unwanted behaviors, to ensure the thing that used to be called alignment, I guess.
Every week we are 50% closer to shifting the goalpost...
To me, a lot of what makes us sentient is our continuity. I even (briefly) remember my dreams when I wake up, and my dreams are influenced by my state of mind as I enter it.
LLMs 'turn on' when given a question and essentially 'die' immediately after answering it.
What kind of work is going on with designing an LLM-type AI that is continuously 'conscious' and giving it will? The 'claws' seem to be running all the time, but I assume they need rebooting occasionally to clear context.
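Most of the current work here is plumbing around the model rather than changes to the model itself: an outer loop keeps the agent 'alive', and compacting the context when it fills up is what the occasional reboot amounts to. A rough sketch of such a loop, where llm_complete is a hypothetical stand-in for whatever completion API you use:

    import time

    def llm_complete(prompt: str) -> str:
        """Hypothetical stand-in for a real completion API call."""
        return "..."  # wire up your provider's client here

    CONTEXT_LIMIT = 8000  # characters here; a real loop would budget tokens

    def run_agent(goal: str) -> None:
        context = f"Goal: {goal}\n"
        while True:  # the agent never 'dies' between turns
            if len(context) > CONTEXT_LIMIT:
                # The 'reboot': compress history into a summary so the
                # loop can continue with a fresh, smaller context.
                context = "Summary so far: " + llm_complete(
                    "Summarize this work log, keeping open tasks:\n" + context
                )
            action = llm_complete(context + "\nWhat should I do next?")
            context += f"\nDid: {action}"
            time.sleep(1)  # pace the loop; a real agent would await events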
Cool that we are at a stage where it is meaningful to start measuring progress toward AGI. Something I am wondering on the philosophical side: are we ever going to be able to tell if the system really "understands" and "perceives" the world?
Those are crowdsourced benchmarks. We're calling them "cognitive" and "AGI" now, though. It's similar to when they made a benchmark and called it "GDP".
To be clear, I think we've seen very fast progress, certainly faster than I would have expected, I'm not trying to peddle some "wall" rhetoric here, but I struggle to see how this isn't just the SWE-bench du jour.
Altruism would make a good addition to the list. It’s clearly not universal, but most humans would help a fellow human in need. Or even (and in some cases more so) an animal in need. Even if it didn’t directly benefit the actor.
There are other changes and additions which could be made to this list, but altruism may be the most important.
Maybe Google could actually make Gemini good, instead of being about 10 miles behind Claude, rather than trying to make AGI for - well, some reason - because they want to be famous.
AGI may be a prerequisite for true superintelligence, but we're already seeing superhuman performance in narrow domains. We probably need a broader evaluation framework that captures both.
The belief that there is no fundamental difference between mammals navigating fractal dimensions and imprisoned electrons humming in logic gates has to be considered a religious one.
I'm sorry, what even is this? Giving $10k rewards for significant advancements toward "AGI"?
What does "making a framework" even mean, it feels like a nothing post.
When I think of what real AGI would be I think:
- Passes the Turing test
- Writes a New York Times Bestseller without revealing it was written by AI
- Writes journal articles that pass peer review
- Wins a Nobel Prize
- Writes a successful comedy routine
- Creates a new invention
And no, nobody is going to make an automated Kaggle benchmark to verify these. Which is fine, because an LLM will never be AGI. An LLM can't even learn mid-conversation.
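For what it's worth, the "can't learn mid-conversation" point is mechanical: at inference time the weights are constants, so each reply is a pure function of the frozen model and the context window, and anything that scrolls out of the window is gone. A sketch of that shape (not any particular runtime):

    from typing import Callable, List, Tuple

    def chat_turn(
        generate: Callable[[List[str]], str],  # frozen model: context -> text
        history: List[str],
        user_msg: str,
    ) -> Tuple[str, List[str]]:
        context = history + [user_msg]
        reply = generate(context)  # no gradient step, no weight update
        # The only 'memory' is the transcript itself, and only for as
        # long as it fits in the context window.
        return reply, context + [reply]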
Way too much framework. The A in AGI is for artificial. Have it build its own test harness instead of outsourcing it via hackathon. If you cannot trust that output, you're nowhere near AGI.
They had ridiculous demos of Devin e.g. working as a freelancer and supposedly earning money from it.
> In a short period of time, AI would be so far ahead of us and our existing ideas, that the world would become unrecognizable.
> That's not what's happening here ...
On the contrary, it very much is.
I'd argue AGI has already been achieved via LLMs today, provided they have excellent external cognitive infrastructure supporting them.
However, the gap from AGI to ASI is perhaps wider than anticipated, such that we're not seeing a hard takeoff immediately after arriving at the first.
Just, you know—potential mass unemployment on a scale never seen before. When you frame it that way, whether LLMs qualify as AGI is largely semantics.
That said, I really hope you're right and I'm wrong.
> That _is_ useful, and the more progress we make, the better.
I would be happy to agree if we had solutions in hand for the societal problems that this will create.
> A caveman from 200K years ago would have been just as intelligent as any of us here today
In other words, intelligence offers zero evolutionary advantage?
From the paper: "AI systems already possess some capabilities not found in humans, such as LiDAR perception and native image generation." I don't know about them, but I can natively generate images in my mind.
> Perception: extracting and processing sensory information from the environment
> Generation: producing outputs such as text, speech and actions
> Attention: focusing cognitive resources on what matters
> Learning: acquiring new knowledge through experience and instruction
> Memory: storing and retrieving information over time
> Reasoning: drawing valid conclusions through logical inference
> Metacognition: knowledge and monitoring of one's own cognitive processes
> Executive functions: planning, inhibition and cognitive flexibility
> Problem solving: finding effective solutions to domain-specific problems
> Social cognition: processing and interpreting social information and responding appropriately in social situations
--------------------
I prefer:
a) working memory (hold & manipulate information in mind simultaneously)
b) processing speed (how quickly & efficiently execute basic cognitive operations, leaving more resources for complex tasks)
c) fluid intelligence (ability to reason through novel problems without relying on prior knowledge)
d) crystallized intelligence (accumulated knowledge and ability to apply learned skills)
e) attentional control / executive function (focus, suppress irrelevant information, switch between tasks, inhibit impulsive responses)
f) long-term memory and retrieval (ability to form strong associations and retrieve them fluently)
g) spatial / visuospatial reasoning (mental rotation, visualization, navigating abstract spatial relationships)
h) pattern recognition & inductive reasoning (this is the most primitive and universal expression of intelligence across species: the ability to extract regularities from noisy data, to generalize from examples to rules)
Is social cognition really a measure of intelligence for non-social entities?
Who cares about AGI? Honestly, what's the gain?
You'd have a more serious debate about antigravity.
What does "making a framework" even mean, it feels like a nothing post.
When I think of what real AGI would be I think:
- Passes the turing test
- Writes a New York Times Bestseller without revealing it was written by AI
- Writes journal articles that pass peer review
- Wins a Nobel Prize
- Writes a successful comedy routine
- Creates a new invention
And no, nobody is going to make an automated kaggle benchmark to verify these. Which is fine, because an LLM will never be AGI. An LLM can't even learn mid-conversation.
How will they measure wisdom or common sense (the ability to make an exception)?
https://youtu.be/lA-zdh_bQBo