I think going from the LSAT to general thinking is still a very, very big leap. Passing exams is a fascinating benchmark, but by their nature these exams are limited in scope, have very clear assessment criteria, and come with a lot of associated, easily categorized data (like practice tests). General thought (particularly, say, coming up with an original idea) is a whole different ball game.
I don't say any of this to denigrate GPT-4; it looks amazing. But I'm reminded of the early days of self-driving vehicles: with 10% mastered, everyone assumed it was a race to 100% and that we'd all be in self-driving cars by now. The reality has been a lot more complicated than that.
That's a reasonable goal, but it's also not what people were aiming for historically. It's also very expansive: if human-level intelligence means outperforming, in every field, every human who ever lived, that's a high bar to meet. Indeed, it means that no human has ever achieved human-level intelligence.
That goalpost makes no sense: AIs are not human. They are fundamentally different, and therefore will always have a different set of strengths and weaknesses. Even long after vastly exceeding human intelligence everywhere it counts, an AI will still perform worse than us on some tasks. Importantly, an AI wouldn't have to meet your goalpost to be a major threat to humanity, or to render virtually all human labor worthless.
Think about how anthropomorphic this goalpost is if you apply it to other species: "Humans aren't generally intelligent, because their brains don't process scents as effectively as dogs', and they still struggle to locate scents spatially."
> (...)
> That is the goalpost for AGI. It’s an artificial human - a human replacement.
This considerably moves the goalpost. An AGI can have a different kind of intelligence than humans. If an AGI is as intelligent as a cat, it's still AGI.
More likely, the first AGI we develop will greatly exceed humans in some areas but have gaps in others. It won't completely replace humans, just as cats don't completely replace humans.
'Everything a human can do' is not the same as 'anything any human can do, as well as the best humans at that thing (because those are the ones we pay)'. Most humans cannot do any of the things you say you are waiting for an AI to do before you'll call it 'general'.
So the first part of your statement is the original goalpost, and the second part implies a very different one. The new goalpost you propose would imply that most humans are not generally intelligent, which you could argue... but it would definitely be a new goalpost.
The "GI" in AGI stands for general intelligence. If what you describe is your benchmark for general intelligence, then humans who cannot perform all these tasks to a hirable standard are not generally intelligent.
What you're asking for would already be bordering on ASI, artificial superintelligence.
By that definition do humans possess general intelligence?
Can you do everything a human can do? Can one human be a replacement for another?
I don't think the question makes sense without context. Which human? Which task?
I disagree with the premise. A single human isn't likely to be able to perform all these functions, so why demand that GPT-4 encompass all of them? It is already outperforming most humans on standardized tests that rely only on vision and text, and a human needs to be trained for those tasks.
It's already a human replacement. OpenAI has already described GPT-4 as coming "with great impact on functions like support, sales, content moderation, and programming."
This could mean something which is below a monkey’s ability to relate to the world and yet more useful than a monkey.
No, AGI would not need you to start a startup. It would start it itself.
It's a clear analogy.
This should become an article explaining what AGI really means.
I think the question "Can this AGI be my start-up co-founder, or my employee #1?", or something like that, is a great metric for when we've reached the AGI finish line.
There are many things that pattern matching over large amounts of data can solve; eventually we can probably get fully generated movies, music compositions, and novels. The problem is that all of the content of those works will have to have been formalized into rules before it is produced, since computers can only work with formalized data. None of those productions will ever contain an original thought, and I think that's why GPT-3's fiction feels so shallow.
So it boils down to a philosophical question: can human thought be formalized and written as rules? If it can, no human has ever had an original thought either, and it's a moot point.
Do you have evidence that human brains are not just super sophisticated pattern matching engines?
Humans read novels, listen to compositions, watch movies, and make new ones similar in some ways and different in other ways. What is fundamentally different about the process used for LLMs? Not the current generation necessarily, but what's likely to emerge as they continue to improve.
If so, it means the union of all human expertise is a few gigabytes. Having seen both a) what we can do in a kilobyte of code and b) a broad range of human behavior, this doesn't seem impossible. The more interesting question is: what are humans going to do with this remarkable object, a svelte pocket brain, not quite alive, a capable coder in ALL languages, a shared human artifact that can ace all tests? "May you live in interesting times," indeed.
Clearly the key takeaway from GPT is that, given enough unstructured data, LLMs can produce impressive results.
From my point of view, the flaw in most discussion surrounding AI is not that people underestimate computers but that they overestimate how special humans are. At the end of the day, every thought is a bunch of chemical potentials changing in a small blob of flesh.
It is probably true that at any given point many, many people have had the same or very similar ideas.
Those who execute or are in the right place and time to declare themselves the originator are the ones we think innovated.
It isn't true, or rarely is. History is written by the victors (and their simps).
No, and I think it's because human thought is based on continuous inferencing of experience, which gives rise to the current emotional state and the feeling of it. For a machine to do this, it will need a body and the ability to direct attention, at will, to the things it is inferencing.
To be honest, perhaps the language model works better without the evolutionary baggage.
That isn't to discount the other things we can do with our neural nets. For instance, it is possible to think without language (see music, instantaneous mental arithmetic, intuition), but these are essentially independent specialised models that run on the same hardware and that our language model can interrogate. We train these models from birth.
Whether intentional or not, AI research is very much going in the direction of replicating the human mind.
I have a sneaking suspicion that all that will be required for bypassing the upcoming road blocks is giving these machines:
1) existential needs that must be fulfilled
2) active feedback loops with their environments (continuous training)
We always thought that if AI can do X then it can do Y and Z. It keeps turning out that you can actually get really good at doing X without being able to do Y and Z, so it looks like we're moving the goalposts, when we're really just realizing that X wasn't as informative as we expected. The issue is that we can't concretely define Y and Z, so we keep pointing at the wrong X.
But all indication is that we're getting closer.
> “there are/are not, additional properties to human level symbol manipulation, beyond what GPT encapsulates.”
GPT does appear to do an awful lot of pattern extrapolation before we find its limits.
The notion of some sort of technological "singularity" is just silly. It is essentially an article of faith, a secular religion among certain pseudo-intellectual members of the chattering class. There is no hard scientific backing for it.
What, in your mind, should the goal posts be for AGI?
Currently, you could prompt GPT to act as if it is sentient and has qualia, and it will do quite a good job at trying to convince you it's not a P-Zombie.
I know I’m not the first to say this, but this is also a generalization of many jobs performed right now.
Follow the template, click the boxes, enter the text/data in the standard format, submit before 4pm. Come in tomorrow and do it again.
If that automation doesn't require oversight, everyone wins, since that process (say, typing data from a ledger) is now free to anyone who wants to use it. The exception, of course, is if a monopoly or oligopoly controls the process, so it's up to the government to break them up and keep the underlying tech accessible.
The biggest risk is how much computing power it takes to run these models, so it’s very important to support the open alternatives that are trying to lower the barrier to entry.
Exactly, much like a chess bot can play perfectly without what humans would call thinking.
I think (ironically) we'll soon realize that there is no actual task that would require thinking as we know it.
If that were true, there would be no point in studying or doing any LSAT preparation. Writing practice exams would be of no benefit.
As others have said elsewhere, the issue remains accuracy. I wish every response came with an honest estimate of how likely the answer is to be correct, because at the moment it gives wrong answers as confidently as right ones.
I can remember my GRE coach telling me that it was better to confidently choose an answer I only had 50% confidence in, rather than punt on the entire question.
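The coach's advice is just expected value. A back-of-the-envelope sketch, assuming a hypothetical negative-marking scheme (+1 per correct answer, -1/4 per wrong answer, 0 for a skipped question); the actual scoring rules varied by test and era, so treat the numbers as illustrative:

```python
# Expected score for answering with confidence p vs. skipping (score 0),
# under the assumed +1 / -0.25 / 0 scoring scheme.
def expected_score(p, reward=1.0, penalty=0.25):
    return p * reward - (1 - p) * penalty

print(expected_score(0.5))   # 50% confidence: 0.375, better than skipping
print(expected_score(0.2))   # blind guess among 5 choices: break-even at 0.0
```

Under that scheme, any confidence above pure chance makes answering strictly better than skipping, which is presumably the coach's point.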
AIs hallucinate because, statistically, it is 'rewarding' for them to do so. (In RLHF)
Obviously not, since GPT-4 doesn't have general intelligence. Likewise, it lacks "common sense," "knowledge about the world," and "reasoning ability."
As just one example, reasoning ability: GPT-4 failed at this problem I just came up with: "If Sarah was twice as old as Jimmy when Jimmy was 1/3 as old as Jane, and Jane is as much older than Sarah as Sarah is older than Jimmy, and Sarah is now 40, how old are Jane and Jimmy?"
First, every answer GPT-4 came up with contradicted the facts given: they were just wrong. But beyond that, it didn't recognize that there are many solutions to the problem. And later when I gave it an additional constraint to narrow it to one solution, it got the wrong answer again. And when I say "wrong," I mean that its answer clearly contradicted the facts given.
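The underdetermination is easy to verify by brute force. A minimal sketch, assuming integer current ages and that "when Jimmy was 1/3 as old as Jane" happened a whole number of years ago (variable names are mine, not from the puzzle):

```python
# Encode the puzzle's three constraints and enumerate candidates.
# Sarah is 40 now; jane - 40 == 40 - jimmy pins Jane to 80 - jimmy.
solutions = []
for jimmy in range(1, 80):          # Jimmy's current age
    jane = 80 - jimmy               # "Jane is as much older than Sarah..."
    for t in range(0, 40):          # years ago (Sarah was 40 - t then)
        was_third = 3 * (jimmy - t) == jane - t   # Jimmy was 1/3 Jane's age
        was_twice = 40 - t == 2 * (jimmy - t)     # Sarah was twice Jimmy's age
        if was_third and was_twice:
            solutions.append((jimmy, jane, t))

print(len(solutions))   # every jimmy from 20 to 39 works, so many valid answers
```

The constraints collapse to a single linear relation (Jane = 80 - Jimmy), which is exactly why the problem has a whole family of solutions rather than one.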
Driving as well as an attentive human in real time, in all conditions, probably requires AGI as well.
GPT-4 is not an AGI, and GPT-5 might not be either. But the barriers are getting thinner and thinner. Are we really ready for AGI in a plausibly-within-our-lifetime future?
Sam Altman wrote that AGI is a top potential explanation for the Fermi Paradox. If that were remotely true, we should be doing 10x-100x the work on AI alignment research.
Now, granted, plenty of humans don't score above a 2 on those exams either. But I think it's indicative that there's still plenty of progress left to make before this technology is indistinguishable from magic.
Sure but look in this thread, there are already plenty of people citing the use of GPT in legal or medical fields. The danger is absolutely real if we march unthinkingly towards an AI-driven future.
Not yet it won't. It doesn't take much imagination to foresee where this kind of AI is used to inform legal or medical decisions.
And medicine is nothing but pattern matching. Symptoms -> diagnosis -> treatment.
Driving assistance and the progress made there and large language models and the progress made there are absolutely incomparable.
The general public's hype around driving assistance is fueled mostly by the hype surrounding one car maker and its figurehead, a hype that has built up over a few years and become accepted by the public, reflected in that car maker's stock price.
Large language models have not yet permeated the public's memory. And the point is that inside language you can find our human culture, and inside a large language model you have essentially the English language with its embeddings. It is real, it is big, it is powerful, it is respectable research.
There's nothing in driving assistance that can be compared to LLMs. Driving-assistance systems don't have an embedding of the entire physical surface of planet Earth or an understanding of driving physics. They're nothing by comparison.