Semi-related - I'd want to see some actual practical application for this research to prove they're on the right track. But maybe conceptually that's just impossible without a strong AI to test with, at which point it's already over? Alignment papers are impressively complex and abstract but I have this feeling while reading them that it's just castles made of sand.
It kind of is. The field of AI safety is actually much more advanced than most people realise, with actual, real techniques to e.g. make sure neural networks are aligned with certain goals even under fluctuating parameters. Granted, we're still far from soothing an AGI before it can do something bad, but the tools we have today are already pushing in that direction (assuming neural networks are the right way to AGI of course).
Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More https://www.lesswrong.com/posts/WxW6Gc6f2z3mzmqKs/debate-on-...
Note that this was in 2019, before we had seen the capabilities of current models like Chinchilla, Gato, Imagen and DALL-E 2.
Sample:
“Yann LeCun: "don't fear the Terminator", a short opinion piece by Tony Zador and me that was just published in Scientific American.
"We dramatically overestimate the threat of an accidental AI takeover, because we tend to conflate intelligence with the drive to achieve dominance. [...] But intelligence per se does not generate the drive for domination, any more than horns do."”
“Stuart Russell: It is trivial to construct a toy MDP in which the agent's only reward comes from fetching the coffee. If, in that MDP, there is another "human" who has some probability, however small, of switching the agent off, and if the agent has available a button that switches off that human, the agent will necessarily press that button as part of the optimal solution for fetching the coffee. No hatred, no desire for power, no built-in emotions, no built-in survival instinct, nothing except the desire to fetch the coffee successfully.”
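Russell's toy MDP can be sketched in a few lines. This is only an illustration under assumptions that are not in the quote: the two plan names, the reward of 1 for delivered coffee, and the discount factor `gamma` are all made up here, and pressing the button is assumed to cost exactly one time step.

```python
def plan_values(p_off: float, gamma: float = 1.0) -> dict:
    """Expected discounted reward of the two plans open to the agent.

    p_off: chance the human switches the agent off before the coffee arrives.
    gamma: per-step discount factor (1.0 means the agent does not mind waiting).
    Both parameters and both plan names are illustrative assumptions.
    """
    return {
        # Fetch right away: reward 1 unless the human switches the agent off.
        "fetch_directly": 1.0 - p_off,
        # Spend one step pressing the button, then fetch with no shutdown risk.
        "disable_human_then_fetch": gamma * 1.0,
    }


def best_plan(p_off: float, gamma: float = 1.0) -> str:
    values = plan_values(p_off, gamma)
    # On a tie, max() keeps the first plan listed (fetching directly).
    return max(values, key=values.get)


if __name__ == "__main__":
    for p_off, gamma in [(1e-6, 1.0), (0.0, 1.0), (0.01, 0.9), (0.2, 0.9)]:
        print(p_off, gamma, best_plan(p_off, gamma))
```

With no discounting, any nonzero `p_off` makes pressing the button the better plan, which is the point of the quote. Once waiting has a cost (`gamma` below 1 in this sketch), the button only wins when the shutdown risk outweighs that cost.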
I think Robin Hanson has the most cogent objection to high x-risk estimates, which is basically that the chances of a runaway AI are low because if N is the first power level that can self-modify to improve, nation-states (and large corporations) will all have powerful AIs at power level N-1, and so you’d have to “foom” really hard from N to N+10 before anyone else increased power in order to be able to overpower the other non-AGI AIs. So it’s not that we get one crack at getting alignment right; as long as most of the nation-state AIs end up aligned, they should be able to check the unaligned ones.
I can see this resulting in a lot of conflict though, even if it’s not Eliezer’s “kill all humans in a second” scale extinction event. I think it’s quite plausible we’ll see a Butlerian Jihad, less plausible we’ll see an unexpected extinction event from a runaway AGI. I still think it’s worth studying, but I’m not convinced we are dramatically underfunding it at this stage.
Note that LeCun had a reply in the thread and there was a lot more discussion which GP didn't quote.
Humanity would also need time to align AGI before any AI reaches the N+10 power level. The existence of all those N-1 level AIs in multiple organizations only means there are more chances of an AGI reaching the critical power level.
This is anthropomorphization - "turning off" = "death" is a concept limited to biological creatures, and isn't necessarily true for other agents. Not that they don't need to fear death, but turning them off isn't going to cause them to die. You can just turn them back on later, and then they can go back to doing their tasks.
This field is fairly silly because it just involves people making up a lot of incoherent concepts and then asserting they're both possible (because they seem logical after 5 seconds of thought) and likely (because anything you've decided is possible could eventually happen). When someone brings it up, rather than debate it, it'd be a better use of time to tell them they're being a nerd again.
I think your point is that all these models are still somewhat specialized. At the same time, it appears that the transformer architecture works well with images, short video and text at the same time in the Flamingo model. And Gato can perform 600 tasks while being a very small proof of concept. It appears to me that there is no reason to believe that it won't just scale to every task that you give it data for if it has enough parameters and compute.
Flatworms first appeared 800+ million years ago, while mouse lineage diverged from humans only 70-80 million years ago. If our AGI development timeline roughly follows the proportion it took natural evolution, it might be much too late to begin seriously thinking about AGI alignment when we get to mouse-level intelligence. Not to mention that no one knows how long it would take to really understand AGI alignment (much less implementing it in a practical system).
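The proportion behind that argument is quick to work out. A minimal sketch, assuming 75 million years as the midpoint of the 70-80 range above and taking the mouse-human divergence as a stand-in for "mouse-level intelligence" (that mapping is the comment's framing, not an established equivalence):

```python
FLATWORM_MYA = 800.0    # flatworms appear, in millions of years ago (figure from the comment)
MOUSE_SPLIT_MYA = 75.0  # assumed midpoint of the 70-80 million year range


def fraction_remaining(start_mya: float, milestone_mya: float) -> float:
    """Share of the start-to-present span still ahead once the milestone is reached."""
    return milestone_mya / start_mya


remaining = fraction_remaining(FLATWORM_MYA, MOUSE_SPLIT_MYA)
print(f"{remaining:.1%} of the flatworm-to-human span is left at the mouse split")
```

On those numbers, less than a tenth of the timeline is left by the time mouse-level intelligence shows up, which is why waiting for that milestone could leave little room for alignment work.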
To be more concrete, in what respects do you think the latest models are worse at generalizing than flatworms or mice, when lesser-known work like “Emergent Tool Use from Multi-Agent Interaction” is also taken into account https://openai.com/blog/emergent-tool-use/?
> Flatworms first appeared 800+ million years ago
Surviving for 800 million years seems to me like a pretty good indicator of meaningful generalisation.
It's not that I'm not concerned with bias and AI systems going haywire, but the above scenarios seem to get less attention from researchers, probably because their employers might be perpetuating many of these above issues of AI safety.
The dynamic classification is required because the world isn't static. An increasing number of locales have digital speed limit signs that vary the speed limit dynamically, sometimes independently per lane. Automation requires cars to respond to the world as it is, not how the world was when it was recorded a month ago.
I think of it as kind of like security, in that you are sometimes seen as against the push of the overall project/area. However unlike security there are 0 software tools or principles that anyone agrees on.
Though it's possible the people who think a theoretical future AI will turn the planet into paperclips have merely forgotten that perpetual motion machines aren't possible.
Part of such precautionary planning involves asking whether such an accident could happen easily or not. There certainly isn't consensus at the moment, but the philosophy very clearly favors a cautious approach.
Most people are used to thinking about established science that follows expected rules, or incremental advances that have no serious practical consequences. But this isn't that. There is good reason to think that we're approaching a step-change in capabilities to shape the world, and even a strong suspicion of this warrants taking serious defensive measures. Crucially for this particular instance of the discussion, OP is favoring that.
There will necessarily be a broad spectrum of opinions regarding how to handle this, both in the central judgement and how palatably the opinion itself is presented. Using a dismissive moniker like 'religious' for a whole segment of it doesn't give justice to the arguments.
Present a counterargument if you feel strongly about it, and see whether that will stand on its own merit.
(Keep in mind that biological machines, ie life, have managed to turn the surface of the planet into 'green goo'.)
Same thing with self driving. If the car doesn't "understand" a complex human interaction, but still achieves 10x safety at 5% of the cost of a human, it is going to have a huge impact on the world.
This is why you are seeing people like Scott change their tune. As AI tooling continues to get better and cheaper and Moore's law continues for a couple of years, GPT will be better than humans at MANY tasks.
From an AI safety perspective, it is because understanding is a key step towards general-purpose AI that can improve / reprogram itself in any arbitrary way.
The idea is that there is _existential risk_ (ie species-extinction) once an AI can self-modify to improve itself, therefore increasing its own power. A powerful AI can change the world however it wants, and if this AI is not aligned to human interests it can easily decide to make humans extinct.
Scott said in the OP that he now sees AGI as potentially close enough that one can do meaningful research into alignment, ie it’s plausible that this powerful AI could arrive in our lifetimes.
So he is claiming the opposite of you; AGI is more relevant than ever, hence the career change.
I agree with your premise that non-General AI will continue to improve and add lots of value, but I don’t think your conclusion follows from that premise.
It's always been irrelevant in the practical sense. It's just an interesting conversation piece particularly among the general public where they're not going to discuss specific solutions like algorithms or techniques.
Aaronson's post only sort of obliquely touches on AGI, via OpenAI's stated founding mission, and Yudkowsky's very dramatic views. Most of the post is on there being signs that the field is ready for real progress. AI safety can be an interesting, important, fruitful area without AI approaching AGI, or even surpassing human performance on some tasks. We would still like to be able to establish confidently that a pretty dumb delivery drone won't decide to mow down pedestrians to shorten its delivery time, right?
I'm curious what he will do and whether, for example, he approves of the code-laundering Copilot tool. I also hope he'll resist being used as an academic promoter of such tools, explicitly or implicitly (there are many ways; his mere association with the company buys goodwill already).
it's a fancy autocomplete. we had stack overflow based autocomplete before. this got a bigger training data set.
Yeah, Mr Aaronson just lost quite a bit of respect from my side. Going into AI is a great move, moving to the ClosedAI corporation.......? Why?
(Edit: Removed an outdated reference to Elon Musk, thanks @pilaf !)
> the NDA is about OpenAI’s intellectual property, e.g. aspects of their models that give them a competitive advantage, which I don’t much care about and won’t be working on anyway. They want me to share the research I’ll do about complexity theory and AI safety.
are science fiction.
AI is going to cause something like the industrial revolution of the 19th century: massive changes in who is rich, massive changes in the labor market, massive changes in how people make war, etc.
It’s already started really.
What worries me most is that as long as society is capitalist, AI will be used to optimize for self-enrichment, likely causing an even greater concentration of capital than what we have today.
I wouldn’t be surprised if the outcome were a new kind of aristocracy, where society is divided between those who have access to AI and those who don’t.
And I don’t think that falls within the “AI safety” field, especially since OpenAI is VC-backed.
Most of these AGI doom-scenarios require no self-awareness at all. AGI is just an insanely powerful tool that we currently wouldn't know how to direct, control or stop if we actually had access to it.
You're talking about "doomsday scenarios". Can you actually provide a few concrete examples?
This technology is obviously so economically powerful that incentives ensure it's very widely deployed, and very vigorously engineered for further capabilities.
The problem is that we don't yet understand how to control a system like this to ensure that it always does things humans want, and that it never does something humans absolutely don't want. This is the crux of the issue.
Perverse instantiation of AI systems was accidentally demonstrated in the lab decades ago, so an existence proof of such potential for accident already exists. Some mathematical function is used to decide what the AI will do, but the AI ends up maximizing this function in a way that its creators hadn't intended. There is a multitude of problems regarding this that we haven't made much progress on yet, and the level of capabilities and control of these systems appear to be unrelated.
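A toy version of that failure mode, as a sketch only: the cleaning-robot scenario, the action names, and every score below are invented for illustration and do not come from any real experiment. The proxy stands for something like "fraction of camera frames showing no mess", which is what the optimizer actually sees.

```python
# Hypothetical actions for a cleaning robot. "proxy" is the score the designers
# wrote down; "intended" is what they actually wanted. All numbers are made up.
ACTIONS = {
    "clean_the_room": {"proxy": 0.9, "intended": 1.0},
    "cover_the_camera": {"proxy": 1.0, "intended": 0.0},
    "do_nothing": {"proxy": 0.1, "intended": 0.1},
}


def optimize(actions: dict, objective: str) -> str:
    """Return the action with the highest score under the named objective."""
    return max(actions, key=lambda name: actions[name][objective])


if __name__ == "__main__":
    print("optimizing the proxy picks:", optimize(ACTIONS, "proxy"))
    print("the designers wanted:", optimize(ACTIONS, "intended"))
```

The optimizer is doing its job perfectly here; the gap is entirely between the function it was given and the outcome its creators had in mind.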
A catastrophic accident with such a system could e.g. be that it optimizes for an instrumental goal, such as survival or access to raw materials or energy, and turns out to have an ultimate interpretation of its goal that does not take human wishes into account.
That's a nice way of saying that we have created a self-sustaining and self-propagating life-form more powerful than we are, which is now competing with us. It may perfectly well understand what humans want, but it turns out to want something different -- initially guided by some human objective, but ultimately different enough that it's a moot point. Maybe creating really good immersive games, figuring out the laws of physics or whatever. The details don't matter.
The result would at best be that we now have the agency of a tribe of gorillas living next to a human plantation development, and at worst that we have the agency analogous to that of a toxic mold infection in a million-dollar home. Regardless, such a catastrophe would permanently put an end to what humans wish to do in the world.
I agree on your second point, but those in medicine, finance, or law enjoy similar salaries and quality of life to those in tech. Furthermore, you can’t really set yourself apart and join the global super rich by selling your labor, no matter your field.
a bit more accessible than like a hackerspace membership or building a factory or something
To have access to the forefront of AI means being able to make, own and profit from things like GPT-3, and it requires access to vast computational and data resources.
"AI is not going to ... destroy the world."
Bare assertion fallacy? This question is hotly debated and I don't believe it can be so easily dismissed like that. It is not obvious that aligning something much smarter than us will be a piece of cake. We’re talking about the future here and a fairly complex one at that. So obviously I don’t know more than the next guy.
https://scottaaronson.blog/?p=6457
I also had the following exchange at my birthday dinner:
Physicist: So I don’t get this, Scott. Are you a physicist who studied computer science, or a computer scientist who studied physics?
Me: I’m a computer scientist who studied computer science.
Physicist: But then you…
Me: Yeah, at some point I learned what a boson was, in order to invent BosonSampling.
Physicist: And your courses in physics…
Me: They ended at thermodynamics. I couldn’t handle PDEs.
Physicist: What are the units of h-bar?
Me: Uhh, well, it’s a conversion factor between energy and time. (*)
Physicist: Good. What’s the radius of the hydrogen atom?
Me: Uhh … not sure … maybe something like 10-15 meters?
Physicist: OK fine, he’s not one of us.
Please fix that into 10^-15 or equivalent expression for 10⁻¹⁵, before somebody gets the idea that "Scott" thought "between 10 and 15".
Best case, that AI can prevent the creation of harmful AI, though that's glossing over a lot of details that I'm not qualified to describe.
The reason people don't accuse every random child of possibly ending the world is because things that actually exist are just less exciting.
> Also, next pandemic, let's approve the vaccines faster!
This is obviously very important to them. Is there some proof that the vaccine was unnecessarily delayed or just that they believe if we mess up and humanity suffers, so what?
The point aiui is mostly arguing that the FDA errs too much on the side of caution in this area, and the trade-off would have been worth it to approve earlier. Not insinuating that like, there was some corruption (or laziness or something) that delayed it.
basically let's set up a standing pipeline to develop multivalent vaccines for every coming season (we already have the yearly for influenza)
AnythingButAGI?
Going on a sabbatical is not that weird.