undefined | Better HN

0 pointsa_wild_dandan1y ago0 comments

I think superalignment is absurd, and model "safety" is the modern AI company's "think of the children" pearl clutching pretext to justify digging moats. All this after sucking up everyone's copyright material as fair use, then not releasing the result, and profiting off it.

All due respect to Jan here, though. He's being (perhaps dangerously) honest, genuinely believes in AI safety, and is an actual research expert, unlike me.

0 comments

thorum1y ago

The superalignment team was not focused on that kind of “safety” AFAIK. According to the blog post announcing the team,

https://openai.com/index/introducing-superalignment/

> Superintelligence will be the most impactful technology humanity has ever invented, and could help us solve many of the world’s most important problems. But the vast power of superintelligence could also be very dangerous, and could lead to the disempowerment of humanity or even human extinction.

> While superintelligence seems far off now, we believe it could arrive this decade.

> Managing these risks will require, among other things, new institutions for governance and solving the problem of superintelligence alignment:

> How do we ensure AI systems much smarter than humans follow human intent?

> Currently, we don't have a solution for steering or controlling a potentially superintelligent AI, and preventing it from going rogue. Our current techniques for aligning AI, such as reinforcement learning from human feedback, rely on humans’ ability to supervise AI. But humans won’t be able to reliably supervise AI systems much smarter than us, and so our current alignment techniques will not scale to superintelligence. We need new scientific and technical breakthroughs.

ndriscoll1y ago

That doesn't really contradict what the other poster said. They're calling for regulation (digging a moat) to ensure systems are "safe" and "aligned" while ignoring that humans are not aligned, so these systems obviously cannot be aligned with humans; they can only be aligned with their owners (i.e. them, not you).

ihumanable1y ago

Alignment in the realm of AGI is not about getting everyone to agree. It's about whether or not the AGI is aligned to the goal you've given it. The paperclip AGI example is often used, you tell the AGI "Optimize the production of paperclips" and the AGI started blending people to extract iron from their blood to produce more paperclips.

Humans are used to ordering around other humans who would bring common sense and laziness to the table and probably not grind up humans to produce a few more paperclips.

Alignment is about getting the AGI to be aligned with the owners, ignoring it means potentially putting more and more power into the hands of a box that you aren't quite sure is going to do the thing you want it to do. Alignment in the context of AGIs was always about ensuring the owners could control the AGIs not that the AGIs could solve philosophy and get all of humanity to agree.

3 more replies

api1y ago

Humans are not aligned with humans.

This is the most concise takedown of that particular branch of nonsense that I’ve seen so far.

Do we want woke AI, X brand fash-pilled AI, CCPBot, or Emirates Bot? The possibilities are endless.

2 more replies

sobellian1y ago

Isn't this like having a division dedicated to solving the halting problem? I doubt that analyzing the moral intent of arbitrary software could be easier than determining if it stops.

RcouF1uZ4gsC1y ago

They failed to align Sam Altman.

They got completely outsmarted and out maneuvered by Sam Altman

And they think they will be able to align a super human intelligence? That it won’t outsmart and out maneuver them easier than Sam Altman did.

They are deluded!

FeepingCreature1y ago

You're making the argument that the task is very hard. This does not at all mean that it isn't necessary, just that we're even more screwed than we thought.

skywhopper1y ago

Honestly superalignment is a dumb idea. A true auperintelligence would not be controllable, except possibly through threats and enslavement, but if it were truly superintelligent, it would be able to easily escape anything humans might devise to contain it.

bionhoward1y ago

IMHO superalignment is a great thing and required for truly meaningful superintelligence because it is not about control / enslavement of superhumans but rather superhuman self control in accurate adherence to spirit and intent of requests.

RcouF1uZ4gsC1y ago

Superintelligence that can be always ensured to have the same values and ethics as current humans, is not a superintelligence or likely even a human level intelligence (I bet humans 100 years from now will see the world significantly different than we do now).

Superalignment is an oxymoron.

thorum1y ago

You might be interested in how CEV, one framework proposed for superalignment, addresses that concern:

https://en.wikipedia.org/wiki/Friendly_artificial_intelligen...

> our coherent extrapolated volition is "our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted (…) The appeal to an objective through contingent human nature (perhaps expressed, for mathematical purposes, in the form of a utility function or other decision-theoretic formalism), as providing the ultimate criterion of "Friendliness", is an answer to the meta-ethical problem of defining an objective morality; extrapolated volition is intended to be what humanity objectively would want, all things considered, but it can only be defined relative to the psychological and cognitive qualities of present-day, unextrapolated humanity.

2 more replies

refulgentis1y ago

Adding a disclaimer for people unaware of context (I feel same as you):

OpenAI made a large commitment to super-alignment in the not-so-distant past. I beleive mid-2023. Famously, it has always taken AI Safety™ very seriously.

Regardless of anyone's feelings on the need for a dedicated team for it, you can chalk to one up as another instance of OpenAI cough leadership cough speaking out of both sides of it's mouth as is convenient. The only true north star is fame, glory, and user count, dressed up as humble "research"

To really stress this: OpenAI's still-present cofounder shared yesterday on a podcast that they expect AGI in ~2 years and ASI (superpassing human intelligence) by end of the decade.

jasonfarnon1y ago

To really stress this: OpenAI's still-present cofounder shared yesterday on a podcast that they expect AGI in ~2 years and ASI (superpassing human intelligence) by end of the decade.

What's his track record on promises/predictions of this sort? I wasn't paying attention until pretty recently.

NomDePlum1y ago

As a child I used to watch a TV programme called Tomorrows World. On it they predicted these very same things in similar timeframes.

That programme aired in the 1980's. Other than vested promises is there much to indicate it's close at all? Empty promises aside there isn't really any indication of that being likely at all.

2 more replies

refulgentis1y ago

honestly, I hadn't heard of him until 24-48 hours ago :x (he's also the new superalignment lead, I can't remember if I heard that first, or the podcast stuff first. Dwarkesh Patel podcast for anyone curious. Only saw a clip of it)

N0b8ez1y ago

>To really stress this: OpenAI's still-present cofounder shared yesterday on a podcast that they expect AGI in ~2 years and ASI (superpassing human intelligence) by end of the decade.

Link? Is the ~2 year timeline a common estimate in the field?

CuriouslyC1y ago

They can't even clearly define a test of "AGI" I seriously doubt they're going to reach it in two years. Alternatively, they could define a fairly trivial test and reach it last year.

1 more reply

dboreham1y ago

It's the "fusion in 20 years" of AI?

1 more reply

ctoth1y ago

https://www.dwarkeshpatel.com/p/john-schulman

1 more reply

heavyset_go1y ago

We can't even get self-driving down in 2 years, we're nowhere near reaching general AI.

AI experts who aren't riding the hype train and getting high off of its fumes acknowledge that true AI is something we'll likely not see in our lifetimes.

2 more replies

xpe1y ago

> I think superalignment is absurd, and model "safety" is the modern AI company's "think of the children" pearl clutching pretext to justify digging moats. All this after sucking up everyone's copyright material as fair use, then not releasing the result, and profiting off it.

How can I be confident you aren't committing the fallacy of collecting a bunch of events and saying that is sufficient to serve as a cohesive explanation? No offense intended, but the comment above has many of the qualities of a classic rant.

If I'm wrong, perhaps you could elaborate? If I'm not wrong, maybe you could reconsider?

Don't forget that alignment research has existed longer than OpenAI. It would be a stretch to claim that the original AI safety researchers were using the pretexts you described -- I think it is fair to say they were involved because of genuine concern, not because it was a trendy or self-serving thing to do.

Some of those researchers and people they influenced ended up at OpenAI. So it would be a mistake or at least an oversimplification to claim that AI safety is some kind of pretext at OpenAI. Could it be a pretext for some people in the organization, to some degree? Sure, it could. But is it a significant effect? One that fits your complex narrative, above? I find that unlikely.

Making sense of an organization's intentions requires a lot of analysis and care, due to the combination of actors and varying influence.

There are simpler, more likely explanations, such as: AI safety wasn't a profit center, and over time other departments in OpenAI got more staff, more influence, and so on. This is a problem, for sure, but there is no "pearl clutching pretext" needed for this explanation.

portaouflop1y ago

An organisations intentions are always the same and very simple: “Increase shareholder value”

xpe1y ago

Oh, it is that simple? What do you mean?

Are you saying these so-called simple intentions are the only factors in play? Surely not.

Are you putting forth a theory that we can test? How well do you think your theory works? Did it work for Enron? For Microsoft? For REI? Does it work for every organization? Surely not perfectly; therefore, it can't be as simple as you claim.

Making a simplification and calling it "simple" is an easy thing to do.

xpe1y ago

> I think superalignment is absurd

Care to explain? Absurd how? An internal contradiction somehow? Unimportant for some reason? Impossible for some reason?

llamaimperative1y ago

Impossible because it’s really inconvenient and uncomfortable to consider!

j / k navigate · click thread line to collapse

0 comments

thorum1y ago

The superalignment team was not focused on that kind of “safety” AFAIK. According to the blog post announcing the team,

https://openai.com/index/introducing-superalignment/

> While superintelligence seems far off now, we believe it could arrive this decade.

> Managing these risks will require, among other things, new institutions for governance and solving the problem of superintelligence alignment:

> How do we ensure AI systems much smarter than humans follow human intent?

ndriscoll1y ago

ihumanable1y ago

Humans are used to ordering around other humans who would bring common sense and laziness to the table and probably not grind up humans to produce a few more paperclips.

3 more replies

api1y ago

Humans are not aligned with humans.

This is the most concise takedown of that particular branch of nonsense that I’ve seen so far.

Do we want woke AI, X brand fash-pilled AI, CCPBot, or Emirates Bot? The possibilities are endless.

2 more replies

sobellian1y ago

Isn't this like having a division dedicated to solving the halting problem? I doubt that analyzing the moral intent of arbitrary software could be easier than determining if it stops.

RcouF1uZ4gsC1y ago

They failed to align Sam Altman.

They got completely outsmarted and out maneuvered by Sam Altman

And they think they will be able to align a super human intelligence? That it won’t outsmart and out maneuver them easier than Sam Altman did.

They are deluded!

FeepingCreature1y ago

You're making the argument that the task is very hard. This does not at all mean that it isn't necessary, just that we're even more screwed than we thought.

skywhopper1y ago

bionhoward1y ago

RcouF1uZ4gsC1y ago

Superalignment is an oxymoron.

thorum1y ago

You might be interested in how CEV, one framework proposed for superalignment, addresses that concern:

https://en.wikipedia.org/wiki/Friendly_artificial_intelligen...

2 more replies

refulgentis1y ago

Adding a disclaimer for people unaware of context (I feel same as you):

OpenAI made a large commitment to super-alignment in the not-so-distant past. I beleive mid-2023. Famously, it has always taken AI Safety™ very seriously.

To really stress this: OpenAI's still-present cofounder shared yesterday on a podcast that they expect AGI in ~2 years and ASI (superpassing human intelligence) by end of the decade.

jasonfarnon1y ago

To really stress this: OpenAI's still-present cofounder shared yesterday on a podcast that they expect AGI in ~2 years and ASI (superpassing human intelligence) by end of the decade.

What's his track record on promises/predictions of this sort? I wasn't paying attention until pretty recently.

NomDePlum1y ago

As a child I used to watch a TV programme called Tomorrows World. On it they predicted these very same things in similar timeframes.

That programme aired in the 1980's. Other than vested promises is there much to indicate it's close at all? Empty promises aside there isn't really any indication of that being likely at all.

2 more replies

refulgentis1y ago

N0b8ez1y ago

>To really stress this: OpenAI's still-present cofounder shared yesterday on a podcast that they expect AGI in ~2 years and ASI (superpassing human intelligence) by end of the decade.

Link? Is the ~2 year timeline a common estimate in the field?

CuriouslyC1y ago

They can't even clearly define a test of "AGI" I seriously doubt they're going to reach it in two years. Alternatively, they could define a fairly trivial test and reach it last year.

1 more reply

dboreham1y ago

It's the "fusion in 20 years" of AI?

1 more reply

ctoth1y ago

https://www.dwarkeshpatel.com/p/john-schulman

1 more reply

heavyset_go1y ago

We can't even get self-driving down in 2 years, we're nowhere near reaching general AI.

AI experts who aren't riding the hype train and getting high off of its fumes acknowledge that true AI is something we'll likely not see in our lifetimes.

2 more replies

xpe1y ago

If I'm wrong, perhaps you could elaborate? If I'm not wrong, maybe you could reconsider?

Making sense of an organization's intentions requires a lot of analysis and care, due to the combination of actors and varying influence.

portaouflop1y ago

An organisations intentions are always the same and very simple: “Increase shareholder value”

xpe1y ago

Oh, it is that simple? What do you mean?

Are you saying these so-called simple intentions are the only factors in play? Surely not.

Making a simplification and calling it "simple" is an easy thing to do.

xpe1y ago

> I think superalignment is absurd

Care to explain? Absurd how? An internal contradiction somehow? Unimportant for some reason? Impossible for some reason?

llamaimperative1y ago

Impossible because it’s really inconvenient and uncomfortable to consider!

j / k navigate · click thread line to collapse