> Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.
I'm curious whether they have continued to scale up model size/compute significantly, or whether they've managed meaningful innovations there instead.
I just skimmed the paper, but it seems they also omit details about how they actually feed the images in, which is a shame for a curious outside observer.
Conversely, if all actors are given equal access at the same time, no such lone bad actor can be in a position to maintain a hidden advantage.
OpenAI's actions continue to be more than merely annoying.
It's not a zero-sum game where you can level the playing field and say everything's good.
Leveling the playing field won't instantly make everyone safe, but leaving it uneven certainly doesn't either.
What you are looking for is a publication known as "Industrial Society and Its Future".
> 1995 anti-technology essay by Ted Kaczynski… contends that the Industrial Revolution began a harmful process of natural destruction brought about by technology, while forcing humans to adapt to machinery, creating a sociopolitical order that suppresses human freedom and potential.
People have spilled a lot more ink than that on this subject! And most of them weren't also terrorists.
> A minority of the problems in the exams were seen by the model during training
A minority can be 49%. They do mention they tested against newly available practice exams, but those are often based on older real exam questions, which may have been discussed extensively in forums that were in the training data. Now that it's for-profit ClosedAI, we have to treat each claim somewhat adversarially: assume "minority" means 49% when the bigger number benefits them, and 0.1% when the smaller number makes the sales pitch to the Microsoft board look better.
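To make the concern concrete: contamination checks in reports like this are typically verbatim substring-overlap tests, and a check along those lines (a toy version below; the helper names, normalization, and parameters are my assumptions, not the report's actual pipeline) will happily mark a question as "unseen" even when a lightly reworded copy of it was discussed at length in forums inside the training data.

```python
# Toy substring-overlap contamination check (illustrative only; the
# normalization, sample counts, and names are assumptions, not the
# report's actual pipeline).
import random

def normalize(text: str) -> str:
    # Strip case and punctuation so formatting can't hide a verbatim copy.
    return "".join(c for c in text.lower() if c.isalnum())

def is_contaminated(question: str, training_docs: list[str],
                    n_samples: int = 3, sub_len: int = 50) -> bool:
    q = normalize(question)
    if len(q) <= sub_len:
        samples = [q]
    else:
        population = range(len(q) - sub_len + 1)
        starts = random.sample(population, min(n_samples, len(population)))
        samples = [q[s:s + sub_len] for s in starts]
    return any(any(s in normalize(doc) for s in samples)
               for doc in training_docs)

# A paraphrase defeats it entirely: is_contaminated("What is the
# integral of x^2?", corpus) can be False even if "compute the
# antiderivative of x squared" appears all over the corpus.
```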
It is unsafe not to release the source along with the service. Withholding it incentivizes competitors to sacrifice their own safety research in favor of speed to market. Instead of getting shared, safe tools, we get a bunch of for-profit corporations pushing their proprietary, unsafe tools.
Preventing this situation was the original reason to set up OpenAI. Speedrun to the dark side.
It's almost certainly a VQ-VAE-style encoding of the image itself into a sequence of tokens, as was done by DALL-E 1, CM3, Gato and a whole bunch of more recent models. It's the very obvious thing to do, and their context window is more than large enough now.
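For the curious, here's a toy sketch of what that would look like. This is entirely speculation about the mechanism, not anything the report confirms, and all names, sizes, and dimensions are made up: a learned encoder maps the image to a grid of latents, each latent snaps to its nearest codebook entry, and the resulting discrete indices get spliced into the text token stream.

```python
# Toy VQ-VAE-style image tokenizer (speculative sketch; names, sizes,
# and the codebook size are made up for illustration).
import torch
import torch.nn as nn

class ToyImageTokenizer(nn.Module):
    def __init__(self, codebook_size=8192, dim=64):
        super().__init__()
        # Downsample a 256x256 RGB image to a 32x32 grid of latents.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, dim, kernel_size=4, stride=4),    # 256 -> 64
            nn.ReLU(),
            nn.Conv2d(dim, dim, kernel_size=2, stride=2),  # 64 -> 32
        )
        self.codebook = nn.Embedding(codebook_size, dim)

    def forward(self, image):                    # (B, 3, 256, 256)
        z = self.encoder(image)                  # (B, dim, 32, 32)
        z = z.flatten(2).transpose(1, 2)         # (B, 1024, dim)
        # Snap each latent to its nearest codebook vector; the indices
        # are the discrete "image tokens".
        dists = torch.cdist(z, self.codebook.weight.unsqueeze(0))
        return dists.argmin(dim=-1)              # (B, 1024) token ids

tokenizer = ToyImageTokenizer()
image_tokens = tokenizer(torch.randn(1, 3, 256, 256))  # 1024 ids per image
# Offset these ids past the text vocabulary and splice them into the
# sequence, e.g. [text tokens, <img>, image tokens, </img>, text tokens].
```

The attraction is that once images are just tokens, the transformer itself needs no architectural changes at all.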
Safety has nothing to do with it. It's an easy tack-on for them because of the popular fear of AGI.
It's all about power over the market.
Cringe.
Let's be honest: the real reason for the closedness is the former, the competitive landscape.
As a beginner in the NLP world, this omission may actually serve a purpose for me: it hides the complexity behind building such models. Numbers like xyzB parameters and 12K A100s are scary, so I can still dream of building such a system one day. This story [0] and this one [1] describe some extremely complex edge cases that a beginner would never have thought of, and that might have scared them off before starting had they known the real cost.
We may, however, still be able to infer some details (probably in the future) from how Microsoft rearranged its infrastructure to accommodate OpenAI's training [2].
_________________
[0]. https://www.construct.net/en/blogs/ashleys-blog-2/simple-sof...
[1]. https://prog21.dadgum.com/29.html
[2]. https://www.theverge.com/2023/3/13/23637675/microsoft-chatgp...