OpenAI's plans according to sama (opens in new tab)

(humanloop.com)

313 pointsrazcle2y ago258 comments

258 comments

> He reiterated his belief in the importance of open source and said that OpenAI was considering open-sourcing GPT-3. Part of the reason they hadn’t open-sourced yet was that he was skeptical of how many individuals and companies would have the capability to host and serve large LLMs.

Am I reading this right? "We're not open sourcing GPT-3 because we don't think it would be useful to anyone else"

TapWaterBandit2y ago

When you stop listening to what Sam Altman says and just focus on what he does, you can see the guy is a bit of a snake. Greedy power-hungry man imho.

rafark2y ago

I’ve tried very hard to like him because like it or not, ChatGPT has revolutionized the AI industry but he’s so hypocritical I just can’t stand him.

1 more reply

seattle_spring2y ago

I know it doesn't seem related on the surface, but I've found all startup CEOs who ban remote work to be snake-like and dishonest. Interesting coincidence at least.

2 more replies

courseofaction2y ago

I can't judge the individual, but his words do not align with the company's actions in the slightest.

>> 4. OpenAI will avoid competing with their customers — other than with ChatGPT

On this I would not bet a dime.

1 more reply

neximo642y ago

In your world how would you consider Sam Altman having no equity in OpenAI? And everyone finding out after it had a viral hit

2 more replies

mellosouls2y ago

Sam Altman is responsible for leading the team that have revolutionised AI in its position within society.

There is plenty to criticise OpenAI for but what he and they have achieved is extraordinary, and there is no need for that sort of toxic personal attack.

4 more replies

bane2y ago

Most of the companies I work with are actively putting in place policies to prevent employees from using OpenAI's service because nobody wants to send their proprietary IP to them.

Almost all of these companies have the technical ability, desire, and means to self-host for their employee community. Imagine the internal coup for CTO/CIOs everywhere to buy whatever is the latest Nvidia GPU cluster box, stick it in the on-prem datacenter, load a licensed GPT model and provide "AI as a service to our employees".

Except what's happening is everybody is looking at buying the box from Nvidia, and sticking a large actually open model on it and simply ignoring OpenAI.

f6v2y ago

> Almost all of these companies have the technical ability, desire, and means to self-host for their employee community.

Well, one of companies I worked for could have hosted a canary service for cron jobs. But we bought it instead of building because we were focused on building features. And here you’re talking about hosting an entire LLM.

1 more reply

candiddevmike2y ago

OpenAI: Regulations must be passed to protect our moat

Also OpenAI: Meta is pissing in our moat, let's drop a hint about open sourcing our shit too!

razcleOP2y ago

I think I worded this poorly. What he said was that a lot of people say they want open-source models but they underestimate how hard it is to serve them well. So he wondered how much real benefit would come from open-sourcing them.

I think this is reasonable. Giving researchers access is great but for most small companies they're likely better off having a service provider manage inference for them rather than navigate the infra challenge.

roganartu2y ago

The beauty of open source is that the community will either figure out how to make it easier, or collectively decide it’s not worth the effort. We saw this with stable diffusion, and we are seeing it with all the existing OSS LLMs.

“It’s too hard, trust us” doesn’t really make sense in that context. If it is indeed too hard for small orgs to self host then they won’t. Hiding behind the guise of protecting these people by not open sourcing it seems a bit disingenuous.

choppaface2y ago

Here is how hard it is to serve and use LLMs: https://github.com/ggerganov/llama.cpp

1 more reply

solarkraft2y ago

You're saying the same thing.

"I'm not sharing my chocolate with you because you probably wouldn't like it"

antupis2y ago

If it goes same way as other open sourced models it takes about 5 days that someone will get it running at m1.

rushingcreek2y ago

If he says he's inclined to open-source GPT-3, I don't see any good arguments not in favor of giving startups the choice of how they can run inference.

paxys2y ago

More like – it won't be useful to small-time developers (since they won't have the capability to host and run it themselves) and so all the benefits will be reaped by AWS and other large players.

sheepscreek2y ago

This is what I understood as well. They want to either democratize adoption or not release it. The last thing they/anyone wants is for another BigCo or Govt to h take undue advantage of the model (through fine-tuning?) when others cannot.

That said, I can imagine a GPTQ/4-bit quantized model to be smaller and easier to run on somewhat commodity clusters?

Or it could run with GGML/llama.cpp on a cloud instance with a TB of RAM?

After seeing what people were able to do with LLaMA, I am positive that the community will find a way to run it - albeit with some loss in performance.

It would be truly amazing if they used their computing to develop quantized models as well.

renonce2y ago

A big chunk of developments based on Facebook’s LLaMA model are by small-time developers and individuals, not large players. Facebook has already shown a viable way to release models in the way you described.

lostmsu2y ago

If you really need, 170B parameter model can infer a few tokens per minute on commodity hardware.

sebzim45002y ago

It is weird, but GPT-3 is worse than much smaller LLaMA models so I doubt it would see much use anyway.

killjoywashere2y ago

How do you measure this? Pointers to papers would be very helpful

1 more reply

flangola72y ago

Are you referring to DaVinci or ChatGPT-3.5

1 more reply

braindead_in2y ago

It is a shame that Sama does not believe in Open Source. The community can solve their GPU bottleneck issue by making it run on CPUs and edge devices in a matter of days.

barbariangrunge2y ago

If small organizations and teams can’t use it, then open sourcing it mostly just benefits big tech

That’s not ideal

How does open source licensing work with respect to trained ai models anyway? Is something like the mit license even that valuable here? Or is it?

sokoloff2y ago

If the only barrier to a small team/org using it is cost/effort of hosting (as opposed to some licensing shenanigans), I fail to see how not releasing it is better for the world than releasing it would be. Even if it benefits big tech more than a small team.

Am I somehow being protected by a benevolent sama not open-sourcing the model?

bibanez2y ago

I agree, this is so bizarre

ftxbro2y ago

yes i also can't wrap my head around how a ceo of a billion dollar company isn't sincere in his public statements

1 more reply

RosanaAnaDana2y ago

Its just a way to lie that doesn't sound as much like a lie.

greenie_beans2y ago

lmao i had the same reaction. sounds like some bullshit.

stavros2y ago

Reads to me like "we don't know how many people will have hardware powerful enough to run this".

ryanmercer2y ago

Exactly. If you make it open source, great, cool, but only well-funded entities - like massive corporations - can even afford the hardware costs.

1 more reply

xxprogamerxy2y ago

He wants the release of the model to primarily benefit individuals and smaller teams as opposed to large deep-pocketed firms.

rurp2y ago

And he'll do that by... keeping ChatGPT models away from individuals and small teams and in the hands of a few large deep-pocketed firms?

The great thing about open source is that people can try different approaches and gravitate towards what works best for them. Sam knows that of course, he's just being disingenuous because the truth makes him look bad.

TigeriusKirk2y ago

How can you sign a statement that AI presents an extinction risk on par with nuclear weapons and then even consider open sourcing your research?

We don't provide nuclear weapons for everyone to keep in their basement, why would someone who believes AI is an existential risk provide their code?

purplecats2y ago

> Cheaper and faster GPT-4 — This is their top priority. In general, OpenAI’s aim is to drive “the cost of intelligence” down as far as possible and so they will work hard to continue to reduce the cost of the APIs over time.

this certainly aligns with the massive (albeit subjective and anecdotal) degradation in quality i've experienced with ChatGPT GPT-4 over the past few weeks.

hopefully a superior (higher quality) alternative surfaces before its unusable. i'm not considering continuing my subscription at this rate.

brucethemoose22y ago

Anthropic's Claude is said to be very good.

Instruction tuned LLaMA 65B/Falcon 40B are good, especially with an embeddings database.

...But OpenAI has all the name recognition and ease of use now, so it might not even matter if others ambiguously surpass OpenAI models.

legendofbrando2y ago

The problem with Claude is that it is quite literally impossible to get off the waiting list to use it. To OpenAI’s credit they actually ship the product in an accessible way to developers.

2 more replies

timeserious2y ago

Can you ELI5 why an embeddings database helps here? Can pinecone/milvus be used to 'extend memory' of OSS and vendor LLMs without retraining?

2 more replies

poulpy1232y ago

I don't have access to GPT-4 but claude is competiting with gpt-3.5 (chatgpt) and bingAI (whatever they use)

theonlybutlet2y ago

Has anyone compared the ChatGPT GPT-4 performance in the plus subscription to that of the API? Has the API performance deteriorated just as much? It would be strange if it did as I'd assume the models costs are priced in there.

laratiede2y ago

People state this as if it is fact when there is no good way to measure this.

I have had random runs of good days and bad days since starting to use chatGPT.

rafark2y ago

Exactly. It’s become ‘cheap’. This is why we need more good competition.

nonethewiser2y ago

I wonder of it actually is because they’re tuning it to make it less offensive (by their standards). Thats the only explanation I keep seeing repeated.

reaperman2y ago

I would be very surprised. Things that are very, very far from that are also much worse. I'm having difficulty finding the difference between GPT-3.5 and GPT-4 for a lot of my programming tasks lately. It's noticeably degraded.

littlestymaar2y ago

That's a convenient explanation that's been repeated over and over by certain people, but cost is a much more likely explanation: inference for large models is extraordinarily expensive when you have millions of users and their pricing model always seemed way too low to pay for that.

They have likely been subsidizing their users since the launch of their commercial offering (and this is pretty common strategy for SV startups) but they've been so successful that they now need to scale the cost down in order not to burn all their cash too fast.

ChatGTP2y ago

Should "intelligence" have ever costed anything?

It's like saying "air should cost money".

Sharlin2y ago

This just in: smart people should work for free.

1 more reply

londons_explore2y ago

> is limited by GPU availability.

Which is all the more curious, considering OpenAI said this only in January:

> Azure will remain the exclusive cloud provider for all OpenAI workloads across our research, API and products [1]

So... OpenAI is severely GPU constrained, it is hampering their ability to execute, onboard customers to existing products and launch products. Yet they signed an agreement not to just go rent a bunch of GPU's from AWS???

Did someone screw up by not putting a clause in that contract saying "exclusive cloud provider, unless you cannot fulfil our requests"?

[1]: https://openai.com/blog/openai-and-microsoft-extend-partners...

HarHarVeryFunny2y ago

There's an interesting recent video here from Microsoft discussing Azure. The format is a bit cheesy, but lots of interesting information nonetheless.

https://www.youtube.com/watch?v=Rk3nTUfRZmo&t=5s "What runs ChatGPT? Inside Microsoft's AI supercomputer"

The relevance here is that Azure appears to be very well designed to handle the hardware failures that will inevitably happen during a training run taking weeks or months and using many thousands of GPUs... There's a lot more involved than just renting a bunch of Amazon GPUs, and anyways the partnership between OpenAI and Microsoft appears quite strategic, and can handle some build-out delays, especially if they are not Microsoft's fault.

ma2rten2y ago

That is only relevant for serving and not for inference, unless the model is too big to fit on a single host (typically 8 GPUs).

jiggawatts2y ago

One of Azure's unique offerings is very large HPC clusters with GPUs. You can deploy ~1,000 node scale sets with very high speed networking. AWS has many single-server GPU offerings, but nothing quite like what Azure has.

Don't assume Microsoft is bad at everything and that AWS is automatically superior at all product categories...

JeremyNT2y ago

Whether MS is good or not isn't really the point. If they're constrained by GPU availability, being locked in to any specific provider is going to be a problem.

renonce2y ago

Large scale sets are only needed for training. For inference, 8x NVIDIA A100 80G will allow inference for 300b models (GPT-3 is 175b) or 1200b models with 4-bit quantization (quantization impact is negligible for large models), so a single machine is sufficient.

sebzim45002y ago

>So... OpenAI is severely GPU constrained, it is hampering their ability to execute, onboard customers to existing products and launch products. Yet they signed an agreement not to just go rent a bunch of GPU's from AWS???

> Did someone screw up by not putting a clause in that contract saying "exclusive cloud provider, unless you cannot fulfil our requests"?

Maybe MSFT refused to sign such an agreement?

londons_explore2y ago

Perhaps they are cash flow constrained, which in turn means they are GPU constrained, since GPU's are their biggest expense?

dbmnt2y ago

I don't think Amazon offers what Azure does (yet) in terms of HPC or multi-GPU capacity. The blog post doesn't say how long the agreement is for, but the relationship probably makes sense at the moment.

All the cloud providers are building out this type of capacity right now. It's already having a big impact in terms of quarterly spend, which we just saw in the NVDA Q1 results. AWS, Azure, and GCP for sure, but also smaller players like Dell and HPE and even NVidia themselves are trying to get into this market. (Disclaimer: I work at one of these places but don't feel like saying which). I suspect the GPU constraints won't be around too long, at which point we'll find out if OpenAI made a contractual mistake.

ilaksh2y ago

AWS might not really have much extra GPU capacity for them anyway.. also they would cost more.

I think that there aren't a lot of GPUs available and it takes time to add more to the datacenter even when you do get them.

carom2y ago

I heard earlier this year that people were having trouble getting allocations on GCP as well. Probably why Nvidia is at $1T now.

doctor_eval2y ago

Let’s not forget that Microsoft is a big investor in OpenAI. It is important to know on which side your bread is buttered.

chaostheory2y ago

Even if they weren’t exclusive with Azure, aren’t GPU prices reasonable again?

verdverm2y ago

They have to be a available to buy, regardless the price. My understanding is there is a distinct lack of supply

1 more reply

catchnear43212y ago

this has nothing to do with sama clamoring for regulation.

that absolutely isn’t an attempt to slow down all competition.

which isn’t necessary because nobody made such a mistake.

this won’t lead to any hasty or reckless internal decisions in a feckless effort to stay in front.

not that any have already been made.

not that that could lead to disaster.

Imnimo2y ago

>The fact that scaling continues to work has significant implications for the timelines of AGI development. The scaling hypothesis is the idea that we may have most of the pieces in place needed to build AGI and that most of the remaining work will be taking existing methods and scaling them up to larger models and bigger datasets. If the era of scaling was over then we should probably expect AGI to be much further away. The fact the scaling laws continue to hold is strongly suggestive of shorter timelines.

If you understand the shape of the power law scaling curves, shouldn't this scaling hypothesis tell you that AGI is not close, at least via a path of simply scaling up GPT-4? For example, the GPT-4 paper reports a 67% pass-rate on the HumanEval benchmark. In Figure 2, they show a power-law improvement on a medium-difficulty subset as a function of total compute. How many powers of ten are we going to increase GPT-4 compute by just to be able to solve some relatively simple programming problems?

askkk2y ago

I always enjoy reading some of your comments, they ameliorate the hype about LLM and give a critical review. Anyway, I think a stronger model than GPT-4 could improve the way to use tools, so that the model is able to self-improve using tools. For example using all kind of solvers and heuristics to guide the model. I don't know how to estimate that risk just now.

Edited: Don't know if is a good thing to study the weak points of closed LLMs. Even asking LLMs can give hints about possible ways to improve. In my case I am happy I am certainly old and my mind is a lot weaker than before, but even in this case I prefer not to use LLMs for gaining insight because she will someday get a better insight than myself. But the lust of knowledge is a mortal sin.

sanxiyn2y ago

Someone did that calculation and the result is here: https://www.reddit.com/r/slatestarcodex/comments/13u40yf/

100x GPT-4 to 85%.

Imnimo2y ago

And, if I'm reading their calculation right, that's 85% on the medium-difficulty bucket, not even the entire HumanEval benchmark?

(quoting from the GPT-4 paper):

>All but the 15 hardest HumanEval problems were split into 6 difficulty buckets based on the performance of smaller models. The results on the 3rd easiest bucket are shown in Figure 2

1 more reply

hervature2y ago

I never know if I have an inside scoop or an outside scoop. Has Hyena not addressed the scaling of context length [1]? I know this version is barely a month old but it was shared to me by a non-engineer the week it came out. Still, giving interviews where the person takes away that the main limitation is context length and requires a big breakthrough that already happened makes me seriously question whether or not he is qualified to speak on behalf of OpenAI. Maybe he and OpenAI are far beyond this paper and know it does not work but surely it should be addressed?

[1] - https://arxiv.org/pdf/2302.10866.pdf

arugulum2y ago

As someone who is in the field: papers proposing to solve the context length problem come out every month. Almost none of the solutions stick or work as well as a dense or mostly dense model.

You'll know when the problem is solved when model after consistently use a method. Until then (and especially if you're not in the field as a researcher), assume that every paper claiming to tackle context length is simply a nice proposal.

dr_dshiv2y ago

What about Meta’s megabyte? Also nice proposal?

1 more reply

dr_dshiv2y ago

> OpenAI will avoid competing with their customers — other than with ChatGPT. Quite a few developers said they were nervous about building with the OpenAI APIs when OpenAI might end up releasing products that are competitive to them. Sam said that OpenAI would not release more products beyond ChatGPT. He said there was a history of great platform companies having a killer app and that ChatGPT would allow them to make the APIs better by being customers of their own product. The vision for ChatGPT is to be a super smart assistant for work but there will be a lot of other GPT use-cases that OpenAI won’t touch.

Can anyone elaborate on this? This is a big issue for me.

jiggawatts2y ago

Is this guy Aes Sedai?

Technically he can claim that OpenAI will not release competing products while Microsoft plugs AI into everything.

Microsoft just announced at Build 2023 that they'll have OpenAI tech integrated with: Windows, Bing, Outlook, Word, Teams, Visual Studio, Visual Studio Code, Microsoft Fabric, Dynamics, GitHub, Azure DevOps, and Logic Apps. I probably missed a bunch.

Very soon now, everything Microsoft sells will have OpenAI integration.

Unless you're selling a niche product too small for Microsoft to bother with, you're competing directly against OpenAI.

Oh, and to top it off: Microsoft can use GPT 4 all they want, via API access. Third parties have to beg and plead to get rate-limited access. That access can be withdrawn at any time if you're doing something unsafe to OpenAI's profit margins.

"Please Sir Sam, may I have some GPT please?"

"No."

edanm2y ago

> Is this guy Aes Sedai?

Haha having just finished the Wheel of Time, I'm super tickled by this reference.

It doesn't seem to be too common, only two uses of it on HN in the past year (at least, found by searching for the phrase "Aes Sedai")

1 more reply

ilaksh2y ago

I think the tricky part for me is that "work" is extremely broad and now that ChatGPT has plugins, it can kind of do anything. Heh.

jeffybefffy5192y ago

Title needs an update as Sama is also the name of the company which helped classify training data for ChatGPT: https://time.com/6247678/openai-chatgpt-kenya-workers/

pindab0ter2y ago

I agree. Not using the actual name is also gate keeping for anyone not familiar enough. The fact that “sama” isn’t even capitalised adds to this.

Sharlin2y ago

Never mind the fact that it’s against HN guidelines to modify original titles for no reason. Changing Sam Altman to sama is just ridiculous.

asnyder2y ago

His statements on open sourcing in this interview/write-up is somewhat in conflict with his recent statement made last week in Munich https://youtu.be/uaQZIK9gvNo?t=1170, where he explicitly said the Frontier of GPT won't be open sourced due to what they perceive as safety reasons, https://youtu.be/uaQZIK9gvNo?t=1170 (19:30 - 22:00).

ilaksh2y ago

He was talking about open sourcing GPT-3. That is not the frontier.

The frontier is the multimodal versions of GPT-4 which he just said wasn't even going to public release until next year. Or whatever they are on now which they are carefully not calling GPT-5.

muskmusk2y ago

I don't see the conflict. They see current models as mostly harmless, but what comes next is dangerous.

It sounds a little too much sci-fi for me, but I guess he knows better.

wintogreen742y ago

plus this conveniently pairs with "we don't need to regulate current models, but future models... oh boy do those need to be regulated!"

ftxbro2y ago

it's legal to make contradictory statements that's one of the job of a ceo and it's why they aren't usually overly literal types you know the kind i'm talking about

sovietmudkipz2y ago

I’m hoping GPT will remove the information cutoff date. I write plenty of terraform/AWS and it’s a bit of a pain that the latest API isn’t accessible by GPT yet.

There’s been quite a bit happening in the programming space since sept 2021.

I use GPT to keep things high level and then do my normal research methodology for implementation details.

furyofantares2y ago

It's not like an arbitrary imposition, that's the data it was trained on and it's expensive to train. I hope they find a way to continually train in new information too but it's not like they can just remove the cutoff date.

kmod2y ago

Not disagreeing, but a fascinating thing they did (as a one-off fine-tune?) was teach ChatGPT about the openai python client library, including the features that were added after the cutoff date.

mustacheemperor2y ago

I enjoy using GPT4 as a co-programmer, and funny enough it is very challenging to get advice on Microsoft's own .NET MAUI because that framework was in prerelease at the time the model was trained.

My understanding is right now they essentially need to train a new model on a new updated corpus to fix this, but maybe some other techniques could be devised...or they'll train something more up to date.

ilaksh2y ago

You might actually get pretty far if you just went through the Microsoft docs and created a bunch of really concise examples and fed that as the start of the prompt. Use like 6-7kb for that and then the question at the end.

1 more reply

buildbot2y ago

Context drift! https://qntm.org/mmacevedo

r3trohack3r2y ago

Injecting the context yourself can help a lot. I frequently copy in a bunch of example code at the beginning of the conversation to help prime ChatGPT on APIs it knows nothing about.

For smaller projects that will fit, I've taken to: `xclip *` and then pasting the entire collection of files into ChatGPT before describing what I want to do.

adlpz2y ago

Keep in mind that GPT-4 has a max context size of ~8000 tokens, if I recall correctly. That means that in any given ChatGPT session the bot only remembers roughly the last ~6k words, as a trailing window. It'll forget the stuff at the beginning fast.

ilaksh2y ago

As stated your request is entirely impossible. They cannot simply "remove the cut-off date". It takes months and huge amounts of hardware to train. Then they do the reinforcement adjustments on top of it while researching how to train the next batch.

twobitshifter2y ago

Left the best part until the end. Scaling models larger is still paying off for openai. It’s not AGI yet, but how much bigger will a model need to get to max out?

>The scaling hypothesis is the idea that we may have most of the pieces in place needed to build AGI and that most of the remaining work will be taking existing methods and scaling them up to larger models and bigger datasets. If the era of scaling was over then we should probably expect AGI to be much further away. The fact the scaling laws continue to hold is strongly suggestive of shorter timelines.

tikwidd2y ago

After training my physics simulator on thousands of hours of video footage of trees moving in the wind, arborists tell me the trees are much more realistic (they are getting worried that I might put them out of business). But the physicists are still not satisfied. How many more videos do I need to generate the laws of motion?

bigyikes2y ago

Throw in the videos from the rest of the internet, and you might actually do it…

1 more reply

ilaksh2y ago

Why don't people ever explain what they mean by AGI? It means different things to different people.

jasmer2y ago

'It's not AGI yet' - the implication is insufferable. It's a language model that is incapable of any kind of reasoning, the talk of 'AGI' is a glib utopianism, a very heavy kind of koolaid. If we were to have referred to this tech as anything other than 'intelligence' - for example, if we chose 'adaptive algorithms' or 'weighted node storage' etc. we'd likely have a completely different popular mental model for it.

There will be no 'AI model' that is 'AGI', rather, a large swath of different technologies and models, operating together, will give the appearance of 'AGI' via some kind of interface.

It will not appear as an 'automaton' (aka single processing unit) and it certain will not be an 'aha moment'.

In 10 years, you'll be able to ask various agents, of different kinds, which will use varying kinds of AI to interpret speech, to infer context, which will interface with various AI APIs, in many ways it'll resemble what we have today but with more nuance.

The net appearance will evolve over time to appear a bit like 'AGI' but there won't be an 'entity' to identify as 'it'.

SanderNL2y ago

> incapable of any kind of reasoning

If this were true the debate would be a hell of lot easier. Unfortunately, it is not.

2 more replies

yesimahuman2y ago

The bit about plugins not having PMF is interesting and possibly flawed. I, like many others, got access to plugins but not the browsing or code interpreter plugins which feel like the bedrock plugins that make the whole offering useful. I think there's also just education that has to happen to teach users how to effectively use other plugins, and the UX isn't really there to help new users figure out what to even do with plugins.

CSMastermind2y ago

Have you found plugins to be useful?

For what it's worth I've found the model actually performs significantly worse at most tasks when given access to browsing, in part because it relies on that instead of its own in built knowledge.

I haven't found a good way to have it only access the web for specific parts of its response.

teetertater2y ago

The only plugin I found useful was the diagramming one, forgot what it's called. But you can quickly make code (or other) flowcharts etc. And browsing in rare cases.

1 more reply

verdverm2y ago

Most of the plugins are garbage and for those that aren't, most seem like they would be better as a chat like experience in the original app than the OpenAI app

furyofantares2y ago

PMF meaning "product market fit"? I had to look it up, curious if I found the right thing or not.

wsgeorge2y ago

Had the same reaction. I was just about Googling it when it hit. Funny how the brain can work out a random acronym given context.

typest2y ago

Yes, PMF = "product market fit".

1 more reply

gistbug2y ago

Yea, seems weird to allow people to use plugins, but not all of them. Then have the gall to say that no one is using plugins, yea because half of them don't have any context outside of America.

lumost2y ago

I tried the plugins - they honestly didn't seem to work very well. GPT-4 wasn't sure when it could use a plugin, or when it should talk about how it would do something. I wasn't able to get the plugins to activate most of the time.

m3kw92y ago

If you look at their API limits, no serious company can use this to scale up beyond say 10k users. 3500 Reqs per min for gpt3.5 turbo. They have a long way to go to make it usable for the rest of the 95%

thorax2y ago

I've had to move to using Azure OpenAI service during business hours for the API-- much more stable unless the prompts stray into something a little odd and their API censorship blocks the calls.

legendofbrando2y ago

I’ve been working directly with OpenAI’s access, are there any other advantages to doing this through Azure?

nonfamous2y ago

You can opt out of the safety filtering, btw.

cryptoz2y ago

Great content and great answers except for open source question. Sam is saying that he doesn’t think anyone would be able to run the code at scale so they didn’t bother? Seems like a nonsense answer, maybe I’m misunderstanding. The ability for individuals or businesses to effectively run and host the code shouldn’t have an impact on the ability to open source.

vagab0nd2y ago

https://archive.ph/rcbem

The page now says "This content has been removed at the request of OpenAI." I wonder why they did it.

bodecker2y ago

Related thread "OpenAI's plans according to Sam Altman removed at OpenAI's request": https://news.ycombinator.com/item?id=36177895

udev40962y ago

If they open source it, everyone would know that they used fuck ton of pirated content to train their models

andybak2y ago

As far as I'm aware training does not currently constitute "piracy".

It's fine to advocate for a redefinition but be explicit about it.

gnomewascool2y ago

I think the point here is about the procurement of the training data, in violation of copyright laws ("piracy"), rather than that the training itself is piracy.

The suspicion[0] is that OpenAI trained their models on a large text dump including libgen (in the so-called "books2").

If a person downloads a book from Library Genesis, they're a pirate; if OpenAI does it, so are they.

[0] https://twitter.com/theshawwn/status/1320282152689336320

sashank_15092y ago

Really great news to give at cheaper and faster GPT4. As a GPT+ subscriber, the most annoying thing is the 25 message limit every 3 hours, I really want that removed.

A bit sad to hear that the multimodal model will only come next year, was hoping to get it this year

100k to 1 Million context length, sounds phenomenal especially if it comes to GPT4. I've used Claudes 100k context length and I found it so useful that when I have large documents I just default to Claude now

alchemist1e92y ago

Did you have any tips on how you got access to Claude? I do the request access but never get any email or any contact.

floydnoel2y ago

I use Poe and got access to Claude 100k as soon as it was released. I think it's a better deal than paying OpenAI for sure, since you have access to GPT-4, Claude+, and others. They also have community bots, etc.

sashank_15092y ago

I'm a graduate student doing AI research in a US University. And I applied pretty early (Last year December I think) , those might be two factors that got me access to Claude.

I think getting access to Claude through slack is much easier and I recently got it by just downloading it as a Slack App

teetertater2y ago

Claude has a free slack client that I briefly was able to access by creating a new slack workspace and adding it there. But as of yesterday it wasn't working for me

refulgentis2y ago

Poe, I’m in the same boat btw

jamesfisher2y ago

https://archive.ph/uwaCp (original page is now a 404)

cwkoss2y ago

I love the tongue-in-cheek paradox myth that the Bitcoin whitepaper was written by a future god-AI to increase demand for GPUs (and thus boost supply) so we are able to assemble the future god-AI.

tivert2y ago

> I love the tongue-in-cheek paradox myth that the Bitcoin whitepaper was written by a future god-AI to increase demand for GPUs (and thus boost supply) so we are able to assemble the future god-AI.

I know it's a joke, but the hole is the god-AI couldn't have been that smart, since cryptocurrency-mining quickly switched to ASICs, which muting the demand increase for GPUs.

modernpink2y ago

Well, humans switched from using using brains to store all their memories once they could dump data onto external media via writing. Much like how crypto switching to ASICs frees up GPU capacity for AGI, writing freed the brain to develop higher GI.

anamexis2y ago

But not before ramping up development and production of GPUs.

2 more replies

cwkoss2y ago

I think there are some derivative coins that extended the viability of GPU mining, but I've been out of the game for a decade.

teleforce2y ago

Nah, many of the technical people who knew Hal Finney personally were claiming that the Caltech alumnus did write the original Bitcoin whitepaper not any random guys on the Internet [1].

The talk/conversation appeared to me not as OpenAI future plan but more on the CEO lamenting on how severely limited the company by the GPU or the lack thereof. It just a cheeky ploy by a CEO of an AI company that currently at 30B USD valuation to get more money in order to buy several [fill in the blank] of these most advanced GPU systems [2].

[1]Nakamoto's Neighbor: My Hunt For Bitcoin's Creator Led To A Paralyzed Crypto Genius:

https://www.forbes.com/sites/andygreenberg/2014/03/25/satosh...

[2]Nvidia DGX GH200: 100 Terabyte GPU Memory System:

https://news.ycombinator.com/item?id=36133226

cwkoss2y ago

Fun to imagine a time machine being built but the only thing it can transmit backward in time is PDFs

babyshake2y ago

I wouldn't install Adobe Acrobat even if it gave me access to break the laws of spacetime.

barbazoo2y ago

I'd read that book!

majormajor2y ago

Hyperion/The Fall of Hyperion by Dan Simmons has something similar.

baq2y ago

Watch Tenet.

1 more reply

renonce2y ago

Most Bitcoin miners have been using ASIC chips that compute nothing but Bitcoin format SHA256 since long ago so it’s not increasing demand for GPUs. Ethereum is but it has already switched to PoS.

mkoubaa2y ago

Then that god AI must have also pulled some strings for early video games

thelittleone2y ago

Conceptually this is paradoxical because of the notion that time is linear. Which it is to the best of our current understanding.

atemerev2y ago

All AI companies (OpenAI included) are now working full tilt on making AIs improve themselves (writing their own code, inventing new pipelines etc). I don't know why choose anything else to work on. This is a prime directive, that will bring the greatest payoff.

huijzer2y ago

I disagree since GPUs are a major constraint currently and that skilled specialists outperform GPT-4 almost always as long as they stay in their domain.

Will they use copilot(s) to improve the models? Yes, but they have been doing that since 2021 already (the release year of GitHub Copilot).

ren_engineer2y ago

if this was currently possible wouldn't it lead to sentient/superhuman AI rapidly?

>tell AI to make itself more efficient by finding performance improvements in human written code

>that newly available processing power can now be used to find more ways to improve itself

>flywheel effect of AI improving itself as it gets smarter and smarter

eventually you'd turn it loose on improving the actual hardware it runs on. I think the question now is really how far transformers can be taken and if they are really the path to "real" AI.

ilaksh2y ago

Within a couple of years of improvement processes like you suggest will actually be really dangerous and stupid.

Also don't confuse all other types of human/animal characteristics like sentience with intelligence. They are different things. Things like sentience, subjective stream of experience, or other aspects of being alive don't just accidentally fall out of larger training datasets.

And we should be glad. The models are going to be orders of magnitude faster (and perhaps X times higher IQ) than humans within a few years. It is incredibly foolish to try to make something like that into a living creature (or emulation of living).

2 more replies

m3kw92y ago

I think they are at least 1-2 new big research breakthroughs(on the level of Attention) away from having this.

ShamelessC2y ago

Well that is demonstrably untrue.

simse2y ago

> A stateful API

This would be huge for many applications, as "chatting" with GPT-4 gets really, really expensive very quickly. I've played with API with friends, and winced as I watched my usage hit several dollars for just a bit of fun.

naillo2y ago

> Plugins “don’t have PMF”

Probability mass functions? Anyone know what this means in this context?

simonbutt2y ago

Product market fit

bobbyi2y ago

The roadmap here is completely focused on ChatGPT and GPT-4. I wonder what portion of their resources is still going to other areas (DALL-E, audio/ video processing, etc.)

cmelbye2y ago

Maybe some of those things that are currently separate projects will eventually converge with a multimodal model.

artichokeheart2y ago

Off-topic note Humanloop might want to redesign their logo. It's been the Australian Broadcasting Corporation logo since 1963. Maybe pick a different Lissajous curve.

BeenAGoodUser2y ago

Nice to see they are working on reducing the pricing. GPT-4 is just too expensive right now imo. A long conversation would quickly end up costing tens of dollars if not more, so less expensive model costs + stateful API is urgently needed. I think even OpenAI will actually gain a lot by reducing the pricing, right now I wouldn't be surprised if many uses of GPT-4 weren't viable just because of the costs.

Terretta2y ago

This is off by probably x10 or more.

Dozens of people using it daily for coding and conversations and review in a month might be a couple hundred bucks. All day convo, constantly, as fast as it can respond, might add up to $5.

Not sure what kind of convo you're having that you could hit $10 unless you're parallelizing with something like the "guidance" tool or langchain.

jiggawatts2y ago

The version of GPT 4 with 32K token context length is the enabler for a huge range of "killer apps", but is even more expensive than the 8K version.

And yes, parallelism and loops are also key enablers for advanced use-cases.

For example, I have a lot of legacy code that needs uplifting. I'd love to be able to run different prompts over reams of code in parallel, iterating the prompts, etc...

The point of these things is that they're like humans you can clone at will.

The ability to point thousands of these things at a code base could be mindblowing.

refulgentis2y ago

Absolutely not. Dinner just got here, but tl;dr gpt4 is 0.03 per 750 words in 0.06 per 750 words out. People except the history to be included as well

dbmnt2y ago

Would someone please explain like I'm five which components of LLM's like ChatGPT are still closed source? What are the specific technologies that OpenAI is holding on to?

I know a lot of LLM stuff has either been released or leaked out, but don't have enough expertise in this area to understand the competitive advantages or breakthroughs OpenAI has obtained.

pnt122y ago

As far as I understand, it's mostly the weights. If you only have the models, you're still gonna need to get a massive amount of training material, huge costs for training it and fine tuning the hyper params until it works.

hoschicz2y ago

- they are working on a stateful API - they are working on a cheaper version of GPT-4

Most probably this is driven by their use of it in ChatGPT, which is on fire from PMF. Clearly they're experimenting with the cheaper GPT-4 in ChatGPT right now as it's fairly turbo now, as discussed earlier today.

heyzk2y ago

Great writeup, this helps us understand where to spend our time vs what OpenAI's progress will solve.

flakiness2y ago

If they open up fine-tuning API for their latest models, I wonder how the enthusiasm around the open source model is impacted. One of the advantages of the open source models is the ability to be fine-tuned. Are other benefits enough to keep the momentum going?

m3kw92y ago

You better have deep pockets, have you check the prices and then the rates for using the tuned models? They sure 10x to 100x more expensive then nontuned models

braindead_in2y ago

I am looking forward to faster GPT-4, larger context windows and finetuning APIs. The combination of these can solve most of problems that I currently face with my LLM apps. It looks like a good roadmap for 2023.

pavelstoev2y ago

This content has been removed at the request of OpenAI.

Just when I went back to the post for some quote material...

webmaven2y ago

"This content has been removed at the request of OpenAI."

throwaway17772y ago

Surprised no mention of them developing their own chips.

vb-84482y ago

7. find a sustainable business model and make some money

ftxbro2y ago

why should I believe what someone says their plans are

boringuser22y ago

>Dedicated capacity offering is limited by GPU availability. OpenAI also offers dedicated capacity, which provides customers with a private copy of the model. To access this service, customers must be willing to commit to a $100k spend upfront.

How many shell corporations are intelligence agencies seeding right now?

cwkoss2y ago

Last night I was musing how many different countries' intelligence agencies have moles working at OpenAI currently. Gotta be at least 6, maybe as high as two dozen?

CSMastermind2y ago

US, France, Israel ... then who? Maybe another five eyes country like the UK? Possibly China? I'm pretty skeptical Russia would be able to get someone in there but maybe.

2 more replies

m3kw92y ago

I bet the NSA has dossier on every employee there as well

1 more reply

boringuser22y ago

Agent Lee Chen Huwang, reporting for duty.

ftxbro2y ago

I had been putting theories in comments but they kept getting flagged or banned or downvoted to oblivion, but maybe its time has come. I'll keep it tame. If you are curious you can google connections of OpenAI board of directors, Will Hurd, In-Q-Tel trustees, Allen and Company, etc. There is more but whatever. The conspiracy theory is that 'the govt stepped in' during the six month pause after gpt-4 was trained and before it was released.

refulgentis2y ago

It probably keeps getting flagged because it’s ahistorical, source: OpenAI engineers, and #2 somewhat obviously so. You heard of RLHF?

1 more reply

MacsHeadroom2y ago

Private instance means a dedicated endpoint fully managed by OpenAI. You do not get model access or anything a regular API user doesn't already get, except your API url will be something like customer123.openai.com/api instead of api.openai.com/api

m3kw92y ago

They are not gonna give the weights for sure but it still will be inferencable, I’m not sure how but it’s be self destructive if they did

boringuser22y ago

Exactly, with a private model you could easily extract the weights.

ryanmercer2y ago

Why bother with shell corps when they already back companies in the clear: look at In-Q-Tel.

dontupvoteme2y ago

Having been recently taken aboard by the mothership I expect they'll start trying to tune out anything related to programming to push people towards co-pilot X..

It's pretty hilarious and annoying to see bing start to write code only to self censor itself after a few lines (deleting what was there! no wonder these guys love websockets and dynamic histories)

Whoops!

_boffin_2y ago

Wait… what? Can you elaborate.

mistymountains2y ago

He’s speculating that Microsoft is nerfing OpenAI / chatGPT to funnel narrow capabilities to silos like CoPilot.

1 more reply

dontupvoteme2y ago

https://en.wikipedia.org/wiki/Embrace,_extend,_and_extinguis...

1 more reply

jasmer2y ago

It's absurd that people are still thinking that a language model which a bunch of tokens are indexed is some kind of 'AGI'.

j / k navigate · click thread line to collapse

258 comments

sharkjacobs2y ago

Am I reading this right? "We're not open sourcing GPT-3 because we don't think it would be useful to anyone else"

TapWaterBandit2y ago

When you stop listening to what Sam Altman says and just focus on what he does, you can see the guy is a bit of a snake. Greedy power-hungry man imho.

rafark2y ago

I’ve tried very hard to like him because like it or not, ChatGPT has revolutionized the AI industry but he’s so hypocritical I just can’t stand him.

1 more reply

seattle_spring2y ago

I know it doesn't seem related on the surface, but I've found all startup CEOs who ban remote work to be snake-like and dishonest. Interesting coincidence at least.

2 more replies

courseofaction2y ago

I can't judge the individual, but his words do not align with the company's actions in the slightest.

>> 4. OpenAI will avoid competing with their customers — other than with ChatGPT

On this I would not bet a dime.

1 more reply

neximo642y ago

In your world how would you consider Sam Altman having no equity in OpenAI? And everyone finding out after it had a viral hit

2 more replies

mellosouls2y ago

Sam Altman is responsible for leading the team that have revolutionised AI in its position within society.

There is plenty to criticise OpenAI for but what he and they have achieved is extraordinary, and there is no need for that sort of toxic personal attack.

4 more replies

bane2y ago

Most of the companies I work with are actively putting in place policies to prevent employees from using OpenAI's service because nobody wants to send their proprietary IP to them.

Except what's happening is everybody is looking at buying the box from Nvidia, and sticking a large actually open model on it and simply ignoring OpenAI.

f6v2y ago

> Almost all of these companies have the technical ability, desire, and means to self-host for their employee community.

1 more reply

candiddevmike2y ago

OpenAI: Regulations must be passed to protect our moat

Also OpenAI: Meta is pissing in our moat, let's drop a hint about open sourcing our shit too!

razcleOP2y ago

roganartu2y ago

choppaface2y ago

Here is how hard it is to serve and use LLMs: https://github.com/ggerganov/llama.cpp

1 more reply

solarkraft2y ago

You're saying the same thing.

"I'm not sharing my chocolate with you because you probably wouldn't like it"

antupis2y ago

If it goes same way as other open sourced models it takes about 5 days that someone will get it running at m1.

rushingcreek2y ago

If he says he's inclined to open-source GPT-3, I don't see any good arguments not in favor of giving startups the choice of how they can run inference.

paxys2y ago

More like – it won't be useful to small-time developers (since they won't have the capability to host and run it themselves) and so all the benefits will be reaped by AWS and other large players.

sheepscreek2y ago

That said, I can imagine a GPTQ/4-bit quantized model to be smaller and easier to run on somewhat commodity clusters?

Or it could run with GGML/llama.cpp on a cloud instance with a TB of RAM?

After seeing what people were able to do with LLaMA, I am positive that the community will find a way to run it - albeit with some loss in performance.

It would be truly amazing if they used their computing to develop quantized models as well.

renonce2y ago

lostmsu2y ago

If you really need, 170B parameter model can infer a few tokens per minute on commodity hardware.

sebzim45002y ago

It is weird, but GPT-3 is worse than much smaller LLaMA models so I doubt it would see much use anyway.

killjoywashere2y ago

How do you measure this? Pointers to papers would be very helpful

1 more reply

flangola72y ago

Are you referring to DaVinci or ChatGPT-3.5

1 more reply

braindead_in2y ago

It is a shame that Sama does not believe in Open Source. The community can solve their GPU bottleneck issue by making it run on CPUs and edge devices in a matter of days.

barbariangrunge2y ago

If small organizations and teams can’t use it, then open sourcing it mostly just benefits big tech

That’s not ideal

How does open source licensing work with respect to trained ai models anyway? Is something like the mit license even that valuable here? Or is it?

sokoloff2y ago

Am I somehow being protected by a benevolent sama not open-sourcing the model?

bibanez2y ago

I agree, this is so bizarre

ftxbro2y ago

yes i also can't wrap my head around how a ceo of a billion dollar company isn't sincere in his public statements

1 more reply

RosanaAnaDana2y ago

Its just a way to lie that doesn't sound as much like a lie.

greenie_beans2y ago

lmao i had the same reaction. sounds like some bullshit.

stavros2y ago

Reads to me like "we don't know how many people will have hardware powerful enough to run this".

ryanmercer2y ago

Exactly. If you make it open source, great, cool, but only well-funded entities - like massive corporations - can even afford the hardware costs.

1 more reply

xxprogamerxy2y ago

He wants the release of the model to primarily benefit individuals and smaller teams as opposed to large deep-pocketed firms.

rurp2y ago

And he'll do that by... keeping ChatGPT models away from individuals and small teams and in the hands of a few large deep-pocketed firms?

TigeriusKirk2y ago

How can you sign a statement that AI presents an extinction risk on par with nuclear weapons and then even consider open sourcing your research?

We don't provide nuclear weapons for everyone to keep in their basement, why would someone who believes AI is an existential risk provide their code?

purplecats2y ago

this certainly aligns with the massive (albeit subjective and anecdotal) degradation in quality i've experienced with ChatGPT GPT-4 over the past few weeks.

hopefully a superior (higher quality) alternative surfaces before its unusable. i'm not considering continuing my subscription at this rate.

brucethemoose22y ago

Anthropic's Claude is said to be very good.

Instruction tuned LLaMA 65B/Falcon 40B are good, especially with an embeddings database.

...But OpenAI has all the name recognition and ease of use now, so it might not even matter if others ambiguously surpass OpenAI models.

legendofbrando2y ago

The problem with Claude is that it is quite literally impossible to get off the waiting list to use it. To OpenAI’s credit they actually ship the product in an accessible way to developers.

2 more replies

timeserious2y ago

Can you ELI5 why an embeddings database helps here? Can pinecone/milvus be used to 'extend memory' of OSS and vendor LLMs without retraining?

2 more replies

poulpy1232y ago

I don't have access to GPT-4 but claude is competiting with gpt-3.5 (chatgpt) and bingAI (whatever they use)

theonlybutlet2y ago

laratiede2y ago

People state this as if it is fact when there is no good way to measure this.

I have had random runs of good days and bad days since starting to use chatGPT.

rafark2y ago

Exactly. It’s become ‘cheap’. This is why we need more good competition.

nonethewiser2y ago

I wonder of it actually is because they’re tuning it to make it less offensive (by their standards). Thats the only explanation I keep seeing repeated.

reaperman2y ago

littlestymaar2y ago

ChatGTP2y ago

Should "intelligence" have ever costed anything?

It's like saying "air should cost money".

Sharlin2y ago

This just in: smart people should work for free.

1 more reply

londons_explore2y ago

> is limited by GPU availability.

Which is all the more curious, considering OpenAI said this only in January:

> Azure will remain the exclusive cloud provider for all OpenAI workloads across our research, API and products [1]

Did someone screw up by not putting a clause in that contract saying "exclusive cloud provider, unless you cannot fulfil our requests"?

[1]: https://openai.com/blog/openai-and-microsoft-extend-partners...

HarHarVeryFunny2y ago

There's an interesting recent video here from Microsoft discussing Azure. The format is a bit cheesy, but lots of interesting information nonetheless.

https://www.youtube.com/watch?v=Rk3nTUfRZmo&t=5s "What runs ChatGPT? Inside Microsoft's AI supercomputer"

ma2rten2y ago

That is only relevant for serving and not for inference, unless the model is too big to fit on a single host (typically 8 GPUs).

jiggawatts2y ago

Don't assume Microsoft is bad at everything and that AWS is automatically superior at all product categories...

JeremyNT2y ago

Whether MS is good or not isn't really the point. If they're constrained by GPU availability, being locked in to any specific provider is going to be a problem.

renonce2y ago

sebzim45002y ago

> Did someone screw up by not putting a clause in that contract saying "exclusive cloud provider, unless you cannot fulfil our requests"?

Maybe MSFT refused to sign such an agreement?

londons_explore2y ago

Perhaps they are cash flow constrained, which in turn means they are GPU constrained, since GPU's are their biggest expense?

dbmnt2y ago

ilaksh2y ago

AWS might not really have much extra GPU capacity for them anyway.. also they would cost more.

I think that there aren't a lot of GPUs available and it takes time to add more to the datacenter even when you do get them.

carom2y ago

I heard earlier this year that people were having trouble getting allocations on GCP as well. Probably why Nvidia is at $1T now.

doctor_eval2y ago

Let’s not forget that Microsoft is a big investor in OpenAI. It is important to know on which side your bread is buttered.

chaostheory2y ago

Even if they weren’t exclusive with Azure, aren’t GPU prices reasonable again?

verdverm2y ago

They have to be a available to buy, regardless the price. My understanding is there is a distinct lack of supply

1 more reply

catchnear43212y ago

this has nothing to do with sama clamoring for regulation.

that absolutely isn’t an attempt to slow down all competition.

which isn’t necessary because nobody made such a mistake.

this won’t lead to any hasty or reckless internal decisions in a feckless effort to stay in front.

not that any have already been made.

not that that could lead to disaster.

Imnimo2y ago

askkk2y ago

sanxiyn2y ago

Someone did that calculation and the result is here: https://www.reddit.com/r/slatestarcodex/comments/13u40yf/

100x GPT-4 to 85%.

Imnimo2y ago

And, if I'm reading their calculation right, that's 85% on the medium-difficulty bucket, not even the entire HumanEval benchmark?

(quoting from the GPT-4 paper):

>All but the 15 hardest HumanEval problems were split into 6 difficulty buckets based on the performance of smaller models. The results on the 3rd easiest bucket are shown in Figure 2

1 more reply

hervature2y ago

[1] - https://arxiv.org/pdf/2302.10866.pdf

arugulum2y ago

As someone who is in the field: papers proposing to solve the context length problem come out every month. Almost none of the solutions stick or work as well as a dense or mostly dense model.

dr_dshiv2y ago

What about Meta’s megabyte? Also nice proposal?

1 more reply

dr_dshiv2y ago

Can anyone elaborate on this? This is a big issue for me.

jiggawatts2y ago

Is this guy Aes Sedai?

Technically he can claim that OpenAI will not release competing products while Microsoft plugs AI into everything.

Very soon now, everything Microsoft sells will have OpenAI integration.

Unless you're selling a niche product too small for Microsoft to bother with, you're competing directly against OpenAI.

"Please Sir Sam, may I have some GPT please?"

"No."

edanm2y ago

> Is this guy Aes Sedai?

Haha having just finished the Wheel of Time, I'm super tickled by this reference.

It doesn't seem to be too common, only two uses of it on HN in the past year (at least, found by searching for the phrase "Aes Sedai")

1 more reply

ilaksh2y ago

I think the tricky part for me is that "work" is extremely broad and now that ChatGPT has plugins, it can kind of do anything. Heh.

jeffybefffy5192y ago

Title needs an update as Sama is also the name of the company which helped classify training data for ChatGPT: https://time.com/6247678/openai-chatgpt-kenya-workers/

pindab0ter2y ago

I agree. Not using the actual name is also gate keeping for anyone not familiar enough. The fact that “sama” isn’t even capitalised adds to this.

Sharlin2y ago

Never mind the fact that it’s against HN guidelines to modify original titles for no reason. Changing Sam Altman to sama is just ridiculous.

asnyder2y ago

ilaksh2y ago

He was talking about open sourcing GPT-3. That is not the frontier.

The frontier is the multimodal versions of GPT-4 which he just said wasn't even going to public release until next year. Or whatever they are on now which they are carefully not calling GPT-5.

muskmusk2y ago

I don't see the conflict. They see current models as mostly harmless, but what comes next is dangerous.

It sounds a little too much sci-fi for me, but I guess he knows better.

wintogreen742y ago

plus this conveniently pairs with "we don't need to regulate current models, but future models... oh boy do those need to be regulated!"

ftxbro2y ago

it's legal to make contradictory statements that's one of the job of a ceo and it's why they aren't usually overly literal types you know the kind i'm talking about

sovietmudkipz2y ago

I’m hoping GPT will remove the information cutoff date. I write plenty of terraform/AWS and it’s a bit of a pain that the latest API isn’t accessible by GPT yet.

There’s been quite a bit happening in the programming space since sept 2021.

I use GPT to keep things high level and then do my normal research methodology for implementation details.

furyofantares2y ago

kmod2y ago

Not disagreeing, but a fascinating thing they did (as a one-off fine-tune?) was teach ChatGPT about the openai python client library, including the features that were added after the cutoff date.

mustacheemperor2y ago

I enjoy using GPT4 as a co-programmer, and funny enough it is very challenging to get advice on Microsoft's own .NET MAUI because that framework was in prerelease at the time the model was trained.

ilaksh2y ago

1 more reply

buildbot2y ago

Context drift! https://qntm.org/mmacevedo

r3trohack3r2y ago

Injecting the context yourself can help a lot. I frequently copy in a bunch of example code at the beginning of the conversation to help prime ChatGPT on APIs it knows nothing about.

For smaller projects that will fit, I've taken to: `xclip *` and then pasting the entire collection of files into ChatGPT before describing what I want to do.

adlpz2y ago

ilaksh2y ago

twobitshifter2y ago

Left the best part until the end. Scaling models larger is still paying off for openai. It’s not AGI yet, but how much bigger will a model need to get to max out?

tikwidd2y ago

bigyikes2y ago

Throw in the videos from the rest of the internet, and you might actually do it…

1 more reply

ilaksh2y ago

Why don't people ever explain what they mean by AGI? It means different things to different people.

jasmer2y ago

There will be no 'AI model' that is 'AGI', rather, a large swath of different technologies and models, operating together, will give the appearance of 'AGI' via some kind of interface.

It will not appear as an 'automaton' (aka single processing unit) and it certain will not be an 'aha moment'.

The net appearance will evolve over time to appear a bit like 'AGI' but there won't be an 'entity' to identify as 'it'.

SanderNL2y ago

> incapable of any kind of reasoning

If this were true the debate would be a hell of lot easier. Unfortunately, it is not.

2 more replies

yesimahuman2y ago

CSMastermind2y ago

Have you found plugins to be useful?

For what it's worth I've found the model actually performs significantly worse at most tasks when given access to browsing, in part because it relies on that instead of its own in built knowledge.

I haven't found a good way to have it only access the web for specific parts of its response.

teetertater2y ago

The only plugin I found useful was the diagramming one, forgot what it's called. But you can quickly make code (or other) flowcharts etc. And browsing in rare cases.

1 more reply

verdverm2y ago

Most of the plugins are garbage and for those that aren't, most seem like they would be better as a chat like experience in the original app than the OpenAI app

furyofantares2y ago

PMF meaning "product market fit"? I had to look it up, curious if I found the right thing or not.

wsgeorge2y ago

Had the same reaction. I was just about Googling it when it hit. Funny how the brain can work out a random acronym given context.

typest2y ago

Yes, PMF = "product market fit".

1 more reply

gistbug2y ago

Yea, seems weird to allow people to use plugins, but not all of them. Then have the gall to say that no one is using plugins, yea because half of them don't have any context outside of America.

lumost2y ago

m3kw92y ago

thorax2y ago

I've had to move to using Azure OpenAI service during business hours for the API-- much more stable unless the prompts stray into something a little odd and their API censorship blocks the calls.

legendofbrando2y ago

I’ve been working directly with OpenAI’s access, are there any other advantages to doing this through Azure?

nonfamous2y ago

You can opt out of the safety filtering, btw.

cryptoz2y ago

vagab0nd2y ago

https://archive.ph/rcbem

The page now says "This content has been removed at the request of OpenAI." I wonder why they did it.

bodecker2y ago

Related thread "OpenAI's plans according to Sam Altman removed at OpenAI's request": https://news.ycombinator.com/item?id=36177895

udev40962y ago

If they open source it, everyone would know that they used fuck ton of pirated content to train their models

andybak2y ago

As far as I'm aware training does not currently constitute "piracy".

It's fine to advocate for a redefinition but be explicit about it.

gnomewascool2y ago

I think the point here is about the procurement of the training data, in violation of copyright laws ("piracy"), rather than that the training itself is piracy.

The suspicion[0] is that OpenAI trained their models on a large text dump including libgen (in the so-called "books2").

If a person downloads a book from Library Genesis, they're a pirate; if OpenAI does it, so are they.

[0] https://twitter.com/theshawwn/status/1320282152689336320

sashank_15092y ago

Really great news to give at cheaper and faster GPT4. As a GPT+ subscriber, the most annoying thing is the 25 message limit every 3 hours, I really want that removed.

A bit sad to hear that the multimodal model will only come next year, was hoping to get it this year

alchemist1e92y ago

Did you have any tips on how you got access to Claude? I do the request access but never get any email or any contact.

floydnoel2y ago

sashank_15092y ago

I'm a graduate student doing AI research in a US University. And I applied pretty early (Last year December I think) , those might be two factors that got me access to Claude.

I think getting access to Claude through slack is much easier and I recently got it by just downloading it as a Slack App

teetertater2y ago

Claude has a free slack client that I briefly was able to access by creating a new slack workspace and adding it there. But as of yesterday it wasn't working for me

refulgentis2y ago

Poe, I’m in the same boat btw

jamesfisher2y ago

https://archive.ph/uwaCp (original page is now a 404)

cwkoss2y ago

I love the tongue-in-cheek paradox myth that the Bitcoin whitepaper was written by a future god-AI to increase demand for GPUs (and thus boost supply) so we are able to assemble the future god-AI.

tivert2y ago

> I love the tongue-in-cheek paradox myth that the Bitcoin whitepaper was written by a future god-AI to increase demand for GPUs (and thus boost supply) so we are able to assemble the future god-AI.

I know it's a joke, but the hole is the god-AI couldn't have been that smart, since cryptocurrency-mining quickly switched to ASICs, which muting the demand increase for GPUs.

modernpink2y ago

anamexis2y ago

But not before ramping up development and production of GPUs.

2 more replies

cwkoss2y ago

I think there are some derivative coins that extended the viability of GPU mining, but I've been out of the game for a decade.

teleforce2y ago

Nah, many of the technical people who knew Hal Finney personally were claiming that the Caltech alumnus did write the original Bitcoin whitepaper not any random guys on the Internet [1].

[1]Nakamoto's Neighbor: My Hunt For Bitcoin's Creator Led To A Paralyzed Crypto Genius:

https://www.forbes.com/sites/andygreenberg/2014/03/25/satosh...

[2]Nvidia DGX GH200: 100 Terabyte GPU Memory System:

https://news.ycombinator.com/item?id=36133226

cwkoss2y ago

Fun to imagine a time machine being built but the only thing it can transmit backward in time is PDFs

babyshake2y ago

I wouldn't install Adobe Acrobat even if it gave me access to break the laws of spacetime.

barbazoo2y ago

I'd read that book!

majormajor2y ago

Hyperion/The Fall of Hyperion by Dan Simmons has something similar.

baq2y ago

Watch Tenet.

1 more reply

renonce2y ago

Most Bitcoin miners have been using ASIC chips that compute nothing but Bitcoin format SHA256 since long ago so it’s not increasing demand for GPUs. Ethereum is but it has already switched to PoS.

mkoubaa2y ago

Then that god AI must have also pulled some strings for early video games

thelittleone2y ago

Conceptually this is paradoxical because of the notion that time is linear. Which it is to the best of our current understanding.

atemerev2y ago

huijzer2y ago

I disagree since GPUs are a major constraint currently and that skilled specialists outperform GPT-4 almost always as long as they stay in their domain.

Will they use copilot(s) to improve the models? Yes, but they have been doing that since 2021 already (the release year of GitHub Copilot).

ren_engineer2y ago

if this was currently possible wouldn't it lead to sentient/superhuman AI rapidly?

>tell AI to make itself more efficient by finding performance improvements in human written code

>that newly available processing power can now be used to find more ways to improve itself

>flywheel effect of AI improving itself as it gets smarter and smarter

eventually you'd turn it loose on improving the actual hardware it runs on. I think the question now is really how far transformers can be taken and if they are really the path to "real" AI.

ilaksh2y ago

Within a couple of years of improvement processes like you suggest will actually be really dangerous and stupid.

2 more replies

m3kw92y ago

I think they are at least 1-2 new big research breakthroughs(on the level of Attention) away from having this.

ShamelessC2y ago

Well that is demonstrably untrue.

simse2y ago

> A stateful API

naillo2y ago

> Plugins “don’t have PMF”

Probability mass functions? Anyone know what this means in this context?

simonbutt2y ago

Product market fit

bobbyi2y ago

The roadmap here is completely focused on ChatGPT and GPT-4. I wonder what portion of their resources is still going to other areas (DALL-E, audio/ video processing, etc.)

cmelbye2y ago

Maybe some of those things that are currently separate projects will eventually converge with a multimodal model.

artichokeheart2y ago

Off-topic note Humanloop might want to redesign their logo. It's been the Australian Broadcasting Corporation logo since 1963. Maybe pick a different Lissajous curve.

BeenAGoodUser2y ago

Terretta2y ago

This is off by probably x10 or more.

Dozens of people using it daily for coding and conversations and review in a month might be a couple hundred bucks. All day convo, constantly, as fast as it can respond, might add up to $5.

Not sure what kind of convo you're having that you could hit $10 unless you're parallelizing with something like the "guidance" tool or langchain.

jiggawatts2y ago

The version of GPT 4 with 32K token context length is the enabler for a huge range of "killer apps", but is even more expensive than the 8K version.

And yes, parallelism and loops are also key enablers for advanced use-cases.

For example, I have a lot of legacy code that needs uplifting. I'd love to be able to run different prompts over reams of code in parallel, iterating the prompts, etc...

The point of these things is that they're like humans you can clone at will.

The ability to point thousands of these things at a code base could be mindblowing.

refulgentis2y ago

Absolutely not. Dinner just got here, but tl;dr gpt4 is 0.03 per 750 words in 0.06 per 750 words out. People except the history to be included as well

dbmnt2y ago

Would someone please explain like I'm five which components of LLM's like ChatGPT are still closed source? What are the specific technologies that OpenAI is holding on to?

I know a lot of LLM stuff has either been released or leaked out, but don't have enough expertise in this area to understand the competitive advantages or breakthroughs OpenAI has obtained.

pnt122y ago

hoschicz2y ago

- they are working on a stateful API - they are working on a cheaper version of GPT-4

heyzk2y ago

Great writeup, this helps us understand where to spend our time vs what OpenAI's progress will solve.

flakiness2y ago

m3kw92y ago

You better have deep pockets, have you check the prices and then the rates for using the tuned models? They sure 10x to 100x more expensive then nontuned models

braindead_in2y ago

pavelstoev2y ago

This content has been removed at the request of OpenAI.

Just when I went back to the post for some quote material...

webmaven2y ago

"This content has been removed at the request of OpenAI."

throwaway17772y ago

Surprised no mention of them developing their own chips.

vb-84482y ago

7. find a sustainable business model and make some money

ftxbro2y ago

why should I believe what someone says their plans are

boringuser22y ago

How many shell corporations are intelligence agencies seeding right now?

cwkoss2y ago

Last night I was musing how many different countries' intelligence agencies have moles working at OpenAI currently. Gotta be at least 6, maybe as high as two dozen?

CSMastermind2y ago

US, France, Israel ... then who? Maybe another five eyes country like the UK? Possibly China? I'm pretty skeptical Russia would be able to get someone in there but maybe.

2 more replies

m3kw92y ago

I bet the NSA has dossier on every employee there as well

1 more reply

boringuser22y ago

Agent Lee Chen Huwang, reporting for duty.

ftxbro2y ago

refulgentis2y ago

It probably keeps getting flagged because it’s ahistorical, source: OpenAI engineers, and #2 somewhat obviously so. You heard of RLHF?

1 more reply

MacsHeadroom2y ago

m3kw92y ago

They are not gonna give the weights for sure but it still will be inferencable, I’m not sure how but it’s be self destructive if they did

boringuser22y ago

Exactly, with a private model you could easily extract the weights.

ryanmercer2y ago

Why bother with shell corps when they already back companies in the clear: look at In-Q-Tel.

dontupvoteme2y ago

Having been recently taken aboard by the mothership I expect they'll start trying to tune out anything related to programming to push people towards co-pilot X..

It's pretty hilarious and annoying to see bing start to write code only to self censor itself after a few lines (deleting what was there! no wonder these guys love websockets and dynamic histories)

Whoops!

_boffin_2y ago

Wait… what? Can you elaborate.

mistymountains2y ago

He’s speculating that Microsoft is nerfing OpenAI / chatGPT to funnel narrow capabilities to silos like CoPilot.

1 more reply

dontupvoteme2y ago

https://en.wikipedia.org/wiki/Embrace,_extend,_and_extinguis...

1 more reply

jasmer2y ago

It's absurd that people are still thinking that a language model which a bunch of tokens are indexed is some kind of 'AGI'.

j / k navigate · click thread line to collapse