Ignore the critics. Watch the demos. Play with it.
This stuff feels magical. Magical. It makes the movie "Her" look like it's no longer in the realm of science fiction but in the realm of incremental product development. HAL's unemotional monotone in Kubrick's "2001: A Space Odyssey" feels... oddly primitive by comparison. I'm impressed at how well this works.
Well-deserved congratulations to everyone at OpenAI!
Because its capacities are focused on exactly the right place to feel magical. Which isn’t to say that there isn’t real utility, but language (written, and even more so spoken) has an enormous emotional resonance for humans, so this is laser-targeted in an area where every advance is going to “feel magical” whether or not it moves the needle much on practical utility; it’s not unlike the effect of TV news making you feel informed, even though time spent watching it negatively correlates with understanding of current events.
I worry about the 'cheery intern' response becoming something of a punch line.
"Hey siri, launch the nuclear missiles to end the world."
"That's a GREAT idea, I'll get right on that! Is there anything else I can help you with?"
Kind of punch lines.
Will be interesting to see where that goes once you've got a good handle on capturing the part of speech that isn't "words" so much as it is inflection and delivery. I am interested in a speech model that can differentiate between "I would hate to have something happen to this store." as a compliment coming from a customer and as a threat coming from an extortionist.
But enough of that. The future looks bright. Everyone smile!
Or else..
"Guys, I am just pleased as punch to inform you that there are two thermo-nuclear missiles headed this way... if you don't mind, I'm gonna go ahead and take evasive action."
These things are amazing compared to old-school NLP: the step-change in capability is real.
But we should also keep our wits about us: they are well described by current or conjectural mathematics, they fail at things dolphins can do, it’s not some AI god and it’s not self-improving.
Let’s have balance on both the magic of the experience and getting past the tech demo stage: every magic trick has a pledge, but I think we’re still working on the prestige.
this focus subverts its intended effect on those of us with hair trigger bullshit-PTSD
Another step closer for those 7 trillion that OpenAI is so desperate for.
Edit: Apparently not, based on your clarification; instead the researchers don't know any better than to march into a local maximum because they're only human and seek to replicate themselves. I assumed too much good faith.
(Arguably, it is the other way around: they aren’t focused on appealing to those biases, but driven by them, in that the perception of language modeling as a road to real general reasoning is a manifestation of the same bias which makes language capacity be perceived as magical.)
Sounds like the people who defend astrology because it feels magical how their horoscope fits their personality.
"Don't bother me with facts that destroy my rose-tinted view"
At the moment AI is massively hyped and shoved into everything. To point at the faults and weaknesses is a reasonable and responsible thing to do.
3 years ago, if you told me you could facetime with a robot, and they could describe the environment and have a "normal" conversation with me, I would be in disbelief, and assume that tech was a decade or two in the future. Even the stuff that was happening 2 years ago felt unrealistic.
astrology is giving vague predictions like "you will be happy today". GPT-4o is describing to you actual events in real time.
"Rather than ship a product, companies can ship blueprints and everyone can just print stuff at their own home! Everything will be 3d printed! It's so magical!"
Just because a tech is magical today doesn't mean that it will be meaningful tomorrow. Sure, 3d printing has its place (mostly in making plastic parts for things) but it's hardly the revolutionary change in consumer products that it was touted to be. Instead, it's just a hobbyist toy.
GPT-4o being able to describe actual events in real time is interesting; it's yet to be seen if that's useful.
That's mostly the thinking here. A lot of the "killer" AI tech has really boiled down to "Look, this can replace your customer support chat bot!". Everyone is rushing to try and figure out what we can use LLMs for (just like they did when ML was supposed to take over the world), and so far it's been niche applications to make shareholders happy.
So far the biggest use case for LLMs is mass propaganda and scams. The fact that we might also get AI girlfriends out of the tech understandably doesn't seem that appealing to a lot of folks.
The first users of Eliza felt the same about the conversation with it.
The important point is to know that GPTs don't know or understand.
It may feel like a normal conversation but it is a Chinese Room on steroids.
People started to ask GPTs questions and take the answers as facts because they believe it's intelligent.
Does it really or are you just playing facile word association games with the word "magical"?
AI has a great deal of substance. It can draft documents. It can identify foods in a picture and give me a recipe that uses them. It can create songs, images and video.
AI, of course, has a lot of flaws. It does some things poorly, it does other things with bias, and it's not suitable for a huge number of use cases. To imply that something with a great deal of substance, flaws and all, is the same as something that has no substance whatsoever (nor ever will) is just not a reasonable thing to do.
"AI is massive hype and shoved into everything" has more grounding as a negative feeling of people being overwhelmed with technology than any basis in fact. The faults and weaknesses are buoyed more by people trying to acknowledge your feelings than by any real criticism of a technology that is changing faster than the faults-and-weaknesses arguments can be made. Study machine learning and come back with an informed criticism.
Not having to write boilerplate code itself also is very handy.
So yes, I absolutely do want this "magic." "I don't like it so no one should use it" is a pretty narrow POV.
I’d strongly prefer that though, along with HAL’s reasoning abilities.
There wasn't any incentive to make it sound artificially emotional or empathetic beyond a "Sorry, Dave".
To use another pop-culture reference, Obi-Wan in Episode IV had deep empathy, but didn’t speak emotionally. Those are separate things.
A lot of terrible human behavior is driven by emotions. An emotionless machine will never dump you out the airlock in a fit of rage.
Have you seen the final scene of the movie Ex Machina? Without spoilers, I'll just say that acting like it has emotions is very different from actually having them. This is in fact what socio- and psychopaths are like, with stereotypical results.
With so many smoke-and-mirrors demos out there, I am not super excited about those videos. I would play with it, but it seems like it is not available in a free tier (I stopped paying OpenAI a while ago after realizing that open models are more than enough for me).
Don’t get me wrong, I'm excited about this update, but I’m struggling to see what is so magical about it. Then again, I’ve been using GPT voice every day for months, so if you’re just blown away by talking to a computer then I get it.
When GPT-2/3/3.5/4 came out, it was fairly easy to see the progression from reading model outputs that it was just getting better and better at text. Which was pretty amazing but in a very intellectual way, since reading is typically a very "intellectual" "front-brain" type of activity.
But this voice stuff really does make it much more emotional. I don't know about you, but the first time I used GPT's voice mode I noticed that I felt something -- very un-intellectually, very un-cerebrally -- like, the feeling that there is a spirit embodying the computer. Of course with LLMs there always is a spirit embodying the computer (or, there never is, depending on your philosophical beliefs).
The Suno demos that popped up recently should have clued us all in that this kind of emotional range was possible with these models. This announcement is not so much a step function in model capabilities, but it is a step function in HCI. People are just not used to their interactions with a computer being emotional like this. I'm excited and concerned in equal parts that many people won't be truly prepared for what is coming. It's on the horizon: having an AI companion that really truly makes you feel things.
Us nerds who habitually read text have had that since roughly GPT-3, but now the door has been blown open.
Very excited about faster response times, auto interrupt, cheaper api, and voice api — but the “emotional range” is actually disappointing to me. hopefully it doesn’t impact the default experience too much, or the memory features get good enough that I can stop it from trying to pretend to be a human
Tone, Emphasis, Speed, Accent are all very important parts of how humans communicate verbally.
Before today, voice mode was strictly audio→text, then text→audio. All that information was destroyed.
Now the same model takes in audio tokens and spits back out audio tokens directly.
Watch this demo, it's the best example of the kind of thing that would be flat out impossible with the previous setup.
https://www.youtube.com/live/DQacCB9tDaw?si=2LzQwlS8FHfot7Jy
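Roughly, the difference between the two setups can be sketched like this. Everything here is a made-up stand-in (none of these functions are real APIs or actual model internals), and "audio" is modeled as a (words, tone) pair purely to show where the non-verbal information gets lost:

```python
# Toy sketch: "audio" is a (words, tone) pair so the information loss
# in the cascaded pipeline is visible. All functions are hypothetical
# stand-ins, not real OpenAI internals.

def transcribe(audio):
    """ASR stand-in: keeps the words, throws away tone/emphasis/accent."""
    words, tone = audio
    return words

def text_llm(text):
    """Text-only model stand-in: never hears the speaker at all."""
    return f"reply to: {text}"

def synthesize(text):
    """TTS stand-in: has to invent a delivery, so it defaults to flat."""
    return (text, "neutral")

def old_pipeline(audio):
    """Pre-4o voice mode: audio -> text -> LLM -> text -> audio."""
    return synthesize(text_llm(transcribe(audio)))

def multimodal_model(audio_tokens):
    """End-to-end stand-in: one model sees words *and* delivery."""
    words, tone = audio_tokens
    return (f"reply to: {words}", tone)

def new_pipeline(audio):
    """GPT-4o style: audio tokens in, audio tokens out."""
    return multimodal_model(audio)

speech = ("I would hate for something to happen to this store", "menacing")
print(old_pipeline(speech))  # tone comes back as "neutral" -- it was lost
print(new_pipeline(speech))  # tone survives end to end
```

In the old setup the model literally never receives the tone, so no amount of prompting can recover it; in the new one it is part of the input and output token stream.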
I’m very excited about all these updates and it’s really cool tech, but all I’m seeing is quality of life improvements and some cool engineering.
That’s not necessarily a bad thing. Not everything has to be magic or revolutionary to be a cool update
on a tangent...
I find it interesting the psychology behind this. If the voice in 2001 had proper inflection, it wouldn't have been perceived as a computer.
(also, I remember when voice synthesizers got more sophisticated and Stephen Hawking decided to keep his original first-gen voice because he identified more with it)
I think we'll be going the other way soon. Perfect voices, with the perfect emotional inflection will be perceived as computers.
However I think at some point they may be anthropomorphized and given more credit than they deserve. This will probably be cleverly planned and a/b tested. And then that perfect voice, for you, will get you to give in.
2. Even then this is a wonderful step for tech in general and not just OpenAI. Makes me very excited.
3. Most economic value and growth driven by AI will not come from consumer apps but rather enterprise use. I am interested in seeing how AI can automatically buy stuff for me, automate my home, reduce my energy use, automatically apply for and get credit cards based on my purchases, find new jobs for me, negotiate with a car dealer on my behalf, detect when I am going to fall sick, better diabetes care and an eventual cure, etc. etc.
Are we supposed to cheer to that?
We're already midway to the full implementation of 1984; do we need Her before we get to Matrix?
Well that's exactly why I'm not looking forward to whatever is coming. The average joe thinking dating a server is not a dystopia frightens me much more than the delusional tech CEO who thinks his AI will revolutionise the world.
> Things might even improve substantially if we all interact with personalities that are consistently positive and biased towards conflict resolution and non judgemental interactions.
Some kind of turbo bubble in which you don't even have to actually interact with anyone or anything? Every "personality" will be nice to you as long as you send $200 to OpenAI every week. Yep, that's absolutely a dystopia for me.
It really feels like the end goal is living in a pod and being uploaded in an alternative reality, everything we build to "enhance" our lives take us further from the basic building blocks that make life "life".
The future is indeed here... and it is, indeed, not equitably distributed.
The simplest example is “list all of the presidents in reverse chronological order of their ages when inaugurated”.
Both ChatGPT 3.5 and 4 get the order wrong. The difference is that I can instruct ChatGPT 4 to “use Python”.
https://chat.openai.com/share/87e4d37c-ec5d-4cda-921c-b6a9c7...
You can do similar things to have it verify information by using internet sources and give you citations.
Just like with the Python example, at least I can look at the script/web citation myself
This question is probably not the simplest form of the query you intend to receive an answer for.
If you want a descending list of presidents based on their age at inauguration, I know what you want.
If you want a reverse chronological list of presidents, I know what you want.
When you combine/concatenate the two as you have above, I have no idea what you want, nor do I have any way of checking my work if I assume what you want. I know enough about word problems and how people ask questions to know that you probably have a fairly good idea what you want and likely don’t know how ambitious this question is as asked, and I think you and I both are approaching the question with reasonably good faith, so I think you’d understand or at least accommodate my request for clarification and refinement of the question so that it’s less ambiguous.
Can you think of a better way to ask the question?
Now that you’ve refined the question, do LLMs give you the answers you expect more frequently than before?
Do you think LLMs would be able to ask you for clarification in these terms? That capability to ask for clarification is probably going to be as important as other improvements to the LLM, for questions like these that have many possibly correct answers or different interpretations.
Does that make sense? What do you think?
I tried asking the question more clearly
I think it “understood” the question because it “knew” how to write the Python code to get the right answer. It parsed the question as expected
The previous link doesn’t show the Python. This one does.
https://chat.openai.com/share/a5e21a97-7206-4392-893c-55c531...
LLMs are generally not good at math. But in my experience ChatGPT is good at creating Python code to solve math problems
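For instance, the kind of script it tends to emit for the descending-by-age-at-inauguration interpretation looks something like this sketch (a hand-picked subset of presidents rather than the full list, with ages at inauguration):

```python
# Subset of presidents with their age at inauguration (first term).
presidents = {
    "Joe Biden": 78,
    "Donald Trump": 70,
    "Ronald Reagan": 69,
    "Barack Obama": 47,
    "Bill Clinton": 46,
    "John F. Kennedy": 43,
    "Theodore Roosevelt": 42,
}

# Sort descending by age at inauguration.
ranked = sorted(presidents.items(), key=lambda kv: kv[1], reverse=True)
for name, age in ranked:
    print(f"{name}: {age}")
```

The point isn't that the code is sophisticated (it's one `sorted` call); it's that the arithmetic and ordering are delegated to Python, where they can't go wrong, and the script is short enough to verify by eye.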
The last part of the movie "Her" is still in the realm of science fiction, if not outright fantasy. Reminds me of the later seasons of SG1 with all the talk of ascension and Ancients. Or Clarke's 3001 book intro, where the monolith creators figured out how to encode themselves into spacetime. There's nothing incremental about that.
In comparison to the gas pump which says "Thank You!"
If chatbots feel magical, what those people did will feel divinely inspired.
However, using ChatGPT with transcribing is already offering me a similar experience, so what is new exactly?
It’s not accessible to everyone yet.
Even on the API, I can’t send it a voice stream yet.
The API refuses to generate images.
Next few weeks will tell as more people play with it.
There’s so much helpful niche functionality that can be added to custom clients.
I’m not a sceptic and apply AI on a daily basis, but the whole “we can finally replace people” vibe is extremely off-putting. I had very similar feelings during the pandemic, when the majority of people seemed so happy to drop any real human interaction in favor of remote comms via chats/audio calls. It still creeps me out how ready we are as a society to drop anything remotely human in favor of technocratic advancement and “productivity”.
On one hand, I agree - we shouldn't diminish the very real capabilities of these models with tech skepticism. On the other hand, I disagree - I believe this approach is unlikely to lead to human-level AGI.
Like so many things, the truth probably lies somewhere between the skeptical naysayers and the breathless fanboys.
You might not be fooled by a conversation with an agent like the one in the promo video, but you'd probably agree that somewhere around 80% of people could be. At what percentage would you say that it's good enough to be "human-level?"
They are referring to an AI that can use reasoning, deduction, logic, and abstraction like the smartest humans can, to discover, prove, and create novel things in every realm that humans can: math, physics, chemistry, biology, engineering, art, sociology, etc.
I think people will quickly learn with enough exposure, and then that percentage will go down.
The average human has tons of quirks, talks over others all the time, generally can't solve complex problems in a casual conversation setting, and is not always cheery and ready to please like Scarlett's character in Her.
I think our expectations of AI are way too high from our exposure to science fiction.
Also, if this is your definition of magic then...yeah...
the interruption part is just flow control at the edge. control-s, control-c stuff, right? not AI?
The sound of a female voice to an audience 85% composed of males between the ages of 14 and 55 is "magical", not this thing that recreates it.
so yeah, it's flow control and compression of highly curated, subtle soft porn. Subtle, hyper-targeted, subconscious porn honed by the most colossal digitally mediated focus group ever constructed to manipulate our (straight male) emotions.
why isn't the voice actually the voice of the pissed-off high school janitor telling you to man up and stop hyperventilating? instead it's a woman stroking your ego and telling you to relax and take deep breaths. what dataset did they train that voice on anyway?
Most voice assistants have male options, and an increasing number (including ChatGPT) have gender neutral voices.
> why isn't the voice actually the voice of the pissed off high school janitor telling you to man-up and stop hyperventilating
sounds like a great way to create a product people will outright hate
This is like horseshoe theory on steroids.