In contrast to the main video, the video further down the page is really impressive and really does deliver; the 'which cup is the ball in' demo is particularly cool: https://www.youtube.com/watch?v=UIZAiXYceBI.
Other key info: "Integrate Gemini models into your applications with Google AI Studio and Google Cloud Vertex AI. Available December 13th." (Unclear whether all three models will be available then; hopefully they are, and hopefully access is more like OpenAI's rollout, where many people get in, rather than Claude's API, where few customers do.)
They do make OpenAI look like kids in that regard. There is far more to technology than public-facing goods/products.
It's probably in part due to the cultural differences between London/UK/Europe and Silicon Valley/California/USA.
In one corner: IBM's Deep Blue beating Kasparov; a world-class giant with huge research experience.
In the other corner: Google, a feisty newcomer, two years old, leveraging the tech to actually make something practical.
Is Google the new IBM?
On the other hand, I think IBM’s problem is its finance focus and long-term decay of technical talent. It is well known for maintaining products for decades, but when’s the last time IBM came out with something really innovative? It touted Watson, but that was always more of a gimmick than an actually viable product.
Google has the resources and technical talent to compete with OpenAI. In fact, a lot of GPT is based on Google’s research. I think the main things that have held Google back are questions about how to monetize effectively, but it has little choice but to move forward now that OpenAI has thrown down the gauntlet.
Whereas for OpenAI there are no such constraints.
Did IBM have research with impressive web reverse indexing tech that they didn't want to push to market because it would hurt their other business lines? It's not impossible... It could be as innocuous as discouraging some research engineer from such a project to focus on something more in line.
This is why I believe businesses should be absolutely willing to disrupt themselves if they want to avoid going the way of Nokia. I believe Apple should make a standalone Apple Watch that cannibalizes their iPhone business, instead of tying it to and trying to prop up that business (of course shareholders won't like it). While this looks good from Google, I think they are still sandbagging. Why can't I use Bard inside their other products instead of the silly export thing?
Apple is the new Nokia.
OpenAI is the new Google.
Microsoft is the new Apple.
Was it “machine learning”? If so, I don’t think that was actually the key insight for Google search… right? Did Deep Blue even use machine learning?
Or was it something else?
It was a genius move to go public with a simple UI.
No matter how stunning the tech side is, if human interaction is not simple, the big stuff doesn’t even matter.
This statement is for the mass market MBA-types. More specifically, middle managers and dinosaur executives who barely comprehend what generative AI is, and value perceived stability and brand recognition over bleeding edge, for better or worse.
I think the sad truth is an enormous chunk of paying customers, at least for the "enterprise" accounts, will be generating marketing copy and similar "biz dev" use cases.
Nokia and Blackberry had far more phone-making experience than Apple when the iPhone launched.
But if you can't bring that experience to bear, allowing you to make a better product - then you don't have a better product.
I'm not dumb enough to bet against Google. They appear to be losing the race, but they can easily catch up to the lead pack.
There's a secondary issue that I don't like Google, and I want them to lose the race. So that will color my commentary and slow my early adoption of their new products, but unless everyone feels the same, it shouldn't have a meaningful effect on the outcome. Although I suppose they do need to clear a higher bar than some unknown AI startup. Expectations are understandably high - as Sundar says, they basically invented this stuff... so where's the payoff?
It makes Google look like an old fart who wasted his life, didn't get anywhere, and is now bitter about kids running on his lawn.
https://www.hathitrust.org/ has that corpus, and its evolution, and you can propose to get access to it via a collaboration for supercomputer access. It grows very rapidly. The Internet Archive would also like to chat, I expect. I've also asked (and prompt-manipulated) ChatGPT to estimate the total number of books it was trained on; it's a tiny fraction of the corpus. I wonder if it's the same with Google?
Whatever answer it gave you is not reliable.
2023-11-14: GraphCast, world-leading weather prediction model, published in Science
2023-11-15: Student of Games: unified learning algorithm, major algorithmic breakthrough, published in Science
2023-11-16: Music generation model, seemingly SOTA
2023-11-29: GNoME model for materials discovery, published in Nature
2023-12-06: Gemini, the most advanced LLM according to its own benchmarks
Where it has fallen down (compared to its relative performance in relevant research) is public generative AI products [0]. It is trying very hard to catch up at that, and its disadvantage isn't technological, but that doesn't mean it isn't real and durable.
[0] I say "generative AI" because AI is a big, amorphous space, and lots of Google's products have some form of AI behind important features. So I'm just talking about products where generative AI is the center of what the product offers, a category that has become a big deal recently and where Google has so far been delivering far below its general AI research weight class.
Google is locked behind research bubbles, legal reviews and safety checks.
Meanwhile, OpenAI is eating their lunch.
It will be interesting to see how this percolates through the existing systems.
Reminds me of the Stadia reveal, where the first words out of his mouth were along the lines of "I'll admit, I'm not much of a gamer"
This dude needs a new speech writer.
How about we go further and just state what everyone (other than Wall St) thinks: Google needs a new CEO.
One more interested in Google's supposed mission ("to organize the world's information and make it universally accessible and useful"), than in Google's stock price.
I've been making this exact comparison for years at this point.
Both inherited companies with market-dominant core products in near-monopoly positions. They both kept the lights on, but the companies under them repeatedly fail to break into new markets and suffer from a near-total lack of coherent vision and perverse internal incentives that contribute to the failure of new products. And after a while, the quality of that core product starts to stumble as well.
The fact that we've seen this show before makes it all the more baffling to me that investors are happy about it. Especially when in the same timeframe we've seen Satya Nadella completely transform Microsoft and deliver relatively meteoric performance.
If only there was some technology that could help "generate" such text.
In my opinion, the best ones are:
* https://www.youtube.com/watch?v=UIZAiXYceBI - variety of video/sight capabilities
* https://www.youtube.com/watch?v=JPwU1FNhMOA - understanding direction of light and plants
* https://www.youtube.com/watch?v=D64QD7Swr3s - multimodal understanding of audio
* https://www.youtube.com/watch?v=v5tRc_5-8G4 - helping a user with complex requests and showing some of the 'thinking' it is doing about what context it does/doesn't have
* https://www.youtube.com/watch?v=sPiOP_CB54A - assessing the relevance of scientific papers and then extracting data from the papers
My current context: API user of OpenAI, regular user of ChatGPT Plus (GPT-4-Turbo, Dall E 3, and GPT-4V), occasional user of Claude Pro (much less since GPT-4-Turbo with longer context length), paying user of Midjourney.
Gemini Pro is available starting today in Bard. It's not clear to me how many of the super impressive results are from Ultra vs Pro.
Overall conclusion: Gemini Ultra looks very impressive. But the timing is disappointing: Gemini Ultra looks like it won't be widely available until ~Feb/March 2024, or possibly later.
> As part of this process, we’ll make Gemini Ultra available to select customers, developers, partners and safety and responsibility experts for early experimentation and feedback before rolling it out to developers and enterprise customers early next year.
> Early next year, we’ll also launch Bard Advanced, a new, cutting-edge AI experience that gives you access to our best models and capabilities, starting with Gemini Ultra.
I hope that there will be a product available sooner than that without a crazy waitlist for both Bard Advanced, and Gemini Ultra API. Also fingers crossed that they have good data privacy for API usage, like OpenAI does (i.e. data isn't used to train their models when it's via API/playground requests).
See Table 2 and Table 7 https://storage.googleapis.com/deepmind-media/gemini/gemini_... (I think they're comparing against original GPT-4 rather than GPT-4-Turbo, but it's not entirely clear)
What they've released today: Gemini Pro is in Bard today. Gemini Pro will be coming to API soon (Dec 13?). Gemini Ultra will be available via Bard and API "early next year"
Therefore, as of Dec 6 2023:
SOTA API = GPT-4, still.
SOTA chat assistant = ChatGPT Plus, still, for everything except video, where Bard has capabilities. ChatGPT Plus is closely followed by Claude. (But I tried asking Bard a question about a YouTube video today, and it told me "I'm sorry, but I'm unable to access this YouTube content. This is possible for a number of reasons, but the most common are: the content isn't a valid YouTube link, potentially unsafe content, or the content does not have a captions file that I can read.")
SOTA API after Gemini Ultra is out in ~Q1 2024 = Gemini Ultra, if OpenAI/Anthropic haven't released a new model by then
SOTA Chat assistant after Bard Advanced is out in ~Q1 2024 = Bard Advanced, probably, assuming that OpenAI/Anthropic haven't released new models by then
I've never seen the entire sidebar filled with the videos of a single channel before.
Somebody please wake me up when I can talk to the thing by typing and dropping files into a chat box.
These lines are for the stakeholders as opposed to consumers. Large backers don't want to invest in a company that has to rush to the market to play catch-up, they want a company that can execute on long-term goals. Re-assuring them that this is a long-term goal is important for $GOOG.
Google's weakness is on the product side; their research arm puts out incredible stuff, as other commenters have pointed out. GPT essentially came from Google researchers who were impatient with Google's reluctance to ship a product that could jeopardize ad revenue on search.
Yes, I know it was a field of interest and research long before Google invested, but the fact remains that they _did_ invest deeply in it very early on for a very long time before we got to this point.
Their continued investment has helped push the industry forward, for better or worse. In light of this context, I'm ok with them taking a small victory lap and saying "we've been here, I told you it was important".
AI has been adding a huge proportion of the shareholder value at Google for many years. The fact that their inference systems are internal and not user products might have hidden this from you.
Actually, they kind of did. What's interesting is that they still only match GPT-4's version but don't propose any architectural breakthroughs. From an architectural standpoint, not much has changed since 2017. The 'breakthroughs' in moving from GPT to GPT-4 included: adding more parameters (GPT-2/3/4); fine-tuning base models to follow instructions (RLHF), which is essentially structured training (GPT-3.5); and multimodality, which involves using embeddings from different sources in the same latent space; along with some optimizations that allowed for faster inference and training. Increasing evidence suggests that AGI will not be attainable solely using LLMs/transformers/the current architecture, as LLMs can't extrapolate beyond the patterns in their training data (according to a paper from DeepMind last month):
"Together our results highlight that the impressive ICL abilities of high-capacity sequence models may be more closely tied to the coverage of their pretraining data mixtures than inductive biases that create fundamental generalization capabilities."[1]
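The coverage point the paper makes can be shown with a toy sketch of my own (not from the paper, and a polynomial fit rather than a transformer): give a high-capacity model plenty of data in one region of input space, then test it outside that region.

```python
import numpy as np

rng = np.random.default_rng(0)

# Train on x in [-1, 1]: the model sees this region densely.
x_train = rng.uniform(-1, 1, 200)
y_train = np.sin(3 * x_train)

# A degree-9 polynomial has ample capacity for the training region.
coeffs = np.polyfit(x_train, y_train, deg=9)

x_in = np.linspace(-1, 1, 100)   # in-distribution inputs
x_out = np.linspace(2, 3, 100)   # out-of-distribution inputs

err_in = np.abs(np.polyval(coeffs, x_in) - np.sin(3 * x_in)).mean()
err_out = np.abs(np.polyval(coeffs, x_out) - np.sin(3 * x_out)).mean()

print(err_in)   # small: the pattern was covered by the training data
print(err_out)  # huge: the fit falls apart beyond the training coverage
```

The analogy is loose, but it captures the claim: impressive in-distribution behavior says little about generalization outside the pretraining mixture's coverage.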
And how many financial people worth reckoning with are under 30 years old? Not many.
Which is definitely where Google is in the generative AI space.
Sure Google paid em money/employed em, but the smarts behind it isn't the entity Google or the execs at the top, Sundar etc; it's those researchers. I like to appreciate individualism in a world where those at the top have lobbied their way into a 1% monopoly lmao.
Well in fairness he has a point, they are starting to look like a legacy tech company.
Sundar has been saying this repeatedly since Day 0 of the current AI wave. It's almost cliche for him at this point.
Or until Google gives up on the space, or he isn't CEO, if either of those come first, which I wouldn't rule out.
AlphaGo, AlphaFold, AlphaStar.
They were groundbreaking a long time ago. They just happened to miss the LLM surge.
It said rubber ducks float because they’re made of a material less dense than water — but that’s not true!
Rubber is more dense than water. The ducky floats because it’s filled with air. If you fill it with water it’ll sink.
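A quick back-of-the-envelope check of that correction, using illustrative densities (assumed round numbers, not measurements):

```python
# An air-filled rubber duck floats because its *average* density is below
# water's, even though rubber itself is denser than water.
WATER = 1000.0   # kg/m^3
RUBBER = 1150.0  # kg/m^3, soft rubber is slightly denser than water
AIR = 1.2        # kg/m^3

def average_density(shell_m3, cavity_m3, fill_density):
    """Average density of a hollow toy: total mass / total volume."""
    mass = RUBBER * shell_m3 + fill_density * cavity_m3
    return mass / (shell_m3 + cavity_m3)

shell, cavity = 20e-6, 180e-6  # 20 cm^3 of rubber around a 180 cm^3 cavity

print(average_density(shell, cavity, AIR) < WATER)    # True: floats
print(average_density(shell, cavity, WATER) < WATER)  # False: sinks
```

So GPT-4's explanation is the right one: it's the trapped air, not the rubber, that puts the average density below water's.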
Interestingly, ChatGPT 3.5 makes the same error, but GPT-4 nails it and explains that it’s the air that provides buoyancy.
I had the same impression with Google’s other AI demos: cute but missing something essential that GPT 4 has.
https://eu.usatoday.com/story/news/politics/elections/2023/1...
The look isn't good. But it's not dishonest.
(The context awareness of the current breed of generative AI seems to be exactly what TTS always lacks, awkward syllables and emphasis, pronunciation that would be correct sometimes but not after that word, etc.)
For example here's a paper 10 years old now: https://static.googleusercontent.com/media/research.google.c... and another close to 10 years old now: https://research.google/pubs/pub43146/ The learning they expose in those papers came from the previous 10 years of operating SmartASS.
However, SmartASS and Sibyl weren't really what external ML people wanted; it was just fairly boring "increase watch time by identifying which videos people will click on", "increase mobile app installs", or "show the ads people are likely to click on".
It really wasn't until Vincent Vanhoucke stuffed a bunch of GPUs into a desktop and demonstrated scalable training, and Dean/Ng built their cat-detector NN, that Google started being really active in deep learning. That was around 2010-2012.
That was relevant given they were selling their models to law enforcement.
Completely! Just tried Bard. No images, and the responses it gave me were pretty poor. Today's launch is a weak product launch; it looks mostly like a push to close out stuff for Perf before everybody leaves for the rest of December on vacation.
He mentions Transformers - fine. Then he says that we've all been using Google AI for so long with Google Translate.
They showed AlphaGo, they showed Transformers.
Pretty good track record.
So it's either free-private-gpt3.5 or cloud-better-than-gpt4v. Nothing else matters now. I think we have reached an extreme point of temporal discounting (https://en.wikipedia.org/wiki/Time_preference).
I would argue Google has done almost nothing interesting since then (at least not things they haven't killed)
I think that was the point.
People speak of the uncanny valley in terms of appearance. I am getting this from Gemini. It’s sort of impressive but feels freaky at the same time.
Is it just me?
It is a great example of what I've found to be a growing concern as we double down on Goodhart's Law with claims like "beats 30 out of 32 tests compared to existing models."
My guess is those tests are very specific to evaluations of what we've historically imagined AI to be good at vs comprehensive tests of human ability and competencies.
So a broad, general pretrained model might actually be great at sounding 'human' but not as good at logic puzzles. You hit it with extensive fine-tuning aimed at improving test scores on logic, while no longer targeting "sounding human", and you end up with a model that is extremely good at what you targeted as measurements but sounds like a creepy toddler.
We really need to stop being so afraid of anthropomorphic evaluation of LLMs. Even if the underlying processes shouldn't be anthropomorphized, the expressed results really should be given the whole point was modeling and predicting anthropomorphic training data.
"Don't sound like a creepy soulless toddler and sound more like a fellow human" is a perfectly appropriate goal for an enterprise scale LLM, and we shouldn't be afraid of openly setting that as a goal.
Google DeepMind squandered their lead in AI so much that they now have to have “Google” prepended to their name to show that adults are now in charge.