This turns out to be a significant FLOPs and quality win, even accounting for the initial scoring-model training and data-scoring cost: they claim roughly a 10x improvement in the quality/FLOP tradeoff, and they show numbers that significantly beat SOTA on some tasks at their model size.
The bad part, to me, is that this is some significant engineering: it requires known high-quality datasets, training of the scoring model, and selection and scoring of the data for the big training run. This isn't a bold new leap that's going to be easy for hobbyists to implement; it's a practitioner's excellent engineering showing the way forward for certain training needs.
As always, I appreciate the publishing from DeepMind; this looks like great work. It would be nice to see a company like together.ai or others turn it into an actual pipeline; it might be a while, though. It looks relatively gnarly in the details on the data and scoring side.
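For intuition, the selection step described above (train a small scorer on known-good data, then score the big pool and keep only the best of it) might look roughly like this. A minimal sketch only: the function name, `keep_fraction`, and the toy scorer are my own, not from the paper.

```python
def select_training_data(candidates, scorer, keep_fraction=0.1):
    """Score every candidate example with a small pretrained quality
    model, then keep only the top fraction for the big training run."""
    ranked = sorted(candidates, key=scorer, reverse=True)
    k = max(1, int(len(ranked) * keep_fraction))
    return ranked[:k]

# Toy usage: pretend longer strings are "higher quality".
pool = ["a", "bb", "ccc", "dddd", "eeeee",
        "ffffff", "ggggggg", "hhhhhhhh", "iiiiiiiii", "jjjjjjjjjj"]
top = select_training_data(pool, scorer=len, keep_fraction=0.2)
# keeps the 2 highest-scoring examples
```

The gnarly part in practice is everything this sketch hides: building the scorer, and scoring a web-scale pool cheaply enough that the selection still pays for itself.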
DeepMind people invent transformers, and then they watch people laugh at Bard (or whatever it's called nowadays) because product and engineering lost the plot. Kodak is paging you from the grave, Google. Read the message.
The launch demo was faked, and I don't think the real thing is here yet: https://techcrunch.com/2023/12/07/googles-best-gemini-demo-w...
Total disaster. On tasks similar to what I give OpenAI and Claude, it just borks. And it complains about my wanting to use a gender-guesser Python library, tells me that's inappropriate for non-binary people, and won't do it.
That's fun.
Edit 1: Also, it refuses to print the entire script. I've tried many workarounds; it seems to only want to output a very small number of total lines.
Threw it into ChatGPT, which immediately fixed all the issues from Gemini, and it worked on the first try.
Edit 2: The only thing better about Gemini, as far as I can tell, is that the copy-code button is at the bottom. ChatGPT's is at the top, and that's dumb.
Edit 3: I'm being downvoted heavily now. To be clear, I didn't intentionally seek out the gender issue; it's just what I was working on.
I'm currently trying to generate infographics based on wrestlers, and I needed to split the men from the women for championship title rankings.
I have no problem with it in general, it just came up, so I communicated it.
Multiple times Gemini removed the code that used the gender-guesser library because it decided I shouldn't use it. For sorting wrestlers and their title chances, using it makes a lot of sense...
But Gemini just refused to allow me to use it, which seems like a ridiculous thing. I want to make the choices here.
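For context, the task was something like the sketch below. To keep it self-contained, a tiny hypothetical lookup table stands in for `gender_guesser`'s `Detector` (a real run would call the library), and the roster names are just examples.

```python
# Hypothetical stand-in for gender_guesser.detector.Detector().get_gender(first_name).
NAME_GENDER = {"John": "male", "Becky": "female",
               "Roman": "male", "Charlotte": "female"}

def split_roster(wrestlers):
    """Split a roster into men's and women's title-ranking pools."""
    men, women, unknown = [], [], []
    for name in wrestlers:
        guess = NAME_GENDER.get(name.split()[0], "unknown")
        if guess == "male":
            men.append(name)
        elif guess == "female":
            women.append(name)
        else:
            unknown.append(name)
    return men, women, unknown

men, women, unknown = split_roster(
    ["John Cena", "Becky Lynch", "Roman Reigns",
     "Charlotte Flair", "Finn Balor"])
```

Nothing exotic: a name-based guess with an "unknown" bucket you can resolve by hand. That's the code Gemini kept stripping out.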
https://arxiv.org/abs/1706.03762 https://arxiv.org/abs/1810.04805
And your statement of it is incorrect. It can result in greater demand, but it doesn't necessarily result in greater resource usage.
A minority of efficiency improvements can lead to greater resource consumption, but overall, efficiency does result in less resource usage.
This is just saying throughput is increased, yes? The time to train, and thus to iterate (i.e. dialing in hyperparams), will decrease.
I.e.: just as more efficient steam engines led to an increase in both steam-engine throughput and coal consumption, an increase in AI efficiency can lead to an increase in both training throughput and energy consumption.
The paradox is a result of prevalence scaling faster than efficiency and efficiency driving prevalence.
Though even when you add the efficiency improvements I think we're still lagging behind Moore's Law overall.