Exciting times.
If you don't update your database and indices, they are great. But updating is exactly what's tempting to do when you do some machine learning (especially if you know that people with deeper pockets will do so).
Typically you will have a neural network, you run it on your dataset, it produces a new dataset of embeddings, you index them, and you use this index to train a new neural network, and you repeat the loop, hopefully improving results along the way.
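That loop can be sketched roughly like this (the `embed`/`build_index`/`train_new_model` helpers are hypothetical stand-ins, not a real library API):

```python
# Toy sketch of the embed -> index -> retrain loop described above.
import numpy as np

rng = np.random.default_rng(0)
dataset = rng.normal(size=(1000, 64)).astype(np.float32)  # raw examples as vectors

def embed(model, data):
    # "run the network on the dataset": here, just a linear projection
    return data @ model

def build_index(embeddings):
    # a real system would use FAISS/ScaNN; here we just keep the matrix
    # for brute-force nearest-neighbor lookups
    return embeddings

def train_new_model(index, dim_in, dim_out):
    # stand-in for training the next network against the indexed embeddings
    return rng.normal(size=(dim_in, dim_out)).astype(np.float32)

model = rng.normal(size=(64, 32)).astype(np.float32)
for generation in range(3):
    embeddings = embed(model, dataset)      # new dataset of embeddings
    index = build_index(embeddings)         # re-index everything
    model = train_new_model(index, 64, 32)  # train the next model on the index
```

Note that every iteration rewrites the whole index, which is exactly the write load the parent is worried about.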
An NVMe SSD can write at 6 GB/s but can only endure ~800 TB of writes; that's about 37 hours of lifetime at max speed.
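The arithmetic checks out:

```python
# Sanity check: 800 TB of rated write endurance at a sustained 6 GB/s.
endurance_bytes = 800e12          # ~800 TB total-bytes-written rating
write_speed = 6e9                 # 6 GB/s sustained sequential write
lifetime_hours = endurance_bytes / write_speed / 3600
print(lifetime_hours)             # ~37 hours
```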
"Only" 825 GB actually: https://pile.eleuther.ai/
A not-insignificant fraction of that is definitely copyrighted material, though, which raises some interesting questions when switching to a model of distributing "a smaller trained model plus the original raw training data". (Though it seems that the team behind GPT-J are clearly happy to distribute their full set of data anyway, and seem to be enough under the radar to not attract the wrong sort of attention, at least for now.)
48GB VRAM? 48+ gigabytes of system RAM is cheap; 48 gigabytes of RAM on a GPU is still painfully expensive.
Still superb, though; there's no reason you can't use other GOFAI tools instead of a static database, to trigger expert systems or formalized reasoning.
Like, could we assume that for humans it's also faster to search for information on Wikipedia, or would it be faster to recall already-read Wikipedia from memory? Although with humans, stored information decays. (In a way, a human form of garbage collection :P)
(This is a different blogpost, but does not seem to add over the original)
Edit: following derac's comment see https://news.ycombinator.com/item?id=29646112 for RETRO
[0] https://deepmind.com/research/publications/2021/improving-la...
GitHub Copilot is definitely GPT-3-based and is seeing real-world use: https://copilot.github.com
Transformers are state of the art for many tasks so they are likely to be used for "intelligent" processing of text or speech data, but due to practical limitations you are probably interacting with them mostly through web services.
If anything, it's being used in force for social media marketing, where you're trying to say "buy this thing" in different ways every day.
Written Scots: In the written mode, Scots spelling remains variable. Attempts to make it more consistent, notably the Scots Style Sheet produced by the Makars’ Club in 1947 or the Recommendations for Writers in Scots published by the Scots Language Society in 1985, have had at best only limited success, competing with other systems that have been developed to represent more closely localized varieties of spoken Scots.
When your reference text says the language isn't yet well captured in a single standard spelling, you'd better believe the wiki page is a hot mess.
Instead of training GPT-3 with 175B weights, you train a 25x smaller model and allow it to retrieve useful snippets from a large text index as additional information.
This solves the problem of very large models and the problem of updating an already trained model, as you can swap the text corpus with a newer one. The model learns mostly syntax, burning less trivia in its weights than a regular LM as it can simply copy the relevant information from the index.
This development was bound to happen as large LMs are expensive to use and it was an obvious idea. We've had these semantic search text indices for a few years already[1], they just weren't combined with text generation.
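The retrieval half of such a setup can be sketched in a few lines. This is not RETRO's actual architecture (which uses frozen BERT embeddings and chunked cross-attention inside the transformer); the hashing "embedding" below is a toy stand-in, just to show the swappable-index idea:

```python
# Toy retrieval-augmented generation: embed the query, find the nearest
# snippets in the index, and prepend them to what the (smaller) LM sees.
import hashlib
import numpy as np

def embed(text, dim=64):
    # deterministic bag-of-words hashing embedding (toy stand-in)
    v = np.zeros(dim)
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        v[h % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

corpus = [
    "The Pile is an 825 GB dataset of diverse text.",
    "NVMe drives have limited write endurance.",
    "retrieval lets a small model look up facts instead of memorizing them.",
]
index = np.stack([embed(doc) for doc in corpus])   # the swappable text index

def retrieve(query, k=2):
    scores = index @ embed(query)                  # cosine similarity (unit vectors)
    top = np.argsort(scores)[::-1][:k]
    return [corpus[i] for i in top]

query = "how does retrieval help small language models?"
context = retrieve(query)
prompt = "\n".join(context) + "\n" + query         # what the LM would condition on
```

Swapping the corpus for a newer one just means re-running the `embed`/`np.stack` step; no retraining of the model itself.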
Surely this is a function of location? I understand the U.S.-English term “person of color” to be convoluted language for “not white”. One simple thing I notice is that if I search for, say, “child” on Google Image Search, the images indeed tend to look like what one would expect from the average inhabitant of an English-speaking nation; when I search “子供”, I indeed mostly see what I would expect from Japan. Similarly, if I search for “house”, what I find tends to look like a house most likely situated in the Netherlands; with “บ้าน”, it more closely resembles stereotypical Thai architecture.
I would assume that AIs made in, say, Japan would yield different results.
The idea that AI itself can be biased (as opposed to the dataset) also has some significant problems. The lead of Facebook AI Research got canceled on Twitter because he pointed out that it's the bias in the dataset used to train the AI that results in bias in the AI and not the AI itself that's biased. I'd also question whether Gebru is a "widely respected leader in AI ethics research". Model interpretability is not even close to a solved problem so just because you can demonstrate some correlation between images of black people and worse performance does not imply that "black person" is a causative factor. It could literally be dataset distribution or image contrast or any number of other plausible explanations that are easily fixable by an ML engineer.
Claiming that "the AI is not biased, the training set is" is like saying "this running program isn't buggy, it's just the source code that is buggy".
And normally, that's harmless - as you said, you'd expect an AI to find pictures of houses from the region/culture you are searching from. But in a multi-cultural/multi-ethnic society, searching for "people" and seeing only what is considered the "majority" has a whole different set of ethical implications.
Identifying and ideally remediating such issues is why ethics research is so sorely needed.
I am not, actually; I am searching for “huis”, not “Nederlands huis”. I'd expect the results I get from the former only when searching for the latter.
I'd actually expect “house” and “huis” to return similar results from a good search engine. Obviously this is not easily possible with how it is trained on corpora in a specific language, but from a usability standpoint I think this is undesirable: if I specifically want Dutch houses I can always add that term as a specification, whereas there is no way to simply search for houses, wherever they might be, in Dutch, or English, or Thai, or any other language.
That is to say, I'm not arguing that there is no problem; I'm arguing that the problem is highly dependent upon location, and that the article should not take such a U.S.A.-centric stance and act as though the rest of the world does not exist.
This is probably why there is more variance when searching for English terms as well, English being a lingua franca. If I search “house” I do see some styles of architecture not commonly found in Anglo-Saxon nations, whereas all occurrences of “huis” do seem to be situated in the Netherlands.
I can only quote Joy Buolamwini on this:
“To fail on one in three, in a commercial system, on something that’s been reduced to a binary classification task, you have to ask, would that have been permitted if those failure rates were in a different subgroup?”
Come on, if you've worked at any large company using ML you know model performance is literally just taking the average accuracy/ROC/precision/etc over your training dataset plus some hold out sets. Then you track proxy metrics like engagement to see if your model actually works in production. At no point does race come into the equation. Naturally, if your choice of subgroup happens to not be a large proportion of either the dataset or the userbase then you don't see the poor performance on that subgroup show up in your metrics so you don't care to fix it.
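The mechanism is easy to see with made-up numbers (the subgroup sizes and accuracies below are purely illustrative):

```python
# How aggregate accuracy hides poor performance on a small subgroup.
n_major, n_minor = 900, 100         # subgroup sizes in the eval set
acc_major, acc_minor = 0.95, 0.60   # per-subgroup accuracy

overall = (n_major * acc_major + n_minor * acc_minor) / (n_major + n_minor)
print(overall)  # ~0.915 -- looks fine, despite 60% accuracy on the minority group
```

Unless you slice the metric by subgroup, the 60% never shows up on any dashboard.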
As humans age we apparently lose the former but compensate with the latter as best we can.
That is clearly not possible, so it can't be what they are doing.
Rather than diffusely encoding that knowledge in a massive number of self-organized layers of weights, it is explicitly encoded. The remaining network can "focus" on mapping input to the relevant information stored in that database, and on extracting/interpolating/extrapolating that information based on the current context to generate useful output.