The internet is basically a bunch of nodes with interconnected data. Any search index could be considered an interconnected graph. It's the same with LLMs: nothing new, though arguably faster depending on the definition.
Compared to celebrities, data about ordinary people is so sparse that it would look like noise. I would be surprised if the model encoded anything useful about them.
Half of the internet and all the media were obsessed with making it say the f-word and tricking it into saying that it would kill everyone. Attacking it from the privacy angle is quite obvious, but I didn't see it mentioned. I asked it about 10 random friends and myself, and it didn't recognize the names despite them having plenty of search results.
In one of the interviews they mentioned that it was trained on 10% of the web, so it should have enough data.