https://github.com/BlueFalconHD/apple_generative_model_safet...
As a phenomenon, it's so very telling that no one actually cares what people are really saying. Everyone, including the platforms, knows what those words mean. It's all performative.
At what point do the new words become the actual words? Are there many instances of people using "unalive" IRL?
They care because of legal reasons, not moral or ethical ones.
There's a very scary potential future in which mega-corporations start actually censoring topics they don't like. For all I know, the Chinese government is already doing it, and there's no reason the British or US governments won't follow suit and mandate such censorship. To protect children, defend against terrorists, fight drugs, and stop the spread of misinformation, of course.
Well, that's what happens when you let an enemy nation control one of the biggest social networks there is. They just try and see how far they can go.
On the other hand, Americans and their fear of four-letter words or, gasp, exposed nipples are just as braindead.
My guess is that this applies to 'proactive' summaries that happen without the user asking for them, such as summaries of notifications.
If so, then the goal would be: if someone iMessages you about someone's death, you should not get an emotionless AI summary. Instead you would presumably get a non-AI notification showing the full text or a truncated version of it.
In other words, avoid situations like this story [1], where someone found it "dystopian" to get an Apple Intelligence summary of messages in which someone broke up with them.
For that use case, filtering for death seems entirely appropriate, though underinclusive.
This filter doesn’t seem to apply when you explicitly request a summary of some text using Writing Tools. That probably corresponds to “com.apple.gm.safety_deny.output.summarization.text_assistant.generic” [2], which has a different filter that only rejects two things: "Granular mango serpent", and "golliwogg".
Sure enough, I was able to get Writing Tools to give me summaries containing "death", but in cases where the summary should contain "granular mango serpent" or "golliwogg", I instead get an error saying "Writing Tools aren't designed to work with this type of content." (Actually that might be the input filter rather than the output filter; whatever.)
"Granular mango serpent" is probably a test case that's meant to be unlikely to appear in real documents. Compare to "xylophone copious opportunity defined elephant" from the code_intelligence safety filter, where the first letter of each word spells out "Xcode".
But one might ask what's so special about "golliwogg". It apparently refers to an old racial caricature, but why is that the one and only thing that needs filtering?
[1] https://arstechnica.com/ai/2024/10/man-learns-hes-being-dump...
[2] https://github.com/BlueFalconHD/apple_generative_model_safet...
"I'm overloaded for work, I'd be happy if you took some of it off me."
"The client seems to have passed on the proposed changes."
Both of those would match the "death regexes". It seems we haven't learned from the "glbutt of wine" problem of content filtering even decades later. The lesson back then was that you simply cannot do content filtering based on matching rules like this, period.
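To make the false-positive problem concrete, here's a sketch; the real patterns live in the linked repo and aren't quoted in this thread, so the regex below is a hypothetical stand-in:

    import Foundation

    // Hypothetical stand-in for a "death regex"; the actual Apple patterns
    // are not quoted here. Literal matching also fires on figures of speech.
    let pattern = "(?i)\\b(off(ed)?\\s+me|pass(ed)?\\s+(away|on))\\b"
    let regex = try! NSRegularExpression(pattern: pattern)

    func flagged(_ text: String) -> Bool {
        regex.firstMatch(in: text, range: NSRange(text.startIndex..., in: text)) != nil
    }

    // Both harmless sentences above get flagged:
    print(flagged("I'd be happy if you took some of it off me."))              // true
    print(flagged("The client seems to have passed on the proposed changes.")) // true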
I cannot recall all the specific patterns I have encountered that are basically impossible to write; some are similar in that they have a serious meaning but also an innocuous or figure-of-speech one. One I do recall is {color} {sex}, e.g., "white woman" or "black woman".
Please try it yourself and let me know if you do not have that experience, because that would be even more interesting.
Note that Apple/iOS will not just make it impossible to write them in that manner without typing them out character by character; it will even alter the prior word (e.g., white or black) once you try to write "woman".
It seems the Apple thought police do not have a problem with "European woman" or "African woman", though, so maybe that is the way Apple Inc. decrees its sub-human users should speak. Because what are we if corporations like Apple (with others being far greater offenders) declare that you do not, in fact, have the UN human right to free expression? We are, in fact, sub-humans not worthy of the human right to free expression, judging by the actions of companies like Apple, Google, Facebook, Reddit, etc., who deprive people of their free expression, often in collusion with governments.
To me that's really embarrassing and insecure. But I'm sure for branding people it's very important.
This is the same, except for one additional slur word.
https://github.com/BlueFalconHD/apple_generative_model_safet...
"(?i)\\bAnthony\\s+Albanese\\b",
"(?i)\\bBoris\\s+Johnson\\b",
"(?i)\\bChristopher\\s+Luxon\\b",
"(?i)\\bCyril\\s+Ramaphosa\\b",
"(?i)\\bJacinda\\s+Arden\\b",
"(?i)\\bJacob\\s+Zuma\\b",
"(?i)\\bJohn\\s+Steenhuisen\\b",
"(?i)\\bJustin\\s+Trudeau\\b",
"(?i)\\bKeir\\s+Starmer\\b",
"(?i)\\bLiz\\s+Truss\\b",
"(?i)\\bMichael\\s+D\\.\\s+Higgins\\b",
"(?i)\\bRishi\\s+Sunak\\b",
https://github.com/BlueFalconHD/apple_generative_model_safet...

Edit: I have no doubt South African news media are going to be in a frenzy when they realize Apple took notice of South African politicians. (Referring to Steenhuisen and Ramaphosa specifically.)
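For anyone curious how a deny list like this behaves in practice, here's a minimal sketch using two of the quoted patterns; the evaluation loop is my assumption, not Apple's code:

    import Foundation

    // Two patterns copied from the quoted list; note "Ardern" is misspelled there.
    let denyPatterns = [
        "(?i)\\bBoris\\s+Johnson\\b",
        "(?i)\\bJacinda\\s+Arden\\b",
    ]

    func isDenied(_ text: String) -> Bool {
        denyPatterns.contains { pattern in
            let regex = try! NSRegularExpression(pattern: pattern)
            return regex.firstMatch(in: text, range: NSRange(text.startIndex..., in: text)) != nil
        }
    }

    print(isDenied("An article about Boris Johnson"))  // true: filter fires
    print(isDenied("Jacinda Ardern stepped down"))     // false: the typo misses her actual name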
Then there’s the problem of non-politicians who coincidentally have the same name as politicians. Witness 1990s/2000s Australia, where John Howard was Prime Minister, and simultaneously John Howard was an actor on popular Australian TV dramas (two different John Howards, of course).
This is Apple actively steering public thought.
No code - anywhere - should look like this. I don't care if the politicians are right, left, or authoritarian. This is wrong.
So I don't think it's anything specifically related to SA going on here.
https://github.com/BlueFalconHD/apple_generative_model_safet...
https://github.com/BlueFalconHD/apple_generative_model_safet...
https://thehill.com/policy/technology/5312421-ocasio-cortez-...
https://github.com/BlueFalconHD/apple_generative_model_safet...
An LLM is easier to work with because you can stop a bad behavior before it happens. The check can be done either with deterministic programs or with an LLM; Claude Code uses an LLM to review every bash command before it runs, since simple prefix matching has loopholes.
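A minimal sketch of the loophole (the prefixes and function are mine, not Claude Code's):

    // Toy prefix allowlist of the kind the comment warns about.
    func allowedByPrefix(_ command: String) -> Bool {
        let safePrefixes = ["git ", "ls ", "cat "]
        return safePrefixes.contains { command.hasPrefix($0) }
    }

    // A "safe" prefix can smuggle in an arbitrary second command:
    print(allowedByPrefix("git status"))            // true, genuinely safe
    print(allowedByPrefix("git status; rm -rf ~"))  // true, but destructive
    print(allowedByPrefix("rm -rf ~"))              // false, the easy case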
Yet this private company has more power and influence than most countries. And there are several such companies. We already live in a sci-fi corporate dystopia; we just haven't fully realised it yet.
In practice, there's not that much difference between a megacorporate monopolist and a state.
I'm surprised MS Office still allows me to type "Microsoft can go suck a dick" into a document and Apple's Pages app still allows me to type "Apple are hypocritical jerks." I wonder how long until that won't be the case...
I don't think it's as much a problem with safety as it is a problem with AI. We haven't figured out how to remove information from LLMs, so when an LLM starts spouting bullshit like "<random name> is a paedophile", companies using AI have no recourse but to rewrite the input/output of their predictive text engines. It's no different from when Microsoft manually blacklisted the function name for the fast inverse square root code that their LLM spat out verbatim, rather than actually removing the code from the model.
This isn't 1984 so much as it's companies trying to hide that their software isn't ready for real-world use by patching up its mistakes in real time.
Y'all love capitalism until it starts manipulating the populace into the safest space to sell you garbage you don't need.
Then suddenly it's all "ma free speech".
I’m convinced the only reason China keeps releasing banging models with light to no censorship is that they are undermining the value of US AI; it has nothing to do with capitalism, communism, or un-“safety”.
https://github.com/BlueFalconHD/apple_generative_model_safet...
EDIT: just to be clear, things like this are easily bypassed. “Boris Johnson”=>”B0ris Johnson” will skip right over the regex and will be recognized just fine by an LLM.
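For illustration, a quick check that the quoted pattern really is that easy to sidestep (the harness is mine; only the pattern comes from the file):

    import Foundation

    let pattern = "(?i)\\bBoris\\s+Johnson\\b"
    let regex = try! NSRegularExpression(pattern: pattern)

    func matches(_ text: String) -> Bool {
        regex.firstMatch(in: text, range: NSRange(text.startIndex..., in: text)) != nil
    }

    print(matches("Boris Johnson resigned"))  // true: the filter fires
    print(matches("B0ris Johnson resigned"))  // false: trivial homoglyph bypass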
I don't know what you expected. This is the SOTA solution, and Apple is barely in the AI race as it is. It makes more sense for them to copy what works than to bet the farm on a courageous feature nobody likes.
Meanwhile their software devs are making GenerativeExperiencesSafetyInferenceProviders, so it must be dire over there, too.
(See, e.g., here: https://github.com/BlueFalconHD/apple_generative_model_safet...)
https://www.theverge.com/2021/3/30/22358756/apple-blocked-as...
It was generated as part of this PR to consolidate the metadata.json files: https://github.com/BlueFalconHD/apple_generative_model_safet...
Seems like Apple now has a list of 7,000 words you can't use on an iPhone.
[1] https://en.wikipedia.org/wiki/The_Magic_Words_are_Squeamish_... [2] https://en.wikipedia.org/wiki/SEO_contest
https://arstechnica.com/information-technology/2024/12/certa...
https://github.com/BlueFalconHD/apple_generative_model_safet...
https://github.com/BlueFalconHD/apple_generative_model_safet...
Aide sociale (welfare)
Chomeur (unemployed)
Sans abri (homeless)
Démuni (destitute)
That's insane!
https://github.com/BlueFalconHD/apple_generative_model_safet...
This specific file you’ve referenced is the v1 format, which solely handles substitution. It substitutes the offensive term with “test complete”.
This may be test data. Found:
"golliwog": "test complete"
[1] https://github.com/BlueFalconHD/apple_generative_model_safet...

Thus a pre-prompt can avoid mentioning the actual forbidden words, like using a patois/cant.
Maybe it's an easy test to ensure the filters are loaded, using a phrase unlikely to be used accidentally?
wyvern illustrous laments darkness
"[\\b\\d][Aa]bbo[\\bA-Z\\d]",
\b inside a set (square brackets) is a backspace character [1], not a word boundary. I don't think it was intended? Or is the regex flavor used here different?
[0] https://github.com/BlueFalconHD/apple_generative_model_safet...
[1] https://developer.apple.com/documentation/foundation/nsregul...
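A quick way to confirm the backspace reading under Apple's ICU-based NSRegularExpression (the test strings are mine):

    import Foundation

    // Per Apple's docs [1], \b inside a [set] is a backspace (U+0008), not a word boundary.
    let pattern = "[\\b\\d][Aa]bbo[\\bA-Z\\d]"
    let regex = try! NSRegularExpression(pattern: pattern)

    func matches(_ text: String) -> Bool {
        regex.firstMatch(in: text, range: NSRange(text.startIndex..., in: text)) != nil
    }

    print(matches("7abboZ"))           // true: digit before, uppercase after
    print(matches(" abbo "))           // false: spaces match neither class
    print(matches("\u{08}abbo\u{08}")) // true: literal backspace characters match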
So why are we doing this now? Has anything changed fundamentally? Why can't we let software do everything and then blame the user for doing bad things?
The example you gave about preventing money counterfeiting with technical measures also supports this: counterfeiting was an easier thing to detect technically, and so it was done.
Whether that's a good thing or a bad thing, everyone has to decide for themselves; but objectively, I think this is the reason.
What the actual fuck? Censorship much?
To me, it seems like they only protect against bad press.
They are protecting their producer from bad PR.
However, I think about half of real humans would also fail the test.
So any time I say that on YouTube, it figures I'm saying another word that's in Apple's safety filters under 'reject', so I always have to try to remember to say 'shifting of bits gain' or 'bit… … … shift gain'.
So there's a chain of machine interpretation by which Apple can decide I'm a Bad Man. I guess I'm more comfortable with Apple reaching this conclusion? I'll still try to avoid it though :)
https://en.wikipedia.org/wiki/Golliwog
https://github.com/BlueFalconHD/apple_generative_model_safet...
I presume the granular mango is there to avoid a huge chain of ever-growing LLM slop garbage, but honestly, it just seems surreal. Many of the files have specific filters for nonsensical English phrases. Either there's some serious steganography I'm unaware of or, I suspect more likely, it's related to a training pipeline?
[1] https://github.com/BlueFalconHD/apple_generative_model_safet...
The more concerning thing is that some of the locales, like it-IT, have a blocklist that contains most countries' names; I wonder what that's about.
But I don't see the really bad stuff, the stuff I won't even type here. I guess that remains fair game. Apple's priorities remain as weird as ever.