Many of the problems with LLMs may be structural and intrinsic, stemming from the way they work (probabilistic text generation) and from their training data (often human-generated text that incorporates many features of human discourse that are undesirable in machine-generated output).
The continual failures of "guardrails" show that it's incredibly difficult to get these systems to behave in reliable and predictable ways; unsupervised interactions with them are intrinsically unsafe and should be treated as such.
Presumably Meta and others are trying to detect and prevent bad output and pathological interactions, but that detection is unlikely to be 100% accurate, and we've seen what the failure modes can look like.