Using them for an RPG campaign could work if the bar is low and it's the first couple of times you use it. But after a while, you start to identify repeated patterns and guard rails.
The weights of the models are static. It's always predicting what the best association is between the input prompt and whatever tokens its spitting out with some minor variance due to the probabilistic nature. Humans can reflect on what they've done previously and then deliberately de-emphasize an old concept because its stale, but LLMs aren't able to. The LLM is going to give you a bog standard Gemini/ChatGPT output, which, for a creative task, is a serious defect.
Personally, I've spent a lot of time testing the capabilities of LLMs for RP and storytelling, and have concluded I'd rather have a mediocre human than the best LLMs available today.