The application is constrained, however: without on-prem deployment, you can't show the model your proprietary data to learn from.
As for your application w/ GPT-3, I'd be a little afraid that it might be replaying real data records it saw during training, at least some of the time, rather than generating truly synthetic data.
(Though one of the most interesting ChatGPT blog posts I saw was one where they had gotten it to write an essay with what looked like real citations to scientific papers that were completely fabricated.)
You also make a good point about replicating real data - I think this is another area where more tools need to be developed to safeguard against consequences. In addition to the obvious challenges LLMs pose to plagiarism-detection software, there are definitely opportunities here to develop privacy-detection tools should an org want to use GPT-3 or ChatGPT for synthetic data.
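To make that concrete, here's a minimal sketch of what such a privacy check might look like: flagging "synthetic" records that exactly match, or nearly match, records in the real dataset. The function name, data, and similarity threshold are all illustrative assumptions, not anything from an existing tool.

```python
# Hypothetical memorization check: flag generated records that replay
# (or nearly replay) records from the real source data.
from difflib import SequenceMatcher

def flag_replayed_records(synthetic, real, threshold=0.9):
    """Return (record, similarity) pairs for synthetic records whose
    similarity to any real record meets the threshold (1.0 = exact)."""
    real_set = set(real)  # fast path for exact duplicates
    flagged = []
    for record in synthetic:
        if record in real_set:
            flagged.append((record, 1.0))
            continue
        # Fuzzy comparison catches near-verbatim replays with small edits.
        best = max(
            (SequenceMatcher(None, record, r).ratio() for r in real),
            default=0.0,
        )
        if best >= threshold:
            flagged.append((record, round(best, 2)))
    return flagged

real = [
    "Alice Smith, 1985-03-02, alice@example.com",
    "Bob Jones, 1990-07-15, bob@example.com",
]
synthetic = [
    "Alice Smith, 1985-03-02, alice@example.com",  # exact replay of a real record
    "Carol White, 1978-11-30, carol@example.com",  # genuinely novel record
]

print(flag_replayed_records(synthetic, real))
```

A real tool would need much more than string matching (semantic similarity, column-level re-identification risk, and so on), but even a crude filter like this would catch the worst case of the model emitting a training record verbatim.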
Recently I was reading a book by Peter Diamandis and Steven Kotler about converging technologies that referenced the Luddites' reaction to the invention of the loom: they feared it would destroy jobs, when ultimately the innovation led to more economic opportunity. I think we'll find the fears people are having about LLMs are similar, with opportunities like these to develop more digital infrastructure around their application.