It is dangerous, which is part of the reason we haven't productized it further. One idea we had for productizing the capability further was to leverage edge / lambda functions to compartmentalize the generated code. (Plus, it becomes a general extensibility point for folks who are not using semantic code generation and simply want to write their own code.)
The idea of auditing the strategy is interesting. The flow we have used for the semantic chunkers to date has been along these lines, where we:
1) Use the utility to generate the code snippets (and do some manual inspection)
2) Test the code snippets against some sample text
3) Validate the results
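
To make that flow concrete, here is a rough Python sketch of the three steps. It is not our actual utility: `GENERATED_SNIPPET`, `audit`, `load_chunker`, `validate`, and the `chunk(text)` entry point are all hypothetical names, and the banned-call list is only an assumption about what a reviewer would want flagged. Note that `exec()` is not a sandbox, so the audit is an aid to manual inspection, not a replacement for it.

```python
import ast

# Hypothetical stand-in for a snippet produced by the code-generation utility.
GENERATED_SNIPPET = r'''
def chunk(text):
    # Split on blank lines and drop empty pieces.
    return [part.strip() for part in text.split("\n\n") if part.strip()]
'''

# Assumed list of builtins worth flagging for a human reviewer.
BANNED_CALLS = {"eval", "exec", "open", "compile", "__import__"}


def audit(source):
    """Step 1 aid: list imports and banned builtin calls found in the snippet."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            findings.append(f"line {node.lineno}: import statement")
        elif (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in BANNED_CALLS
        ):
            findings.append(f"line {node.lineno}: call to {node.func.id}()")
    return findings


def load_chunker(source):
    """Step 2: execute the snippet and return its chunk() function.

    exec() gives no isolation; only run snippets that passed review.
    """
    namespace = {}
    exec(source, namespace)
    return namespace["chunk"]


def validate(chunks, sample):
    """Step 3: basic sanity checks on the chunks; returns a list of problems."""
    problems = []
    if not chunks:
        problems.append("no chunks produced")
    for c in chunks:
        if not isinstance(c, str) or not c.strip():
            problems.append("empty or non-string chunk")
        elif c not in sample:
            problems.append("chunk text not found in sample")
    return problems


if __name__ == "__main__":
    sample = "First paragraph.\n\nSecond paragraph.\n"
    if audit(GENERATED_SNIPPET):
        raise SystemExit("snippet needs manual review before running")
    chunks = load_chunker(GENERATED_SNIPPET)(sample)
    print(chunks, validate(chunks, sample))
```

The audit only catches direct calls by name and import statements, so something like an attribute call on an already-imported module would slip through. That is why the compartmentalized execution mentioned above would still matter.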