Ask HN: How do you add guard rails in LLM response without breaking streaming?
Hi all,
I am trying to build a simple LLM bot and want to add guard rails so that the LLM's responses are constrained.
I tried adjusting the system prompt, but the responses do not always honour its instructions.
I can manually validate the full response, but that breaks streaming and makes the bot visibly slower to respond.
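To make the problem concrete, here is a minimal sketch of what I mean (the stream and the check are stand-ins, not real API calls): validating the whole response means buffering every token before sending anything, so the user sees nothing until generation finishes.

```python
def token_stream():
    # stand-in for an LLM streaming API yielding tokens
    for tok in ["Hello", " ", "world", "!"]:
        yield tok

def passes_guard_rails(text):
    # stand-in guard-rail check, e.g. a banned-word filter
    return "forbidden" not in text

def buffered_reply():
    # this is the part that breaks streaming: collect the
    # entire response, validate it, only then return anything
    full = "".join(token_stream())
    return full if passes_guard_rails(full) else "[blocked]"

print(buffered_reply())
```

With this approach the time-to-first-token becomes the time for the whole generation, which is the slowdown I am seeing.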
How are people handling this situation?