1. Pair every learned experience with a real memory system. If it takes a small model 20 attempts to get a framework right, every result gets stored along with the SOLUTION, so the next session never starts over: it gets it right immediately from the recorded successes. Over the long term that makes small models just as fast as larger ones on repetitive tasks. Isn't repetitive work exactly what we're all trying to replace?
2. Related to 1, but needed: is there any way to add a memory plugin module that can use API, CLI, or (as a last resort) MCP endpoints? Every small model would then have the framework built in and could connect to any current or future memory system. The model would be forced to read and write all of its failures and successes properly, so its accuracy and speed improve over time, backed by a memory system and a task-success tracking system that it must always update and call from.
3. Critical addition: my idea is for this harness to leverage small local models to 100% take over and REPLACE repetitive large-model tasks. Here is how:
- Any task the small model cannot do, it records to the brain/memory. Next, the harness spins up a large LLM to accomplish the exact same task, BUT that large LLM is forced to write into memory every step the small model got wrong on that task, along with the corrections.
- Now the small model takes over and directly references that exact method, planned and proven by the large language model. It should then get repeatable, 100% successful results, forever replacing the need for a large language model on that task.
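
To make the loop in point 3 concrete, here is a minimal sketch of how it might look. Everything in it is an assumption for illustration, not part of the harness: `MemoryBackend`, `InMemoryBackend`, `run_task`, and the two stub models are hypothetical names, and a real backend would sit behind an API, CLI, or MCP endpoint instead of a Python dict.

```python
from typing import Callable, Optional, Protocol


class MemoryBackend(Protocol):
    """Hypothetical plugin interface; a real one could wrap an API, CLI, or MCP server."""

    def read_blueprint(self, task: str) -> Optional[list[str]]: ...
    def write_blueprint(self, task: str, steps: list[str]) -> None: ...
    def write_failure(self, task: str, detail: str) -> None: ...
    def read_failures(self, task: str) -> list[str]: ...


class InMemoryBackend:
    """Dict-based stand-in for a real memory system (illustration only)."""

    def __init__(self) -> None:
        self.blueprints: dict[str, list[str]] = {}
        self.failures: dict[str, list[str]] = {}

    def read_blueprint(self, task: str) -> Optional[list[str]]:
        return self.blueprints.get(task)

    def write_blueprint(self, task: str, steps: list[str]) -> None:
        self.blueprints[task] = steps

    def write_failure(self, task: str, detail: str) -> None:
        self.failures.setdefault(task, []).append(detail)

    def read_failures(self, task: str) -> list[str]:
        return self.failures.get(task, [])


# Assumed model signatures: the small model returns None when it fails;
# the large model returns (blueprint steps, result).
SmallModel = Callable[[str, Optional[list[str]]], Optional[str]]
LargeModel = Callable[[str, list[str]], tuple[list[str], str]]


def run_task(task: str, memory: MemoryBackend,
             small: SmallModel, large: LargeModel) -> tuple[str, str]:
    """Try the small model first; escalate once and store the blueprint on failure."""
    result = small(task, memory.read_blueprint(task))
    if result is not None:
        return result, "small"
    # The small model failed: record the failure, then let the large model
    # solve the task and write the corrected steps back into memory.
    memory.write_failure(task, "small model could not complete task")
    steps, result = large(task, memory.read_failures(task))
    memory.write_blueprint(task, steps)
    return result, "large"


def stub_small(task: str, blueprint: Optional[list[str]]) -> Optional[str]:
    # Stub: pretend the small model only succeeds when a blueprint exists.
    if blueprint is None:
        return None
    return f"done:{task} via {len(blueprint)} steps"


def stub_large(task: str, failures: list[str]) -> tuple[list[str], str]:
    # Stub: pretend the large model always succeeds and documents two steps.
    return ["step A", "step B"], f"done:{task}"
```

With these stubs, the first call for a task is routed to the large model and the second call is handled by the small model from the stored blueprint. Routing on "did the small model return a result" is the simplest possible check; a real harness would need an actual success test for each task.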
What I'm asking is whether I/you/we could work on expanding this so that large language models only step in to fix and BLUEPRINT tasks, and the small models take over from there.
This would radically change what the small models need to be capable of, since everything is just following the exact breadcrumbs laid out once by a stronger model; after that, the strong model is never needed for that task again.
Any business would very soon have its memory system full of all the tasks that business needs, and could then be run entirely on local models.
They would only need to pay once, for the initial TRAINING period with large models to fill the brain; then the small model takes over.
Interested in your thoughts on my ideas.
If you're interested in how this could be implemented and connected to memory systems, I have one of the highest-benchmarked memory systems, a hybrid that should be easy to connect to your harness.
Would be super happy to work with you on this if it's of interest.