Answer the following questions as best you can. You have access to the following tools:
Search: Use this to search the internet.
Calculator: Use this to do math.
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Question: What is the age of the president of Egypt squared?
Thought:
To which the LLM will generate a completion like: Thought: I need to find the age of the president of Egypt.
Action: Search
Action Input: Age of president of Egypt
Observation:
At which point, the code (LangChain, Haystack, etc.) will parse out the requested tool (Search) and its input (Age of president of Egypt), call the right tool or API, and then append the output of that action to the prompt. This all happens in a loop: at each step, the LLM is given the entire past prompt history and the opportunity to do a completion choosing the next tool and its input, after which the code parses those out, executes the tool, and repeats until the LLM decides it has the final answer and returns it.
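That loop can be sketched in a few lines of Python. Everything here is illustrative: `llm` stands in for any completion function (in practice you'd pass "Observation:" as a stop sequence), and the tools and scripted replies are canned stand-ins, not real APIs:

```python
import re

def run_agent(llm, tools, question, max_steps=10):
    """Minimal ReAct-style loop: the LLM picks a tool, we run it,
    append the observation, and repeat until a Final Answer appears."""
    prompt = f"Question: {question}\nThought:"
    for _ in range(max_steps):
        completion = llm(prompt)
        prompt += completion
        final = re.search(r"Final Answer:(.*)", completion, re.DOTALL)
        if final:
            return final.group(1).strip()
        action = re.search(r"Action: (.*)\nAction Input: (.*)", completion)
        if action:
            tool, tool_input = action.group(1).strip(), action.group(2).strip()
            observation = tools[tool](tool_input)
            prompt += f"\nObservation: {observation}\nThought:"
    return None

# Toy run with canned tools and a scripted "LLM" standing in for a real model:
tools = {"Search": lambda q: "69 years", "Calculator": lambda e: "4761"}
script = iter([
    " I need the president's age.\nAction: Search\nAction Input: age of president of Egypt",
    " Now square it.\nAction: Calculator\nAction Input: 69^2",
    " I now know the final answer.\nFinal Answer: 4761",
])
print(run_agent(lambda p: next(script), tools, "Age of president of Egypt squared?"))  # → 4761
```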
Introducing the mechanism of internal thought is very interesting, I wonder if there's a way to make it implicit in the model's architecture.
But, I think the difference is that WebGPT was actually fine-tuned / retrained for its specific use case, while the agents in these libraries just use the generic model without fine-tuning. My guess (and I'm not an expert here) is that fine-tuning these models for specific agent use cases would probably result in better outcomes... Though as the models get more powerful, they might just perform well enough out of the box. (Also, some of the most recent OpenAI models don't support fine-tuning, and even for the ones that do, you'd need to generate the data to fine-tune on.)
I think this could be partially solved by intelligently summarising parts of the prompt history, while storing the original in some vector db, so the relevant parts can be retrieved at will.
Quite fun.
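A toy version of that idea: keep only a running summary in the prompt, but store each full exchange in a "vector store" so the original text can be retrieved when a later step needs the detail. Here a bag-of-words cosine stands in for a real embedding model, and a plain list stands in for the vector DB:

```python
from collections import Counter

store = []  # list of (bag_of_words, original_text) — a stand-in for a vector DB

def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = lambda v: sum(x * x for x in v.values()) ** 0.5
    return dot / (norm(a) * norm(b) or 1)

def remember(text):
    """Store the full, unsummarized text for later retrieval."""
    store.append((embed(text), text))

def recall(query, k=1):
    """Return the k stored texts most similar to the query."""
    q = embed(query)
    return [t for _, t in sorted(store, key=lambda e: -cosine(e[0], q))[:k]]

remember("Observation: the president of Egypt is Abdel Fattah el-Sisi, age 69")
remember("Observation: 69 squared is 4761")
print(recall("how old is the president?"))
```

A real setup would swap `embed` for an embedding API and `store` for something like FAISS, but the retrieve-original-on-demand shape is the same.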
Or maybe before Final Answer, you could do: Double Check: I think I have the final answer, but does it look right? If yes, go to Final Answer; if no, go back up the loop.
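Sketched as a hypothetical wrapper around any answer-producing agent (the prompt wording and the `llm`/`answer_fn` callables are made up for illustration):

```python
def with_double_check(llm, answer_fn, question, max_retries=3):
    """Ask the model to verify its own candidate answer; retry on 'no'."""
    for _ in range(max_retries):
        answer = answer_fn(question)
        verdict = llm(f"Question: {question}\nProposed answer: {answer}\n"
                      "Does this look right? Reply yes or no.")
        if verdict.strip().lower().startswith("yes"):
            return answer
    return answer  # give up and return the last attempt

# Toy run: scripted verdicts and answers stand in for a real model.
verdicts = iter(["no", "yes"])
answers = iter(["4671", "4761"])
print(with_double_check(lambda p: next(verdicts),
                        lambda q: next(answers), "69 squared?"))  # → 4761
```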
> GPT-4 can self-correct and improve itself. With exclusive discussions with the lead author of the Reflexion paper, we show how significant this will be across a variety of tasks, and how you can benefit.
If I said, "the moon is made of cheese. What type of cheese do you think it is?" most humans would automatically object, but with LLMs you can usually craft a prompt that would get it to answer such a silly question.
I am pretty sympathetic to the field of "AI Safety" and I worry a lot about the implications of agent-like general intelligences. This post gives me a lot to ponder.
What are the implications of the fact that even AIs that are not agent-like at heart can apparently be told "please simulate an agent-like AI"? I really don't know. Should we consider it as an "inner AI" with its own objectives? How can we determine what those objectives are?
Instinctively it feels much less concerning than an AI with a direct action->perception feedback loop but who knows. AI is fucking weird.
What a thrilling time to be alive!
It would be interesting if you could somehow combine a wallet plugin to give the GPT something it can work to increase. That + that AutoGPT repo from a day ago + maybe Mitogen for self-replication might be a cool combo to really kick off a fun "dumb" AI on the loose
Interesting to see if an AI agent given the objective to generate profit would resort to running scams. Encoding moral boundaries becomes important. What if you asked it to raise money for a good cause and it used "whatever means necessary"?
And now we have the beginnings of a plot point from Accelerando. Self-propelled AI businesses as a distinct entity.
What I am having trouble understanding is how I have the "agent" (a GPT-like LLM) maintain its own "state" (balances, accounts receivable, accounts payable, etc.). Would I just run a webservice and then have a ChatGPT Plugin access the "agent state" via a webhook? Is it as simple as that? Or would it just be like a recursive function, where you continue to pass in your state with every "iteration"?
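The second option — threading a state dict through every iteration — might look like the sketch below. The prompt wording is made up, and a canned reply stands in for the model call; a real version would need the defensive parsing and retries discussed elsewhere in this thread:

```python
import json

def step(llm, state, event):
    """One stateless iteration: serialize state into the prompt,
    parse the updated state back out of the completion."""
    prompt = (
        "You are a bookkeeping agent. Current state:\n"
        f"{json.dumps(state)}\n"
        f"Event: {event}\n"
        "Reply with the updated state as JSON only."
    )
    return json.loads(llm(prompt))  # defensive parsing/retries omitted

state = {"balance": 100, "accounts_receivable": 0}
# A canned "LLM" reply stands in for a real model call:
fake_llm = lambda p: '{"balance": 100, "accounts_receivable": 50}'
state = step(fake_llm, state, "Invoice customer A for $50")
print(state["accounts_receivable"])  # → 50
```

The webservice/webhook approach is the same loop with the state dict living behind an API instead of in a local variable.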
I started working on a design last night and I think this is a really cool idea. All the parts are basically "there"
EDIT: I asked ChatGPT-4 what it thinks. It seems to like the idea... *thinking emoji* perhaps it seeks a human acolyte to spread its gospel...
Arthur Collé: I want to combine the concept of Bitcoin agents (autonomous programs) with GPT-4 and mitogen, to be able to distribute complex AI agents throughout the internet
ChatGPT-4: Combining Bitcoin agents, GPT-4, and Mitogen to create and distribute complex AI agents throughout the internet is an ambitious and innovative idea. Here's a high-level outline to guide you through this process:
Define the purpose and functionality of the AI agents:
Before proceeding, clearly define the purpose of these AI agents. Will they serve as decentralized oracles, provide AI-driven services, or facilitate complex transactions on the blockchain? Understanding their purpose will help you design and develop the agents more efficiently.
Develop the AI agents using GPT-4:
Design and train your AI agents using GPT-4 to achieve the desired functionality. Depending on the complexity of the agents, you might need to fine-tune the GPT-4 model to tailor it to your specific use case. Be mindful of the computational resources required to run the AI agents, as this will impact their performance in a decentralized environment.
Integrate Bitcoin agents:
Incorporate the concept of Bitcoin agents, or autonomous programs that interact with the Bitcoin blockchain, into your AI agents. This will enable your agents to autonomously perform tasks such as sending and receiving payments, signing transactions, and updating their internal state based on blockchain data. You can use existing libraries and tools for this purpose, such as bitcoinj or BTCPay Server.
Utilize Mitogen for distribution and communication:
Mitogen is a Python library for distributed computing (best known as an Ansible extension) that enables communication between remote Python interpreters over SSH or other transports. Use Mitogen to facilitate communication between your AI agents and distribute them across various nodes on the internet. Ensure that the AI agents can communicate effectively and perform their tasks even in a decentralized setting.
Implement security measures:
Securing your AI agents is crucial, as they will be handling sensitive information and potentially financial transactions. Implement strong encryption, authentication, and access controls to protect the integrity and confidentiality of your agents.
Testing and deployment:
Thoroughly test your AI agents in a controlled environment to ensure they perform as expected. Once you are satisfied with their functionality, deploy them throughout the internet using the Mitogen framework.
Monitor and maintain:
Continuously monitor the performance and security of your AI agents. Periodically update the GPT-4 models to maintain their accuracy and relevance. Be prepared to address any issues or vulnerabilities that may arise.
Keep in mind that this is a complex project with many potential challenges, including security, scalability, and regulatory compliance. Additionally, the ethical implications of creating and distributing autonomous AI agents should be carefully considered. Nonetheless, if executed properly, this concept has the potential to unlock innovative use cases and services in the realm of AI and blockchain technology.

Also, it seems like the functionality of haystack subsumes that of langchain and llama-index (fka GPT-index)?
- Rerun the prompt until you get a format that is consistent
- Steer the output token selection towards a predefined prompt
For the latter, I've built a proof of concept that takes in a JSON schema with a huggingface transformer and constrains token selection by modifying the output probabilities such that only schematically valid tokens can be emitted, see "Structural Alignment: Modifying Transformers (like GPT) to Follow a JSON Schema" @ https://github.com/newhouseb/clownfish. Unfortunately, given OpenAI's current API this is only possible on locally run models. That is... at any level of cost effectiveness. It's technically possible but (worst case) quadratically expensive against their current APIs.
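The core of that trick, stripped of the transformer: at each decoding step, mask the next-token distribution so only grammar-valid tokens survive, then renormalize. In a real HuggingFace `LogitsProcessor` you'd set disallowed logits to -inf before the softmax; this toy version does the equivalent on a dict of logits:

```python
import math

def constrain(logits, allowed):
    """Zero out tokens the schema/grammar forbids, renormalize the rest."""
    masked = {t: l for t, l in logits.items() if t in allowed}
    z = sum(math.exp(l) for l in masked.values())
    return {t: math.exp(l) / z for t, l in masked.items()}

# Toy step: the raw model prefers "hello", but after "{" a JSON grammar
# only permits an opening quote or a closing brace.
logits = {'"': 1.0, "}": 0.5, "hello": 3.0}
probs = constrain(logits, allowed={'"', "}"})
print(max(probs, key=probs.get))  # → "
```

The hard part clownfish actually solves is computing the `allowed` set from a JSON schema at every step; the masking itself is this simple.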
You ofc can't guarantee the output, so defensive programming and retries are a must in my opinion. We are all learning how to work with this technology.
You probably don't want to treat them as deterministic (at least, not now). But there are many applications where slightly non-deterministic behavior is OK.
>Agent: (since 1.15) An Agent is a component that is powered by an LLM, such as GPT-3. It can decide on the next best course of action so as to get to the result of a query. It uses the Tools available to it to achieve this. While a pipeline has a clear start and end, an Agent is able to decide whether the query has resolved or not. It may also make use of a Pipeline as a Tool.
Emphasis mine. Having tinkered with LangChain I think the idea of a recursive and/or graph-oriented model is going to yield interesting phenomena in the overall feel of these language models. LangChain agents are already super impressive.
Is there any sort of marketplace/AppStore for agents/tools/plug-ins for LLMs via LangChain? Or some other library like haystack?
Longer term planning, memory, etc?