I am working on an AutoGPT/HustleGPT-style project that can control actions on my computer using the new ChatGPT Plugin framework. I call it CEOGPT; its job is to make business decisions and act on them. I am still developing it, but will try to open source it if there is enough interest from the community.
The system is based on a simple event loop. First a complete prompt is assembled by combining the following details:
- Initial Prompt - Preps ChatGPT on how to proceed. This is immutable.
- Datetime - Current time. Updates after every action by my API server.
- Largegoal - Overall large goal. ChatGPT updates this.
- Company Status - The status of the company. ChatGPT updates this.
- Recent Context - Recent actions taken by CEOGPT. ChatGPT updates this.
- Details - A file that stores the output of the last action performed. This is updated by the action itself.
- Available Actions - A list of actions that ChatGPT can execute. This is immutable. You can specify arguments/parameters here. ChatGPT will update a "Next Action" and "Action Input". These are important for my parsing engine.
- Narrow Goal - The current goal for the next action. This can read like an internal dialogue. ChatGPT updates this.
- Thinking - A more creative field, updated by ChatGPT, that provides ideas relevant to the current context. This will probably become a "Comment by Experienced CEO" that ChatGPT generates.
- Ending Prompt - This will simply contain a question asking ChatGPT which action and action input to use from the available actions list. This is immutable.
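To make the assembly step concrete, here is a minimal sketch of how the sections listed above could be joined into one prompt. It is not the actual CEOGPT code: the function names `assemble_prompt` and `apply_update`, the `state` dictionary, and the "Label:" layout are all assumptions for illustration. Only the section names come from the list.

```python
from datetime import datetime

# Order in which the sections are joined; labels follow the list above.
SECTION_ORDER = [
    "Initial Prompt", "Datetime", "Largegoal", "Company Status",
    "Recent Context", "Details", "Available Actions", "Narrow Goal",
    "Thinking", "Ending Prompt",
]

# Sections ChatGPT may overwrite through the API (assumed from the list above).
CHATGPT_EDITABLE = {
    "Largegoal", "Company Status", "Recent Context", "Narrow Goal", "Thinking",
}


def apply_update(state, update):
    """Return a copy of state with only the ChatGPT-editable fields overwritten."""
    new_state = dict(state)
    for key, value in update.items():
        if key in CHATGPT_EDITABLE:
            new_state[key] = value
    return new_state


def assemble_prompt(state, now=None):
    """Join all sections into one prompt string, refreshing the Datetime field."""
    now = now or datetime.now()
    sections = dict(state, Datetime=now.strftime("%Y-%m-%d %H:%M:%S"))
    parts = [f"{name}:\n{sections.get(name, '')}" for name in SECTION_ORDER]
    return "\n\n".join(parts)
```

Keeping the immutable sections out of `CHATGPT_EDITABLE` means a stray update cannot rewrite the initial prompt, ending prompt, or action list.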
After assembling this prompt, my program simply copies it to the clipboard and pastes it into the ChatGPT plugin chat interface. ChatGPT then hits my API endpoint to update the dynamic fields mentioned above, as well as a "Next Action" and an "Action Input". The "Next Action" runs through a simple NLP classifier module I have created, which maps it to a command. The descriptions that train the NLP neural net are defined in a JSON object. Once the command is determined, it executes on my computer with the action input as its argument. Some examples of these commands are as follows (still developing these):
- Perform Research on X (Does a Google search, collects data, summarizes it)
- Respond to Emails
- Make announcement on Twitter/Discord/Reddit etc.
- Analyze Discord Community and Summarize
These commands are programs that can be of any level of complexity, and they can make further creative use of the API. But they all have to update the details.txt file (which goes into the assembly of the master prompt).
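As a rough illustration of the step from "Next Action" to a command that writes details.txt, here is a small sketch. Note that it swaps the NLP neural net for a plain word-overlap (Jaccard) score so the example stays self-contained. The `ACTION_DESCRIPTIONS` entries, the handler functions, and the names `classify` and `run_action` are hypothetical.

```python
import re
from pathlib import Path

# Hypothetical stand-in for the JSON object of training descriptions.
ACTION_DESCRIPTIONS = {
    "research": ["perform research on a topic", "search google and summarize"],
    "email": ["respond to emails", "reply to inbox messages"],
    "announce": ["make announcement on twitter discord reddit", "post an update"],
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def classify(next_action):
    """Return the command whose descriptions best overlap next_action, or None."""
    words = tokenize(next_action)
    best_command, best_score = None, 0.0
    for command, descriptions in ACTION_DESCRIPTIONS.items():
        for description in descriptions:
            desc_words = tokenize(description)
            union = words | desc_words
            score = len(words & desc_words) / len(union) if union else 0.0
            if score > best_score:
                best_command, best_score = command, score
    return best_command


def run_action(next_action, action_input, handlers, details_path):
    """Classify the action, run its handler, and write the output to the details file."""
    command = classify(next_action)
    if command is None or command not in handlers:
        output = f"No matching action for: {next_action}"
    else:
        output = handlers[command](action_input)
    Path(details_path).write_text(output, encoding="utf-8")
    return command
```

Writing a message to the details file even when nothing matches means the next prompt tells ChatGPT that its chosen action was not understood, instead of silently repeating stale output.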
Developing this project to increase the capability of CEOGPT is a small ordeal. It involves creating an individual script for each action (I am using xdotool as my primary macro program, but you can use any tool), and each script has to update the details.txt file afterwards. In addition, the example sentences for the NLP neural net need to be updated in a JSON file so that the next action description maps to the right command. Finally, you need to update the AvailableActions.txt file so ChatGPT knows it can pick that action.
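The bookkeeping part of adding an action (the JSON sentences and the AvailableActions.txt entry) could be wrapped in one helper, roughly like the sketch below. The function name `register_action`, its parameters, and the JSON layout (command name mapped to a list of sentences) are assumptions, not the project's actual format.

```python
import json
from pathlib import Path


def register_action(command, sentences, menu_line, json_path, actions_path):
    """Add training sentences for a command and list the action for ChatGPT."""
    # Load the existing sentence map, or start a new one (assumed layout).
    json_file = Path(json_path)
    if json_file.exists():
        data = json.loads(json_file.read_text(encoding="utf-8"))
    else:
        data = {}
    existing = data.setdefault(command, [])
    for sentence in sentences:
        if sentence not in existing:
            existing.append(sentence)
    json_file.write_text(json.dumps(data, indent=2), encoding="utf-8")

    # Append the action to the list ChatGPT sees, skipping duplicates.
    actions_file = Path(actions_path)
    if actions_file.exists():
        lines = actions_file.read_text(encoding="utf-8").splitlines()
    else:
        lines = []
    if menu_line not in lines:
        lines.append(menu_line)
    actions_file.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return data
```

A call such as `register_action("research", ["perform research on a topic"], "- Perform Research on X", "action_sentences.json", "AvailableActions.txt")` would cover both files in one step (the JSON file name here is made up). The action script itself still has to be written separately.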
Results are definitely workable, and the system is surprisingly easy to develop with. The tricky part is deciding which actions to implement and at what level (i.e. individual button clicks, or several clicks and API calls bundled into one high-level action). Still, this is a very easy framework for allowing ChatGPT to perform actions. I believe ChatGPT plugins have very high potential for this kind of automation.
Please let me know if you have any ideas or creative input. I may open source this codebase if I see enough interest. It is being developed on Ubuntu.