#For other nerds wanting some ways to
Trajectories = what chat calls sessions. Policy = trigger words
Action Space: Training Space Objectives (e.g. summarization, etc.)
State is like the Session, while Observation is like your Input.
The Reward System!
The RL Problem (RL is the training approach we are talking about in terms of chatbot/AI training)
Value Functions
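For anyone who wants to see the terms above side by side, here is a rough Python sketch, not how any real chatbot is trained. Everything in it is made up for illustration: the `Step` record, the `toy_policy` trigger-word rule, the `toy_reward` scoring, and the `gamma` discount value. It only shows how observations, actions, rewards, a trajectory, and a discounted return fit together. A value function would be an estimate of that return from a given state.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    observation: str  # what the agent sees (your input, in the chat analogy)
    action: str       # what the agent chose to do (e.g. "summarize")
    reward: float     # score given to that action


def toy_policy(observation: str) -> str:
    """Hypothetical trigger-word policy: maps an observation to an action."""
    if "summarize" in observation.lower():
        return "summarize"
    return "chat"


def toy_reward(observation: str, action: str) -> float:
    """Hypothetical reward: 1.0 only when a requested summary is produced."""
    asked = "summarize" in observation.lower()
    return 1.0 if asked and action == "summarize" else 0.0


def rollout(policy: Callable[[str], str], inputs: List[str]) -> List[Step]:
    """Run the policy over a session; the list of steps is the trajectory."""
    trajectory = []
    for obs in inputs:
        action = policy(obs)
        trajectory.append(Step(obs, action, toy_reward(obs, action)))
    return trajectory


def discounted_return(rewards: List[float], gamma: float = 0.99) -> float:
    """Sum of rewards where a reward k steps ahead is weighted by gamma**k."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


session = ["Please summarize this article", "hello there", "Summarize the thread"]
trajectory = rollout(toy_policy, session)
rewards = [step.reward for step in trajectory]
print([step.action for step in trajectory], discounted_return(rewards, gamma=0.5))
```

The return is computed backwards so that each earlier step picks up the discounted value of everything after it, which is the same recursion value functions are built on.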
Is there any way to bookmark threads?
I can send you the text in a DM if you like
How GPT might use NLP to search your input.
We’ve trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer followup questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. ChatGPT is a sibling model to InstructGPT, which is trained to follow an instruction in