@tropic thistle I'd love to collaborate with you. I'm also new to the space and trying to learn more.
Thoughts
- I think some interesting values
First Pass Ideas
- Perhaps we could have different AIs with different values operate in an adversarial way. For example, one AI could be optimized for values like clarity or truth, while another is hyper-optimized for curiosity. I'd expect the curious one to hallucinate more, but the other LLM could serve as a fact checker.
- What if we build some sort of abstract syntax language that an LLM can use to describe its "true" statements? We could then traverse that tree and check each truth, turning the problem of truth into a somewhat recursive one. This feels more deterministic to me, but I'm not sure how to structure the idea further.
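To make the syntax tree idea a bit more concrete, here is a rough sketch of what I have in mind. Everything in it is made up for illustration: the `Claim` node type, the four operators, and the `KNOWN_FACTS` table are assumptions, and the table is just a stand-in for whatever fact checker we would really use (possibly the second LLM from the first idea).

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Claim:
    """One node of a hypothetical claim tree."""
    op: str                      # "atom", "and", "or", or "not"
    text: str = ""               # only meaningful when op == "atom"
    children: List["Claim"] = field(default_factory=list)


def verify(claim: Claim, check_atom: Callable[[str], bool]) -> bool:
    """Recursively evaluate a claim tree, delegating leaf checks to check_atom."""
    if claim.op == "atom":
        return check_atom(claim.text)
    if claim.op == "not":
        return not verify(claim.children[0], check_atom)
    results = [verify(child, check_atom) for child in claim.children]
    if claim.op == "and":
        return all(results)
    if claim.op == "or":
        return any(results)
    raise ValueError(f"unknown op: {claim.op}")


# Toy fact table standing in for a real fact checker (an assumption).
KNOWN_FACTS = {"grass is green": True, "the sky is green": False}


def lookup(text: str) -> bool:
    # Statements we know nothing about are treated as unverified.
    return KNOWN_FACTS.get(text, False)


# "Grass is green AND NOT (the sky is green)"
statement = Claim("and", children=[
    Claim("atom", "grass is green"),
    Claim("not", children=[Claim("atom", "the sky is green")]),
])
print(verify(statement, lookup))
```

The recursion is the easy part. The open question is still how an LLM would reliably break a sentence down into atoms like these, and how we would check the atoms themselves.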
Questions
- How do we measure alignment? Alignment feels very subjective: one person could view an AI as aligned while someone else would not.
- How do we work with subjective truths? Are we thinking about truths such as "grass is green", or are we also considering truths that are subjective and differ from person to person?
- I would love to further explore some of the tests and metrics we use to work with AI
- The data we extract truth from can contain bias. I wonder if the syntax tree method I described above could give us a way to detect contradictions.
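On that last point, one very simple way contradiction detection could work is to collect the atomic statements from each source and flag any statement that one source asserts and another denies. This is only a sketch under assumptions: the `(source, statement, polarity)` format and the source names are hypothetical, and matching is by exact text after lowercasing, so it would miss paraphrases entirely.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (source, statement, polarity); polarity False means the source denies the statement.
Assertion = Tuple[str, str, bool]


def find_contradictions(
    assertions: Iterable[Assertion],
) -> Dict[str, Dict[bool, Set[str]]]:
    """Return statements that are both asserted and denied, with the sources on each side."""
    seen: Dict[str, Dict[bool, Set[str]]] = defaultdict(
        lambda: {True: set(), False: set()}
    )
    for source, statement, polarity in assertions:
        # Naive normalization: exact match after trimming and lowercasing.
        seen[statement.strip().lower()][polarity].add(source)
    return {
        statement: sides
        for statement, sides in seen.items()
        if sides[True] and sides[False]
    }


# Hypothetical sources, made up for this example.
assertions = [
    ("doc_a", "Grass is green", True),
    ("doc_b", "grass is green", False),
    ("doc_a", "Water is wet", True),
]
conflicts = find_contradictions(assertions)
print(conflicts)
```

Here `doc_a` and `doc_b` disagree about "grass is green", so that statement is the only one flagged. Finding a contradiction does not tell us which side is biased, only that the sources cannot both be right.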