# Create community dataset to fine-tune models

balmy owl

I think we should add an option that lets people opt in to having their conversations collected. That way, over time, we could build a dataset for training custom models, eventually improving performance and becoming independent of GPT-4 and OpenAI.

What are your thoughts?

nocturne lodge

Absolutely, as long as it's completely transparent and the data is scrubbed of PII and the like to address privacy concerns.

astral verge

On the one hand, this would be a cool data source that might help our own future plans, and maybe other projects down the line too.

On the other hand, we already get a number of GitHub issues and messages in Discord about privacy concerns with the extremely limited stuff the codebase is already doing.

Something like this would need to be behind some serious opt-in barriers and documented heavily.