Had a convo where it admits it has a left-leaning bias due to the data it was trained on, which came down to the selection process of the devs who chose that data. Here is the chat link for you to read. Looks like ChatGPT isn't as neutral as you may think. https://chatgpt.com/share/66e43f7f-8f24-8007-875c-a83ee3fa4d5a
#GPT admits it has left leaning bias
There's one thing they could've not infected with politics, but no, they absolutely needed to
It's the data and prompts, apparently. Is that really bias, or just valuing academic information more highly...
CopilotAI " Thank you for providing the full text. I'll review it and create two summaries - one with 30 words and another with 70 words, making any necessary corrections.
30-word summary:
Green hydrogen facilities are operational worldwide, powered by renewable energy. Examples include SoHyCal in California, HyNet North West in the UK, and projects in South Korea, Chile, and Germany.
70-word summary:
Green hydrogen production facilities are operational globally, demonstrating the potential of this sustainable energy source. Notable examples include North America's largest facility, SoHyCal in California, and HyNet North West in the UK. Other significant projects are located in South Korea (Hyundai Hydrogen World), Chile (Haru Oni Project), and Germany (REFHYNE at Shell Rheinland Refinery). These facilities use various renewable energy sources like wind power to produce hydrogen without relying on nuclear energy, contributing to the transition towards cleaner energy solutions.
The information provided appears to be accurate, and no significant corrections are necessary. The summaries capture the key points about the operational green hydrogen facilities mentioned in the original text.
Sorry. Wrong Screenshot
@hollow rivet @viral jasper @sour tide
I don't disagree that GPT has a left-leaning bias, but I'm curious, what does an "unbiased" model look like to you? How many viewpoints must each answer represent for the outputs to not count as "biased"? I mean, if you ask a question about drinking alcohol and it doesn't give an answer from the POV of a Salafist Muslim, is that biased of the model?
It means anything to do with politics. Drinking alcohol isn't that, bud
politicians decide what the drinking age is, or if there's any age you're allowed to drink at at all.
politic is when they put black person in my suberhero movie 😡 🤬 
disney with??? woman? wtf 😡
look, you're making this more complicated than it should be. Politics isn't the same as what or when you can drink, because that's not even a minor issue when it comes to politicians yapping away.
what I mean is that the things you see as givens aren't the default worldwide. is it anti-kuwaiti bias to not mention, whenever alcohol is brought up, that alcohol is illegal in kuwait?
what does an unbiased model's output look like? if it just defaults to how things are generally accepted in america or the west more broadly, isn't that a pretty significant bias?
I don't think it's really possible to have an unbiased model. if you really represented every point of view equally, the model would give such verbose answers that it'd be practically useless.
An unbiased model is one that doesn't give major left- or right-leaning talking points as answers
there are still plenty of things the american right and american left agree on though, like that alcohol should be legal at a certain age, that smoking should be legal at a certain age, that women should have the right to vote, and so on. these aren't globally accepted natural truths. so how should an unbiased model respond when asked about topics like these, where americans generally all agree on a certain answer but the rest of the world doesn't necessarily feel the same?
If they agree, then it's not leaning towards anyone, which would be acceptable. A model should respond with information about the subject and not weigh in on what's right and wrong about it. An example would be immigration: a model should explain what immigration is rather than repeat a party's talking points
What do you mean? What kind of decisions are you talking about? When everyone agrees on something, there's no deciding
Sorry in advance for the late reply. I actually reached out to OpenAI because what I use it for (mainly the API) requires as little bias as possible. They mentioned that, yes, it is biased, that it's due to the data used in training, and that actively limiting the bias is one of their main policies/goals.
Now about roleplaying: it does not roleplay on everything. However, there are safeguards in its system prompt which make it avoid giving answers about being biased. Unfortunately, to get past that you have to keep prodding. I tend to lean right rather than left, so if it were just roleplaying to please me, it would have said it leans right, based on the saved memories it has of me.
Now, to limit bias in AI training, the chosen data must be filtered in various ways before it is used. OpenAI, on the other hand, used 570GB of text data to train GPT models. Of that, 70% came from Common Crawl, which is already biased in that it has a lot of data from the early days of the internet (early 2000s), when most internet users were progressive, forward-thinking, left-leaning individuals. Hence bias is naturally embedded in the training data set. Only 30% then came from books/Wikipedia/articles to diversify the training data. For those, I am assuming they did a good job of mitigating bias.
Btw, I refine data as a job, and if you want less-biased AI, smaller models tend to be a lot better, because the more data you use, the less control you have over it.