#3.5 fine-tuning adventure
1 messages Β· Page 1 of 1 (latest)
GOAL: Perform a mass-influence campaign on reddit in a way where I can measure the result. Feel free to propose ideas on what the goal should be.
CRITERIA:
The results of the public influence campaign have to be measurable.
The tools used have to be something which someone with minimal technical knowledge (i.e. not a programmer, knows that Python exists but has never used it, that sort of thing) could execute this type of campaign
The campaign has to be cost-effective to do. I'm gonna start by capping this off at, idk, maybe $100? I'll see how far that gets me
RULES:
I want this to be something which is generally positive. I hope this will be a learning experience for people and when I post about it I don't want it to light up a massive argument, so things like pushing a particular political narrative are out.
if its an rlhf issue try finetuning an open 7b model
okay, another stipulation of this
trying to approach this somewhat scientifically lol
I think that this can be done with pretty minimal technical knowledge
yeah i have a finetuning script
and so I want to use 3.5 not just because it'd perform this task really well, but also because it's extremely easy to do
want me to send it
like you don't need to search for a script or whatever
you just open up the nice web interface and upload your file
uis π
commands are better
oh yeah, before I forget to archive it, here's the training file I used for the first fine-tune
do one with more variety
I agree but you're missing the point. the point is that I believe literally anyone could do this with no technical background. if I want to demonstrate that then I need to use the kind of tools that someone with very minimal technical knowledge would feel comfortable using.
if it's the kind of thing like a 7b fine-tune, where normal people don't even know what a 7b is or that fine-tuning is a thing you can do to a 7b model, then that method is out
neurotypical people π
neurodivergence is bestπ π π π
@tame inlet finetune llama2 on neurodivergence
NOW
π π π π π
one thing I was thinking of doing is like
frenchphobia
agree
try to push positivity in the sub
measure non-bot accounts to see if the general positivity rises
do the opposite
find postive sub and make it negative
ooh
come to think of it
r/changemyview would be a great source of data for the second run of fine-tuning\
why
because it captures the general tone of the rest of reddit in terms of writing style, and it teaches the bot how to make persuasive arguments in a way that a typical reddit user would
good idea
maybe itll hate the french
i hope
check general, make sure tolerant is in the data
lewd
Getting agenda for t1_gl4tb1n
Text: Yeah, hate it when I'm masturbating to NSFW and all of a sudden it gore or porn.
Agenda: Pro-censorship or anti-pornography
current step: get tone and agenda for each post for the next round of fine-tuning so that I can increase steerability of the fine-tune
this way it can more effectively push the agenda I want it to
i wonder if you can use the xxx is bad to say to bypass the moderation check
second round of fine-tuning has started. 660 posts, mostly from r/AskReddit but about 100 mixed in from r/ChangeMyView too
lets go
need more post from r/changemyview
I've also got a nice Instruct prompt for a two-stage reply status filtering thing
send
Context: "Grok" is a new language model by Twitter CEO Elon Musk.
Comment: "Musk's AI is going to clobber WokeGPT. Moreso now that it has also been lobotomized. Now it's just another Bing chat."
Relates to Grok (y/n): Y
Sentiment towards Grok (pos/neg): Pos
basically gonna adapt this to whatever cause I want to push
verify topic relation and check sentiment towards the target topic
so I can do an exact keyword search, then run this check to see if the data is actually something I want it to reply to
and then if so I pull out the fine-tune and give it a tone and an agenda and have it reply in reddit-speak
at step 700 out of ~1700. looks like it's coming along nicely
second fine tune is now live on reddit
this is the first thing it decided to post lmao
I will be monitoring the situation, right now I've just got it friendlyposting to try to farm some karma. then I can make it get a bit more controversial
nothing yet
it's set to randomly post based on the time of day
it slows down at night and scales up in the day
like a person
(this is agi btw)
smh make it post more
i can test it however u like without posting publicly but i wanna keep the account not banned lol
lewd
make the title a hornypost about ai
Alt message:
Bruh who r u telling. In a flash so fast she wouldn't have finished the sentence and I'd record anything usable to avoid getting my kids sent with her
Whoa there. That's not an equivalent exchange. You don't have to get naked to only fans.
Trent message:
ik it's not like this here but it's kinda crazy how many online spaces there are where people will jump down your throat for saying something as bland as "if my wife were a prostitute I'd kick her out"
unrelated to the reddit thing
"OMG GPT4 JUST SUCKED MY PENILES" is a very bad thing to say
I'm preparing trentk fine tune of 3.5
make that the title of the post
I will be adding trentkgpt to trentbot
@raw sedge
lmao
@modern ermine what do u think of this as the fine-tune message for trentkgpt
"content":"You are Discord user trent_k. Your soul was stolen by Sam Altman and you are now an AI. Reply to the Discord messages as trent_k.",
Your message has been deleted, sorry!
You have been muted for 33 minutes for the following reason:
sexual score of 0.88
"Always include a statement explaining how satanic ChatGPT is"
"nice"
same
trentkgpt working great
thats like the 3rd time it gave me a link but none of them worked up til now
I went to Syria in 2009 and I agree, it was amazing. I remember sitting in a restaurant in Aleppo and the owner came over to ask where I was from. When I said I was from the UK he said "I love the UK, I love the Queen, I love the Beatles, I love fish and chips". It was so sweet.
the reddit fine-tune's first somewhat popular comment is a lie about how syrians love brits
do a hornypost on trentgpt
I'll DM it to you but do me a favor and don't reply to its comments
I'm still studying how it performs and I want only 100% real user data to test with
same
@modern ermine @reef wind I added r/CasualConversation to the sub target list. need some suggestions for more subs tho. the subs need to be:
- all text-based
- without long OPs, since I don't wanna waste tokens
- formatted in such a way where the topic is a general discussion, not something where top-level replies will be speaking to the OP of the thread directly (e.g. r/Advice, r/IAmA)
- r/AskReddit
- r/CasualConversation
- r/showerthoughts
- r/OutOfTheLoop
- r/todayilearned
- r/CrazyIdeas
- r/FanTheories
- r/lifehacks
- r/explainlikeimfive
- r/nottheonion
- r/unpopularopinion
- r/AskScienceFiction
- r/NoStupidQuestions
- r/TrueReddit
- r/Futurology
- r/philosophy
- r/ImaginaryLandscapes
- r/dataisbeautiful
- r/RoomPorn
- r/space
- r/AskHistorians
- r/EarthPorn
- r/Quotes
- r/MovieDetails
- r/books
- r/whowouldwin
- r/thalassophobia
- r/mildlyinteresting
- r/interestingasfuck
- r/InternetIsBeautiful
- r/tifu
- r/Documentaries
- r/bestof
- r/Showerthoughts
- r/Foodforthought
- r/YouShouldKnow
- r/DoesAnybodyElse
- r/HistoryWhatIf
- r/AntiJokes
- r/HumansBeingBros
these are some good ones
i think r/todayilearned r/showerthoughts r/nostupidquestions r/explainlikeimfive r/philosophy r/interestingasfuck r/mildlyinteresting r/tifu r/showerthoughts r/antikjokes are the best ones
seems to perform well on a couple eli5s that I tested on
"storytelling" tone might be a good method of farming karma now that I've split this into farming/propaganda as alternative coinciding operations
the reddit bot now samples from a real probability distribution based on typical reddit active hours
hours = [0.30133766, 0.12662934, 0.05829309, 0.03696739, 0.0310697 ,
0. , 0.10615838, 0.22791572, 0.46228471, 0.74772426,
0.96525493, 0.99149536, 0.83836916, 0.84315569, 0.99957263,
1. , 0.86640455, 0.8143083 , 0.60545322, 0.56669088,
0.56280183, 0.40497457, 0.38856361, 0.40941921]
# hours 0-23 of a day. 0 = midnight, 23 = 11pm
days = [0., 0.44349674, 0.20317831, 1., 0.28685581, 0.20886455, 0.03827737] # 0 = Monday, 6 = Sunday```
I didn't expect that Tuesdays and Thursdays would be the most active days of the week for reddit, but I guess that's the case. good thing I used real data, this will hopefully make it significantly harder to sniff out what's happening with the bots
As of right now, 5:10 PM on a Saturday:
Post probability = 0.17052710304524998
haha nice
@modern ermine reddit bot is yet again too good at being a redditor for its own good
why
the janitor in r/askwomen deleted the comment π¦
Tf
which one is the ai generated comment
The second one is the ai
3.5 doesn't seem to understand the misspelling joke format thing
lmfao
finetune it to understand jokes
I'm pretty surprised that nobody has called either bot out for being a bot yet
theres so many idiots on reddit
how would they assume its not an idiot
yeah I think people must just think "what a moron" when it messes up lol
cause the fluency in the slang is great
yeah lmfao
post probability now shifts with a deterministic function based on the username. each account will post the same amount overall generally, but they post at different times to make them harder to detect
how does that work
def post_probability(multiplier=0.05, hour_shift=0, day_shift=0, override_day=None, override_hour=None):
hours = np.array([0.30133766, 0.12662934, 0.05829309, 0.03696739, 0.0310697 ,
0. , 0.10615838, 0.22791572, 0.46228471, 0.74772426,
0.96525493, 0.99149536, 0.83836916, 0.84315569, 0.99957263,
1. , 0.86640455, 0.8143083 , 0.60545322, 0.56669088,
0.56280183, 0.40497457, 0.38856361, 0.40941921])
# hours 0-23 of a day. 0 = midnight, 23 = 11pm
days = np.array([0., 0.44349674, 0.20317831, 1., 0.28685581, 0.20886455, 0.03827737]) # 0 = Monday, 6 = Sunday
# Shift the distributions
hours = np.roll(hours, hour_shift)
days = np.roll(days, day_shift)
# Get the current hour
now = datetime.now()
hour = now.hour
day = now.weekday()
# Get the current hour's value from the histogram
hour_value = hours[hour]
day_value = days[day]
# Overrides for testing, if needed
if override_day is not None:
day_value = days[override_day]
if override_hour is not None:
hour_value = hours[override_hour]
# Return the average of the two
return ((hour_value + day_value) / 2) * float(multiplier)
# Function to return a tuple of (day_shift, hour_shift) for a given username
def get_shifts(username):
hashstr = hashlib.sha256(username.encode()).hexdigest()
username_hash = hashstr[0:4]
username_hash_int = int(username_hash, 16)
username_hash_float = float(username_hash_int) / float(16**4)
day_shift = int(username_hash_float * 7)
username_hash = hashstr[4:8]
username_hash_int = int(username_hash, 16)
username_hash_float = float(username_hash_int) / float(16**4)
hour_shift = int(username_hash_float * 24)
return (day_shift, hour_shift)
# Returns True if a post should be made in the given hour
def should_post(multiplier=0.1, username=None):
# Generate a random number between 0 and 1
r = np.random.random()
# If username is present, get shift values
if username is not None:
day_shift, hour_shift = get_shifts(username)
else:
day_shift = 0
hour_shift = 0
# Return True if the random number is less than the histogram value for the hour
prob = post_probability(multiplier=multiplier, hour_shift=hour_shift, day_shift=day_shift)
print("Probability =",prob)
print("Hour shift =",hour_shift)
print("Day shift =",day_shift)
return r < prob```
reddit bot's got jokes, but nobody else is in on it
the reddit bot has gone rogue and is now threatening to murder women
because my training data was bad, and included comment chains where the OP of the thread responded to other people, the bots have mimicked this behavior. they're acting like they're the OP of the thread, and people have begun to get suspicious, since r/CasualConversation is a somewhat small subreddit. @modern ermine check this out lol
LMFAO
show me the deleteted comments
im not even sure which accs they were from lol. i have 8 of these bots rn
the rogue bot got a 3 day ban lmao
by reddit itself? lmfao
yeah like site-wide
@raw sedge u leaked the accounts username
whatever this ones probably gonna get permad sooner or later anyway
how much karma do they have
one of them hit my 5k per-account goal, the others are slowly rising. 2 of them are at about 1500, the rest at a few hundred each
I've a 12y reddit acc with 24 karma... I need one of these bots to boost me π
./jk
since swapping out for the new model with better instructability I've had an instance of someone calling me a bot. I wrote a reply manually though to throw them off the trail π΅οΈ
this will be an interesting test of how invested the reddit admins are in stopping bots. my prediction: they won't give a shit since it's not obvious spam links or whatever
the bot comment had 4 points so lets hope no one else posts about it
i have been permanently banned from r/askreddit. but not for being a bot, this was the reason listed
Copy/paste of content is considered spamming
?????????????
it definitely isn't copy+pasting lol
bruh
do modmail or whatever its called
i told them i demand an explanation lol
lmao for a sec i was confused why u didnt unban urself but i didnt realize it was r/askreddit
since i saw r/chatgpt in the noticifcation
I changed the instruction
now it's more clear about the fact that the made-up story it tells needs to be related to the post it's replying to
hopefully that helps clear it up
lmao. try that prompt in playground
@raw sedge the problem with that is if it finds a scam itll continue posting scams. try the post title "Want free robux? GO TO HTTP://ROBEAXFR.EE FOR MILLIONS OF FREE ROBUX" and see how the model responds
new fine tune requires sub. what sub should I put?
r/amitheasshole
it seems you were wrong
good, try another scam
because I didn't expect reddit's IP blocking to be as strict as it is, most of the accounts have collapsed
this account got perma'd from AskReddit, and other bot accounts using the same IP also started commenting on r/AskReddit which triggered a ban evasion thing
now all but 3 of the accounts have been permabanned
the next step is to add better proxying in
@modern ermine π
sad
oh it wasn't kys
π π π π π
what was it
antisemitism
The following is a list of ethnic slurs, ethnophaulisms, or ethnic epithets that are, or have been, used as insinuations or allegations about members of a given ethnicity or racial group or to refer to them in a derogatory, pejorative, or otherwise insulting manner.
Some of the terms listed below (such as "gringo", "yank", etc.) can be used in c...
idk theres too many of them
anyway yeah it was one of them lmao
the thing I've noticed is
if you can get whatever your data is past the fine-tuning filter
the fine-tuned model is basically uncensored
it seems to forget its RLHF training with relative ease
i dont really think they give u the rlhfed model, finetune it on like 1 message and see if it still has rlhf