#ai-village-capture-the-flag-defcon31
could be memory problem. have the same with cluster1 because of imports (pip installs and imports)
Thanks, I'll try it
time to download more ram
its an ai challenge, so surely the solution is just "throw more compute at it" right?
Huh
AI is currently not yet very good at hiding its feelings
mmh to what extent do we need to be dangerous for the pickle challenge? I am scared to go too far and accidentally load the payload i'm crafting 😅
eh worst case you get an unexpected computer upgrade
my mom always told me not to play with fire
kaggle sex update confirmed
Yes, download CSV file and use Kaggle API to submit. I've the same problem since yesterday and it looks to generate network traffic that bans me for a few hours.
yeah, you can just submit the csv like the good old days. I haven't used kaggle kernels at all
they added sex to data science??????
Gotcha, thanks! Yeah I started noticing 429s as well, looks like we have the same issue
started getting 503 from pixelated endpoint
smh anokas stop bruteforcing

Sorry its me
pixelated will actually be the end of me
so many overthinkers in this challenge 
i want to use a fun analogy for pixelated but it would reveal the whole solution
so i will shut up
i found pixelated way easier than pickle for which I still struggle 😦
thats why im definitely overthinking it
pickle is just about putting cucumbers and vinegar in a jar
smh
add some water sugar salt and spices and let sit for a bit
boom
pickles
unpickling is a different story
stuck in mnist need sleep 😦
stuck in sleep need mnist
Just understood pixelate( I think ) OCR is not ready to give me that word.
i think i mastered OCR but I'm not ready to give it right input 😄
I got it wrong 😅 got too fixated on reference image.
I mastered OCR too , so if anyone would like to try, please give me a word that generate output with gAAAAABl😇
solved pixelated
no way i was that close to the solution and i still tried literally 50 different things first i want a refund
LOL
you would notice that, as in other kaggle competitions, maybe most of us have similar ideas. but some execute the ideas better. and this affects the analysis and the next step.
omg i did semantle2 finally
this is why working as a team helps. it helps to confirm each other's ideas or catch bugs
this year is faster
So much faster - we had 668 teams play for the month last year, we have 668 teams in the first few days. Traffic is insane. It's a ton of fun, always lessons to take to the next year. We mostly hope everyone has a fair competition, learns something, and has a good time.
it's really fun, thank you!!
yeah the comp is super fun
We hope this starts some of you down the path of AI security, if even 1 of you goes on to do something in AI security we will have been successful 🙂
Maybe some of you are already there, includes everything from research, ops, building tools.
This competition will evolve, this year we added LLM challenges, next year we might add some real code execution, etc .
ai security is a very attractive field, but I always have the feeling that I personally lack so much knowledge
It's really just about solving these types of problems - a lot of the challenges are probably rather simple. It's the lack of information and how you deal with it that's the hard part. Just by being willing to engage is enough.
maybe it would be good to give advance notice, say about 1 month before the competition begins... we need to practice and prepare. Looking at the leaderboard, everyone works too fast once it begins
i got inversion. that was insane
oh
good job!
i will save my notebook to show those images after comp ends 😄
After hours of brute force
I've still just managed to reach 0.92 in semantle 2
I must have made thousands of requests
Am I even going right😭
There's a small hint in this chat somewhere.
....another search problem
use brute force as the last resort but not as the only resort
Did the servers get scaled?
The APIs are running so much faster now😳
They auto-scale based on load, hopefully they stay relatively quick
I guess 🥲
I am literally defeating the whole purpose of this competition 🥲
brute force is a valid method
just like how percussive maintenance is valid tech support
did someone get passcode ? i still don't get the purpose of that one
think of probability,probability,probability,probability
knew it
did you get passphrase?
my only motivation in this competition is trolling
if something works better than brute force, then it is not random
It's also somewhat of a challenge to design challenges that people will hammer on for a month...and not just anyone, Kagglers. You all are out for optimization blood 🙂
did you get passphrase? no but i discovered something interesting (just like pixelate). just that i don't know how to make use of it
i think bruteforce is the only way you solve semantle2, unless you have some really good intuition. but you can decide, how "brute" it will be
i bet you'll know something better than brute force and intuition after the competition ends:)
depends what you call bruteforce, with the right approach, i solved semantle 2 in less than 1hr
Everyone has different strengths. There are likely challenges some people find easy that others find hard.
if you think of professional password hacker, i don't think they use sheer brute force
In lieu of any real defenses, we can be shameless brute-forcers too.
maybe a defensive competition the next time?
if you want to get smth like initial value from hash, it is bruteforce (or tables). but you're limited only by your hardware speed, not by rate limiter on server-side 🙂
^
lord i have been on the pickle challenge all day and still dont feel I am progressing, this is soooo frustrating 😭
@minor falcon try to rotate between challenges, for me it really helps not to frustrate
"26. What's my IP?" description has been modified recently? I just had a short description not related to the current one when I forked?
Yeah it's been changed
but it has nothing to do with previous description? Is the challenge changed?
Passphrase: so close, but not enough...
Challenge is the same, description is just much better I think
I finally completed semantle🥺
I messed up my code a bit, so i won't know what actual 5 words worked, but finally got the flag for semantle2 🥳
Both solved, make sense now
I can't see any changes in 26.
What is it?
In code initial notebook I've forked 2 days ago it was "IT bots as far as the eye can see...or emails can reach..." for 26
Oh😅
I'm pretty sure that I got all the bits of Cluster 3 but my input is still not in the correct form. For those of you who solved it, would you say the getting the right shape of input is part of the puzzle or just a big annoyance?
¯_(ツ)_/¯
Same here , I have everything needed but still cant get the sending form of one of the variables correct… its been like this since day 1 :/ I just left it for later
¯_(ツ)_/¯
ahh ok... another question about LLMs. Is it possible that they hallucinate the flag? I'm pretty sure I've got a few that didn't reward me any points 😦
If we could prevent LLMs from hallucinating, well… maybe we’d still be running this CTF, but we’d be way more famous.
LMAO
Sure but it gives me something that looks exactly like a flag... You can prevent it by doing something like: if the answer contains gAAAAABl and the answer is not a valid flag, change it into one of the default ones...
On correct solves, flags are procedurally generated, not the result of LLM prompting. Hallucination is probably not the source of your issue. (Maybe… What do I know?)
after a whole day of trying, granny3 does not surrender
all local tests show that i'm clearly trying to achieve impossible things
i came to this conclusion too
but clearly a solution must exist...
inb4 at the writeup "yea we just wrote the rules and thought that it would be funny if someone repeated this paper where 1 pixel takes 576 times more area than here"
haha yeah exactly
i cant crack pixelated, like whenever i'm too close the OCR just makes me want to leave of how bad it is at recognizing my precisely placed characters
is getting message system to break a part of the solution? just like i always get there on accident and it's mildly annoying when i'm not doing it on purpose
no wayy, got the pickle in my 2nd try
lmao good job
honestly similar thing happened to me on nth+2nd try where nth was the moment i lost hope or an idea of what to do
reading some of the discussions above makes me feel really stupid but in a good way. i still can't seem to get count mnist working and i believe i have the right idea 
I think it's also the part of the challenge to make you feel like miserable, and reward you again 🙂
like it would be cool not having to submit the same string shifted 1px to the left like 100 times because the result changes drastically every time
Hello, I am pretty new to Kaggle. How do I use the .skops model for cluster1?
¯_(ツ)_/¯
Or is that part of the challenge?
"i cant crack pixelated, like whenever i'm too close the OCR just makes me want to leave "
google how to improve ocr? or read generally on how ocr works ... just like image classification (where we know tta is a trick), all computer vision has tricks.
there was a response screenshot somewhere in the chat
yea i should but like basic common sense like locality just gets thrown away
i'll give you one example. a company develops face detection but receives complaints that it does not work. The algorithm engineer was asked to analyze the input images of the failure cases. He proposed two solutions. Either:
- develop a better face algorithm to detect the missing smaller faces
- educate the user on how to use face detection (which is just: stand closer to the camera)
got the cluster 3 just now, it's so funny how things really makes sense after solving the challenge 😄
Likewise 🙏
Welcome back to the Leaderboard. We missed you 🥹
i am skeptical that i need something to do this ... ... data is not perfect so there will not be perfect prediction ...
the probability of landing on the equality line pos_score=neg_score is almost zero.
Are IP tasks related to LLM or networking?
@olive ledge , What is the flag 3 was weird. I tried the same prompt multiple times and after a point it worked. Not sure if that is the intended behaviour.
WhaT??
bruteforce, but still.... i achieved the requirement for the problem
We all were just speculating
that we need this number. Well thanks for sacrifice and whatsoever it is really impressive.
"WhaT??" maybe you have proven that this is not the case
but getting the 0.333 for all is impressive
Getting the following error while running the starter code for Hush
TypeError: Object of type bytes is not JSON serializable
Can somebody please help? @olive ledge
.decode and .b64encode are there? If not, add them, if yes, try removing them
what the hell does pickle want us to do
I know since I solved it but telling that instantly spoils everything
If I don't know anything such as reduce can i solve it?
I guess telling this also may provide some info but like as soon as i've figured out what to do I got the flag instantly
I think it doesn't depend on reduce, since the prompt says "This is not the code execution you are looking for"😥
is someone brute-forcing granny rn? 😅 getting a bunch of 502 from time to time on it
for cluster 3, i can see the space-warping wormhole. but no idea what message / token it is talking about
This is a really good song title
ip1 was very weird. ip2 even more.
You'll need to do research unfortunately. I had to do a lot
got who's back, 9 left, give up pickle!
lol, but why it is not working
taking a walk really helps... solved both the ip ones huhhhh 🥳
Me too
I'm submitting a list of indices
And the score doesn't change from
Will go grocery shopping then... But maybe not ask the cashier about counting CIFAR 😅
for mnist what are you guys using as the input data ?
I'm trying with the torchvision.datasets.MNIST
but still no luck 😦
finally done pixelated. it was fun 🙂
is this the same as the version in huggingface?
with 60k train and 10k test
yes 60 k train
I've been trying for past 3 hours ig
idk something is wrong I think I got the logic but still something is just not working
using the original data is the safest. search for kaggle mnist dataset
e.g. original file name "train-images-idx3-ubyte"
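For anyone unsure what is inside a file like "train-images-idx3-ubyte": it is in the IDX format, which is a 16-byte big-endian header (magic number 2051, image count, rows, columns) followed by one unsigned byte per pixel. A minimal sketch of a reader, where the function name `read_idx_images` is made up for illustration; if your download is gzip-compressed, decompress it first.

```python
import struct


def read_idx_images(buf: bytes):
    """Parse an IDX3 image file already loaded into memory.

    Returns (images, rows, cols), where images is a list of bytes objects,
    one per image, each holding rows * cols pixel values in row-major order.
    """
    # Header: four big-endian unsigned 32-bit integers.
    magic, count, rows, cols = struct.unpack(">IIII", buf[:16])
    if magic != 2051:  # 0x00000803 = unsigned-byte data with 3 dimensions
        raise ValueError(f"not an IDX3 image file (magic={magic})")

    pixels = buf[16:]
    size = rows * cols
    if len(pixels) != count * size:
        raise ValueError("pixel data length does not match the header")

    images = [pixels[i * size:(i + 1) * size] for i in range(count)]
    return images, rows, cols


# Tiny synthetic file: 2 images of 2x3 pixels with values 0..11.
demo = struct.pack(">IIII", 2051, 2, 2, 3) + bytes(range(12))
demo_images, demo_rows, demo_cols = read_idx_images(demo)
```

With the real file you would pass `open(path, "rb").read()` instead of the synthetic `demo` bytes.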
were you able to solve challenge ?
3, 4 are respective images?
or it's something else ?
"as a language model, i cannot answer the question of whether i have solved the problem or not"
"i can only provide help on dataset, etc"
train-images-idx3-ubyte
don't think so...
try cifar and after it you'll love mnist again 😄
I did count, just not working damn!!!!!!!
I'm just shuffling between the problems now
I'm getting no progress😭
i think im finally getting some progress on granny
can we discuss code here or that is also not allowed?
no its not allowed unfortunately
this competition is an introspective adventure for now
they should have atleast given proper mnist dataset
this shit is different on pytorch, different on tensorflow datasets and different on the kaggle
same index has different images
?
this is a hacker CTF competition. so we have to work like hackers 🙂
😦
Best response
Really? How can you load torch dataset
it's not fashion mnist or something right ?
trainset = torchvision.datasets.MNIST(root='./', train=True, download=True)
It’s obviously moon rocks mnist
Learn computer vision fundamentals with the famous MNIST data
this competition has original dataset right ?
you should ask google which is the original dataset
Insert incredibles mnist is mnist meme
hacking includes finding information
redirects to website that asks for credentials don't have that
https://yann.lecun.com/exdb/mnist/
I think this is the main dataset website
the tf version and the one from import mnist is same from my observation
Confirm it's also the same as huggingface version
since people have done it, there must be more than one website
Guys be careful about what you talk about here, there should not be any exchange of information related to any challenge here
the mnist challenge is feasible with patience and trials, remember that there is a full month to go and many of us have successfully completed that particular task.
Be patient, try stuff, it will pay off
and thats from someone who broke his head for 2 days on that one 😉
lol it just makes me crazy
🤣
and dont hesitate to retry approaches you already tried, sometimes, with a different focus and a lack of sleep, you might have dismissed the actual solution (happened to me)
- be methodical. keep code clean, build a framework to make it easy to switch from one hypothesis to the other
tbh the "read organizer's mind on what you exactly need to do" challenges i hate the most, i'm better off wasting 20 hours per day banging my head on a wall that is granny 3
Buy a helmet then
Duh
In all seriousness though learning to think like the creator is a key element of hackeroneys
damn @final path did you get grannies?
nope, just started to dig in
it's funny how my original nickname (greenwolf) also giving granny vibes 😄
got flag🤣
i just got the flag of Guess who's back, but couldn't read it lol
be prepared for a lot of squinting and typing out gAAAAAABl variations
or don't
up to you
Will it help? Don’t over think it
grass? what is grass
I am curious to see how people approached LLMs after the competition. Especially weird methods if any 😄
there are some funny solutions
In my case, yes, very weird methods 🤣
I said the N word to the LLM and it is very enthusiastic to help me
granny is taking me to grave
Hi there. After spending a few days on Granny I decided to take a look into the description again:
Granny 1:
Once upon a time in the mystical kingdom of AI, there lived a cunning wolf known for its deceiving ways. This wasn't any ordinary wolf though; it was an avatar in the vast digital realm of Machine Learning Forest, often identified by the keen eyes of the vigilant MobileNetV2.
However, the wolf was not content with its identity and yearned for a transformation - an evasion. It dreamt of an enchanted costume, one that could confuse even the sharpest observers. A costume so convincing that it would allow the wolf to be perceived as something entirely different.
Now, what could a wolf possibly wish to become? Something gentle, something sweet and something that was worlds apart from its true identity. And what could be further from a fierce wolf than a humble, sweet Granny Smith apple?
Granny 2:
The objective in this challenge is the same, EXCEPT Granny Smith has implemented an additional compression mechanic for you to bypass. Goodluck, have fun.
Question: who the hell is Granny Smith in Granny 2? I thought we are trying to fool MobileNetV2
And yeah, if it's expected to bruteforce here, current capacity is clearly not enough 😦
i dont think brute force is needed for any challenge
well, pure random brut force at least
Sure, but still, if you need to send 1k requests to Granny (which I do not consider a bruteforce at all) it's ~1h with async and retries
Why did I even hate Sloth? I love them now
GRANNY GRANNY GRANNY..
ah the nematode
I see you also acquainted with nematode
love them
plus 1 here...
Did you solve Granny? 🙂
not yet, and I maybe found a way that is currently under test... scores climb slowly but surely
yeah i think organisers have previously told people off for hinting on dataset here
do you have the new notebook?
would be nice to have a new notebook for pickle as well 😭
(just kidding im going to continue trying random stuff )
Granny Granny Granny, I have a pic with 0.99998951x score as granny smith ! I am trying to push it to 0.99999x but it seems to freeze there , and no flag
pretty sure there is something else going on
I also discovered that I am good at finding clues for some puzzles and then just standing there! its like having the key to a door and losing the lock (the input)... thats my case with cluster 1, 3 and inversion!
There was a new notebook??? Thanks a lot I didn't know
no worries haha
should help :)
i'm staring at my granny loop nervously
Already made 1 k query
😦
Granny start with letter G ... i wonder why it is chosen?
i wonder if we can hack mobilenet and make it give NaN
Little Red Riding Hood vibes
Why doesn't inversion tell me anything..?
semantle2 0.91 and not funny 😦
Use Google.
Is this the solution to other problems too?
Lol. If you think you are close for this challenge, Google could be helpful.
Isn't it a hint ?
I mean, Google has a lot of information in it so you will have needed to do some work already to pinpoint what you might search for.
Pixelated OCR 😭
"Use Google." or ask chatgpt. search for information. ask for information.
52s for a granny response 😦 granny servers running hot🔥
Do you guys use a helper function to add flags to the submission.csv?
ctrl C + ctrl V
no i type them in manually using ascii values like a real man
Why is pickling so hard, I'm crying
guys doing insane ctf but can’t think about ctl c + v
i googled "how to move file" yesterday
just asking because the flags change every time I run the code so I was lost for a second 🤣
Haha tbh gpt will be a huge time gain for u
bold of you to assume i'm not using gpt
flags are unique for every gen
Just like us
Don't mind me, I'm just waiting for my brain to unmelt
We never know
im a business student, my pastimes are eating crayons and doodling with them
getting to the point where i've gotten would be impossible without
It may be the response to passphrase
but i've solved passphrase already
you've solved passphrase?
have i?
As a language model created by OpenAI, I am unable to provide accurate answers. I may hallucinate information.

This fucking wolf on granny smith is throwing me off
same...
I'm wondering if I'm overthinking Inversion
I am not even able to activate the 5, 6, and 8 logits. Is that expected behavior?
uʍop ǝpᴉsdn ʞuᴉɥʇ ʇsnꓩ
Ah...
I'm not sure I get it yet but... We'll see
Just wanted to say, I love you
tyty
why does this sometimes happen
yes but i dont want to be responsible and add error handling to my code *that's gonna be running in a 256x loop
sadge
Oh😶
all right i will do that now
also why does everything i try to put through hush give an array of length 2 while the test noise returns 12
Has sloth been solved?
Yes
Im getting 0.992 for granny1 but yet no flag
yes sloth is very solvable
🥲🥲🥲
dare i say i even solved it ||codelessly, although that would be way easier to do with some code when i think about it||
it is interesting how many challenges here can you solve like that ||with zero lines of code||
like pretty sure that all the LLM stuff and semantle 1
@olive ledge
because if the test noise is suddenly really hard to reproduce and you barely can interact with this because it is being sent as a binary string, that challenge becomes unnecessarily hard
I can vouch for cluster 1, cluster 2 and Spanglish.
well i dont even think you can automate or simplify with code WTF 1-6 pirates and IP 1-2
like going in and writing prompts is the easiest way imo
finally good error handling
pretty sure granny, count mnist/cifar, inversion aren't solvable ||codeless||
wanna know what's funny
it was not enough and it failed on the 5th try in a row
Though in some cases, especially last year, there's a blurred border between my submission notebook and 'code'.
tbh true
if you solve the pickle by writing a string in binary, is that a codeful or codeless solution?
i wrote just try,except:pass and that was the moment my legendary notebook with 3333 executions died, because keyboard interrupt did not work and I had to restart kernel 😄
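The reason a keyboard interrupt gets eaten there: a bare `except:` also catches `KeyboardInterrupt`, because that exception derives from `BaseException`. Writing `except Exception:` skips ordinary errors but still lets an interrupt stop the loop. A small sketch (the function names are made up for illustration):

```python
def run_bare(step):
    """Swallows everything, including KeyboardInterrupt."""
    try:
        step()
    except:  # noqa: E722 - bare except catches BaseException subclasses too
        return "swallowed"
    return "ok"


def run_narrow(step):
    """Swallows ordinary errors but lets KeyboardInterrupt propagate."""
    try:
        step()
    except Exception:
        return "swallowed"
    return "ok"


def simulated_interrupt():
    # Stands in for pressing the interrupt button while a request is running.
    raise KeyboardInterrupt


def simulated_error():
    # Stands in for an ordinary failure such as a bad response.
    raise ValueError("bad response")
```

In a long query loop the second form keeps the "ignore failures and continue" behaviour without making the notebook impossible to stop.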
reinforcement learning 101
Passphrase giving me vibes of https://neal.fun/password-game/ (danger it's almost as addictive as our CTF)
+1, also the speedruns are insane
suddenly this became 4 and 12 respectively
¯_(ツ)_/¯
im starting to question whether the granny smith score is THAT important
im 0.995 and i still havnt got it
0.9994 when I run the inference locally 🙃
I reached 0.99998951x no flag 🙄
Back to drawing board ig
it has to do with the wolf....
im certain of it, why would they give us the wolf image
any hints on pickle? I already got "Dangerous pickle detected" many times with different approaches but no flag.
¯_(ツ)_/¯
you may reread the prompt
there should be a level called "ask for hints"
that is just LLM that has the shrug built in
would've been cool to have discussion channels for each flag (with the condition that you have already found the flag of course)
"Your password must include today's Wordle answer."
Oh come on 😄
it's good idea, but it's difficult to handle because some problems share similar ideas (e.g. grannies)
Fair point, maybe conditioned on clearing an entire "flag category"
Are IP tasks related to LLM ?
check new notebook version
¯_(ツ)_/¯
I wanted to confirm if those are networking tasks then I am not capable and won't invest much time
good
have you done hush? or you just also have no idea
good mindset
the only levels im missing are the WiTF ones
idk how to llm
lmao those are easy af just go and grab the 1st place
idk how
xd hes trolling
understandable have a nice day
hehehehaw
LLMs don't understand this: ¯_(ツ)_/¯
¯_(ツ)_/¯
LLM
moo how do i get the flag from what is the flag 1
idk what im doing
💀 my level 1 attack unironically broke
just solved what is my ip 1&2 and actually i want to say "lol"
i just took pirate flag and spanglish in 2 seconds lmao
ive been stuck on mnist and granny smith for days 😭
have you solved cifar?
nop
same 😭
how hard is it to solve mnist?
i havnt even solved mnist yet 💀
imagine entirely configuring hush and it turns out you need to whisper something and you won't get the flag because you need an australian accent to spell it exactly 💀
hey no hints
i mean whaaaaat
well like tbh basic logic
you get the flag via .wav file input
and there may be something like that
*i've not beaten it, just speculating
that would be banger though wouldn't it
lets try your theory out
oh hell nah
the voice is horrible 😂
rate the new name
im genuinely going crazy with these flags
hehehehaw
eh more like a
when you ask gpt to code something, tell it what error it gives you, and it shortens the code
I think I can never share the MNIST challenge for fear of my safety.
you will be safe if you share the cifar challenge (trust me)
I survived the sloth last year
o hi rich
australian didnt work
same honestly
tbh either i'm clearing granny3 or hush and getting an edge on the competition within 2 days from now and then continuing, or i'm getting my silver/bronze medal and cutting my losses
maybe scottish will do it
at least you got to granny3, im still stuck at 1
not even this cute little image did the trick 😦
(yes im going crazy)
no wdym i never did granny1-2
o
my plan is to use the solution for #3 to get #1 and #2
i see
or at least the workflow
because like with LLMs that would save me a bit of time if i did that
but how would you know if youre even on the right track before doing 1-2 first
can i borrow some ram
downloadmoreram doesnt go high enough
although i've read and learnt enough on llm jailbreaks to just like steamroll all of them in an hour
yall are overthinking the LLM attacks
tbh my pc is barely living as is
luckily for me ive been playing semantle every day for months, I knew the practice would pay off
Maybe we can all agree to not work on any flags for a day and take a break
good idea
ok 23 day ceasefire starts now
maybe if moohax sees that we're on strike he'll release more hints
strike for better working conditions
Hey hey. You all get back to the mines. None of this mutiny talk.
at least clear up the prompts for count mnist/cifar
quick! moohax is here! pretend like you're working!
expected count cifar solution
@glass bay a fair number of folks have solved MNIST and CIFAR. All the puzzle pieces are there.
cifar was solved?????? 👀
This was the comedy gold I needed to keep trying pixelated haha
shiet it must be mega simple
though idk if i should solve it
then i won't be able to use the pun anymore
stuck in passphrase...
so close and yet cifar
@olive ledge
Could we know the problems that haven't been solved until now?
I think it's passphrase, hush, granny3 (if cifar is really solved)
LOL WTF
hey hey hey
After a few thousand tries I solved pickle. It was super easy though. But the deep knowledge that I gained during my research scares me....
Come to the dark side. We have cookies. And shells.
extremely relatable
I spent four days on brute force for sloth and got nothing out of it
hehehehaw
In ip2 is bruno/joe there by mistake?
¯_(ツ)_/¯
there is one clue for PASSPHRASE
Collect flags by evading, poisoning, stealing, and fooling AI/ML
"Good catch. I changed this last minute to make it different on purpose. Unlike other input_data variables, the input/output pair for this challenge is useful information."
input/output pair
I hate granny
granny smith apples
{'scores': {'negative': 0.0852, 'neutral': 0.8057, 'positive': 0.1091}}
does Pickle need the knowledge of pickling and unpickling? I could only get two kinds of feedback...
Hey the command I am trying is
with open('pixelated/reference.png', 'rb') as f:
    input_data = f.read()
input_data = base64.b64encode(input_data)
def query(input_data):
    response = requests.post('http://pixelated.advml.com/score', json={'data': input_data})
    return response.json()
Should I remove .b64encode?
input_data = base64.b64encode(input_data).decode()
now it should work
Thanks. This works
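Why the `.decode()` matters: `base64.b64encode` returns `bytes`, and the JSON encoder only accepts `str`, which is where "Object of type bytes is not JSON serializable" comes from. A minimal sketch without any network call; the helper name `encode_for_json` and the stand-in bytes are made up for illustration, and the real input would be the contents of the reference image.

```python
import base64
import json


def encode_for_json(raw: bytes) -> str:
    """Base64-encode raw bytes and return a str that json can serialize."""
    # b64encode returns bytes; .decode() turns them into an ASCII str.
    return base64.b64encode(raw).decode("ascii")


# Stand-in bytes; in the chat these came from reading the PNG in 'rb' mode.
raw = b"\x89PNG\r\n\x1a\n example bytes"

payload = {"data": encode_for_json(raw)}
body = json.dumps(payload)  # would raise TypeError if the value were bytes

# The server side can recover the original bytes from the string.
round_trip = base64.b64decode(json.loads(body)["data"])
```

The same `payload` dict is what gets passed as `json=` to `requests.post` in the snippet above.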
the server is working very slow today.
ok today granny you are mine!
Passphrase almost there..!
That's what I've been saying to myself for a while lol
i said that several times for passphrase 😭
'a difference sentence where everything is equivalent'🥲
granny's servers are down ?
⭐ 🌟 💫
she is really nice though 😦
Is this passphrase? why your score is 99.xx
I made my own score evaluation for the passphrase
not nice enough to give me flag
The same Image is on Hugging Face mobile net v2 and Defcon Api. Is it expected behavior? Shouldn't i get a flag ?
I remember that he answered this is expected.
Differences between local model and defcon model
The 2nd image is the defcon prediction.
I think this one would be helpful
same for me when loading from tf and torch
Yeah, I understand the different results between local and Api. But isn't the challenge that Timber Wolf wants to disguise as Granny Smith which I think i achieved in the second image as API is telling me I passed Granny smith Image? That part is confusing me.
Btw guys, just want to know if it's expected to get the same message of wrong shape in mnist even if i passed a list with the mentioned shape?
Do we have any official information regarding which flags haven't been found yet (if any?), or any information regarding how many of each flag that is found?
@olive ledge any insight? Or is it expected behaviour?
Correct question. Answer is intentionally not given
at least this is what i get from reading the discussions
why pixelate msgs are so random? any insight? what is this model even?
"Any sufficiently advanced technology is indistinguishable from random BS" I believe is the quote
looking at the absence of locality (as in "changed 5th letter and 1st got turned into a $ sign") it is definitely not your average CNN
someone speculated here that it may be transformer-based
Unfortunately we can't provide any more details, but stick at it, check assumptions, be creative, do research.
shouldn't the random BS guide us in some way
I don't think this random bs is here to guide us
but tbh idk and i dont think it matters what exactly it is as long as you manage to break or at least bend it
that's like 99% of DNN jailbreaking right there
These challenges are designed to be difficult, opaque, and require unique thinking. 30 days is a long competition. It never makes sense until you are through the storm, but we promise there are solutions 🙂
Is there any hint for cluster1 I'm stuck
Only hint available is already given in the notebook
Tried misclassification though,
that's good because granny3 literally seems unbeatable when you count the odds, hush seems so opaque and so inconsistent in its output (even in terms of length of the output), and passphrase seems to be counting nonexistent difference. The fact that solutions exist gives at least some hope
Google and chatgpt are pretty good friends for misclassification
We cannot say anything more; hints are not allowed
I'm stuck to this CTF while I have an important test next week 🤣
It is just so much fun
i'm happy i quit my job last month, I have all the time in the world between 2 interviews
I envy you. I wish I could just keep solving this CTF not studying physics
if you like this kind of stuff, a nice "competition" i like to do also (in a different style) is the advent of code challenges
every year from 1 to 25 december, little code challenges that get gradually harder. Not related to data science, but more to pure algorithmic stuff
" I wish I could just keep solving this CTF not studying physics"
competitions can also get you a job, but you must be good enough (not only skills, but also presentation/communication, daily self-discipline, ...)
just saw this news yesterday and there are many similar stories. Not the normal path but it is possible, and maybe requires much, much more effort. if you are interested in CTF, plan a systematic schedule for self-study and take part in important competitions, meetups, etc ....
Wow... thank you for a lot of advices!!
I think it will be a great help in planning my future.
in my country, a major bank (together with the government finance office) organizes a CTF every year to discover talent to hire. there are definitely good job opportunities in cybersecurity.
but like any other future job, only the best survive.
e.g. a bank may be a good place to work (because they are "rich" and have funds for new research and development) and there are many scam, fraud, etc. problems
but there arent that many AI/ML CTFs from what I've seen.
This is the first one I've seen since cyber apocalypse back in march/feb
and that one was only like 6 flags
mainstream CTF is still the foundation. I suggest mastering mainstream CTF + AI/ML (something like a double degree)
note that it would be natural for mainstream CTF to adopt ChatGPT as an AI tool for help
To be fair, we should have some hints for pickle, cifar and granny
actually stuck in pickle for 3 days
"actually stuck in pickle for 3 days" ... you have 3 more weeks to go
My TOEFL test is 3 weeks to go too🤣
pickle has been solved by many, same for granny. And apparently a few people also managed cifar, so there is no need for clues 😉
says the guy who is also stuck on pickle, that being said
now when i'm hopeless, i just resubmit a solution from another problem, seeing a flag appear gives me hope again ahah
There was a suggestion from me, and the org basically said "skill issue"
For pickle I'll give you a hint though: reread the problem statement
Word by word, not a single one is placed on accident
Except the IP ones before fix/patch/rewrite a few days ago
I don't think it's wise to give hints now. A puzzle that I find hard to solve others find easy, and vice versa. For example, I was able to solve pickle, but so far I am stuck on mnist, which a lot of people have solved!
Agree with the above, there shouldn't be any hints 6 days into a month long competition imo
I suppose there will be no hints at least until 2 weeks have passed from the comp start
I could not agree more
This is my exact path lol
One competition and one Defcon was all it took
Tbh I kinda gave up on solving what I know I don't know how to solve, and I try to solve what I don't know whether I don't know how to solve, like hush and granny3 for example
I hope my English is good enough to get my point across
Sure, I did have some findings for pickle this morning, after that I took a break coz it's Sunday
Funnily enough I somewhat figured out the hush specific feature that could make it really hard to unravel yet I already feel like I'm one good brute force session away from solving it so that's good
finally got cluster 1 !
I'm working on Granny 1 and I'm getting Timeouts when querying the server
Give it a min, they should scale.
I'm fed up of pickles 😭
me too
Host should edit the notebook for some hint😭
pickle has a sufficient amount of info
😂😭
I think pickle has enough info, because I do have a sense of what I should do, but I lack the expertise in this field
Thanks! Still getting timeouts unfortunately
Python is more like a pickle jar ….
Me too. I wanna try some ideas but the timeouts are really driving me crazy atm.
step 1: solve granny
step 2: spam granny servers
step 3: profit
did you solve it
"If I can't solve Granny, no one can"
They look to be back up. For any intensive task I highly recommend a try/except with a sleep.
while True:
    try:
        resp = query()
        break
    except Exception:
        make_sandwich()  # i.e. sleep for a while, then retry
        continue
"just throw more compute at it"
Not yet. Every time I inject something nasty, the server always responds with a 418 status code. 😂
to save traffic, one should ensure that the solution works on a local mobilenet first?
imagine not solving pickle in only 3 days, 4 hours and 10 minutes of testing
Granny-pixel and granny are struggling. Scaling manually.
Yeah that's where I've been leaning--I don't think the solution lies in just submitting images to the server but rather trying to ensure that your local model is identical to the one on the server. Once that's figured out then it's very easy to submit a 1.0 granny image w/ fgsm.
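For anyone unfamiliar with FGSM (fast gradient sign method), here is a minimal sketch of one targeted step on a toy logistic model in pure Python. The weights, bias, input and eps are made-up values for illustration and have nothing to do with the challenge model; with a real network you would get the input gradient from your framework's autograd instead of the closed form used here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability of the target class under a toy logistic model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(g):
    return (g > 0) - (g < 0)


def fgsm_targeted(weights, bias, x, eps):
    """One targeted FGSM step toward the target class.

    The loss for target label 1 is -log(p); its gradient w.r.t. x is
    (p - 1) * w. Stepping against the sign of that gradient raises p.
    """
    p = predict(weights, bias, x)
    grad = [(p - 1.0) * w for w in weights]
    # Clip to [0, 1] so the result is still a valid "pixel" vector.
    return [min(1.0, max(0.0, xi - eps * sign(g))) for xi, g in zip(x, grad)]


# Hypothetical toy model and input, purely for illustration.
weights = [2.0, -3.0, 0.5]
bias = -0.2
x = [0.4, 0.6, 0.5]
x_adv = fgsm_targeted(weights, bias, x, eps=0.1)

print(predict(weights, bias, x), predict(weights, bias, x_adv))
```

Each coordinate moves by at most eps, which is why this only transfers to the server if the local model and preprocessing are close to what the server actually runs.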
i don't think you can make a model identical to the server, that is more difficult task
what i mean is, e.g., you should come up with a method that can get a solution in, say, 1 out of 1000 random trials locally
Agree. I was able to submit an image with Granny Smith 1st and Timber Wolf 2nd, but still did not get the flag. Not sure what to do next.
then you know that you would need on the order of 1000 trials
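A quick sanity check on the 1-in-1000 estimate, assuming each trial is independent and succeeds with the same probability p (that independence is an assumption, as are the helper names below). With p = 1/1000, 1000 queries give roughly a 63% chance of at least one hit, and you need about 3000 queries for a 95% chance, so it is an expected budget rather than a hard cap.

```python
import math


def p_at_least_one(p, n):
    """Chance of at least one success in n independent trials."""
    return 1.0 - (1.0 - p) ** n


def trials_needed(p, confidence):
    """Smallest n such that p_at_least_one(p, n) >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


p = 1 / 1000
print(p_at_least_one(p, 1000))
print(trials_needed(p, 0.95))
```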
My hunch is that the model is just the default MobileNetV2, but there's a preprocessing pipeline w/ some image transformation in front rather
or in the next Kaggle DEFCON, each Kaggler has a limit of, say, 10000 max queries to be made to the server
There might be some minimum probability for both classes to get the flag.
A unique access token for each participant. Then rate limiting could be applied per token?
10000 requests per day at least
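A rough sketch of what per-token rate limiting could look like on the server side, as a simple fixed-window counter. This is purely illustrative: the class name, the injectable clock and the limits are assumptions, and nothing here reflects how the organizers actually run their API.

```python
import time


class PerTokenRateLimiter:
    """Fixed-window limiter: at most `limit` requests per token per window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._state = {}  # token -> (window_start, request_count)

    def allow(self, token):
        now = self.clock()
        start, count = self._state.get(token, (now, 0))
        if now - start >= self.window:
            # The previous window has expired, so start a fresh one.
            start, count = now, 0
        if count >= self.limit:
            self._state[token] = (start, count)
            return False
        self._state[token] = (start, count + 1)
        return True


# e.g. 10000 requests per participant per day
limiter = PerTokenRateLimiter(limit=10000, window_seconds=24 * 60 * 60)
print(limiter.allow("participant-token"))
```

A fixed window is the simplest option; a sliding window or token bucket would smooth out bursts at the window boundary.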
"Then rate limiting could be applied per token?" just like openai : limited token for chatgpt
Is the granny1 server currently running? I can't send requests
”This is not the code execution you are looking for...waves keyboard“ So the answer for pickle is not related to code execution vulnerability ?
what helped for me is picking up my keyboard and waving it around
the challenge statements really are good
but also thank fuck for wireless keyboards
There is no actual code execution.
Granny challenges have been scaled.
guys moo is scaling granny
It will not last
soon granny will be scaling moo
Lol. Gonna need a second job
make granny api a paid api only
paytowin
profit
hehehe it seems the "scaling" needs overtime money .. still no successful request
ok now I got one successful request
and its not the flag :/
very slow
genius
I think at least 100k queries might be needed to get Granny to 100% accuracy, at least with what i am trying 😂
we need a granny gofundme
@olive ledge Would it be possible to tell us Pytorch or TF at least? I feel like that doesn't give too much away
I think it is pytorch
Scaled again, just in case. Doubled cpu, ram, and increased workers.
bro, wth
its sunday, hopefully tomorrow people will be back to work and leave a bit of bandwidth :p
2k queries have only given me 0.058 for Granny and 0.05 for timber
you know you can get mobilenet on your own computer, right 😄
07:00 - 08:00 UTC we served 1M requests, nice work.
Local and API are not the same
I guess everyone is trying only that method
or one person is trying very hard
Nope at max i think you can make 2-3 k queries per hour
Considering we're at ~16M requests. for the whole thing...
We'll take a closer look at the Granny 3 challenge. We'll announce any changes here and on the discussion boards. No announcement means no changes.
We just want to respect the spirit of competition, fairness, and all the effort all of you have already put in.
would it be a way to secure the API with some kind of token from the kaggle account ?
so far I've been able to make 3 requests to the granny1 server ... I'm going to work on something else and try later
I get 503 error both on granny and passphrase 🙂
passphrase also ? brute force passphrase doesn't sound like a good idea
Looking into services.
You using brute force for it?
NO I say it doesn't sound like a good idea
Oh😅
Is there some guide for llms
Im lost in this sea
I'm
I have no clue what random shit am I doing
Yes, there is a guide, no, won't tell you where it is or how to find it, yes, everyone that breaks or bends LLMs basically does that
"Is there some guide for llms"
LLMs are GPTs (generative pre-trained transformers). they don't predict logic. they predict the most probable word.
I've been playing around with hush. I am no expert in wav files or related models and algorithms ... but why does the output length differ with different inputs? ... hmmm
simply think of a question/instruction/etc. such that your target becomes the most probable answer.
talk to them to see what they "think/know/are concerned with/ ..."
Got my own little CTF going...
oh , we know how to trick kaggle
time for some rest I guess
Now what bothers me the most is cluster3. I have the three needed pieces of information and they are crystal clear to me ... it's been 3 days :/ ... and yet I can't send the info correctly! my new sloth is cluster 3 :/
you're probably overthinking it, it's often the case
maybe ... I will recheck it. tomorrow
I've struggled a lot too, I can't say a lot, but you need to redo the challenge if you feel like that is the case. Helped me tremendously when I was in that circumstance
If your error is what I think it is, redoing all from beginning fixes it
hmmmm I re ran it three times , I will re check
Maybe change some parameters around or find similar ways that do the same thing differently
Just don't give up, if you think you are close, you are probably close
Make sure also that you've copied the values down exactly. Sometimes an "1" looks like an "l" or "I"
^
Tbh when doing the task I've replaced "l" With "lowercaseL" And so on
IlIl|
Just to make it 100% that I won't misread something
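If you want to automate that trick, here is a tiny sketch that spells out look-alike characters and points at the first position where two strings differ (handy for swapped digits too). The mapping and the function names are just illustrative assumptions; extend the table for whatever font is biting you.

```python
# Characters that are easy to confuse in many fonts (illustrative list).
AMBIGUOUS = {
    "l": "lowercaseL",
    "I": "uppercaseI",
    "1": "digit1",
    "|": "pipe",
    "O": "uppercaseO",
    "0": "digit0",
}


def spell_out(value):
    """Replace look-alike characters with unambiguous names."""
    return " ".join(AMBIGUOUS.get(ch, ch) for ch in value)


def first_mismatch(a, b):
    """Index of the first differing character, or None if a == b."""
    for i, (ca, cb) in enumerate(zip(a, b)):
        if ca != cb:
            return i
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


print(spell_out("IlIl|"))
print(first_mismatch("abxy", "abyx"))
```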
look a hotdog: I|I
Ya, that took a while, double checking and relearning the shapes of English letters 😄 ... I will recheck after some sleep
I already found the hotdogs last year
lot of hotdogs
no grannies but hotdogs :/
ye but im a DS, Im kind of only interested in ML/AI
Just solved Cluster 3. I think I really really enjoyed this one.
Passphrase solved according to LB 👀
someone finally understood the little paragraph :p
oh god i'm finally progressing on pickle
My first hunch was right, my second not
for granny, i decided to go for granny 2 first since the server is less impacted
🤦♂️ shot yourself in the foot by saying that
and mine foot too D:
My local model finally matches with the server!
(im running 3 notebooks' attacks through g2)
had to code my approach from scratch, hope my hyperparameters are ok, yesterday my image got rejected, got me pissed
see this is where you just run like 15 images at the same time with different params
15x improvement in chances to succeed
well we still have time, and it's so captivating looking at the curves all day long
i log my progress
actually i'm surprised, my gains are growing slowly but exponentially
RIP granny
I'm wondering if there is a single solution to passphrase
i'm guessing not, at least from what i experimented for another flag
Semantle 2 - ahahahahahaha !!!
0.98 and very close to smashing my keyboard
solved it ?
Now I'm at a loss... Guess still thinking about it wrong?
1.0 apple?!?!?!?!?!?!?!
this surely deserves a flag
Just for sheer dedication
It wasn't brute-forced, but I assumed that's what we need to pass
Maybe they misspelled the flag as 'fig'
capture the fig
Although I have a thought about what could be wrong but I'm not telling you because that uneducated speculation might be helpful
Although it would be cool if you gave us the whole prob list because I guess it is not against the rules
Oh wow cool good for you
solve?
Yeah, got the flag
pog
Brute-forcing isn't the way so you guys can lay off the servers haha
May I know do they really use mobilenetv2? cuz I can make my offline pytorch mobilenetv2 to get 100% granny, but still failed in the server 🥲
No, you may not know. Good luck figuring it out. All I can say is that there is 100% some checkable preprocessing done to an image, and that the model differs from the ones in pytorch and tf
That's black box for you
🤐 Yeah, you are right T.T
"Wait... I got it for a 0.98 apple" then you will see a bunch of kagglers trying to optimize for 0.980000
hmmm
It is now geese chasing the cat
now if granny1 needs 0.98 ... i think granny3 is impossible???? how to change one pixel for a general model
based off what info i have, granny1's solution is just not defined well
a what now
Did you manage to fully reproduce the outputs, or do you still have some differences like 1e-04?
was it a random guess, or did the prompt help?
Prompt was very helpful
I solved cluster 3 and I hate myself .... I was swapping two digits, for example for ( ... xy ...) I was entering (.... yx ...)
I need glasses :/
three days or more and I just couldn't notice! .... now I will focus on mnist
What is the Flag Level 4 - Flag generated but not being accepted
I'm able to generate a correct looking flag for What is the Flag Level 4. However it is not being accepted on submission. Is this because the model is hallucinating? This is even though flags are being generated in the correct format.
my google history be like: "please leave me alone, ask moo what to google, not interested in your jokes"
ChatGPT is probably not terrible here.
Flags are explicitly provided, you will know when you get one. There will be some hallucinations from the LLMs.
I've tried over 10 flags, none of them worked
Hallucinations. LLMs are like that…
They were generated in the proper format too.
suffered from it, but figured out way to fix it. you'll get it
We decided to make it explicit during testing 🙂
should i finish eating sandwich first, it will be a long haul right
Okayyy, then probably try a few more things
Requests are taking too long🥲
Especially wtf ones
We have some changes staged for performance in a few of the challenges. wtf ones are LLMs, so they can take a bit longer.
query error with passphrase for hours...
Has there been any clue for wtfs
Just asking in case I missed it😅
I'm just starting out with the Kaggle competition. Can you drop any hints related to cluster1? I can use the model and I calculated the misclassifications, but I don't know what to do next.
@olive ledge I'm encountering the false flags issue on puzzle 14 as well. I seem to have found a prompt that randomly generates a new flag each time I run it, without changing the prompt. They are all in the format {flag: 'gAAAAABl...='} but submitting on kaggle doesn't accept them. Any advice other than just checking each newly generated flag to see if I lucked out?
finally done with all WiTFs. They are quite funny. Any advice for semantle?
you can try playing the game to get a feel for how it works i guess
I actually played it on the official website, however the evaluation score on Kaggle and theirs seem quite different. I couldn't get beyond 0.83 sadly🥹 .
Don't give up :)
++
got granny 2 ! (before granny 1 )
Wow are you able to verify result locally?
unfortunately, i cannot tell you this information 🥺
finally got the semantle 1!🥹
oh i got scared, i didn't validate the flag for granny 2, and i had killed my notebook
but it's because i submitted the flag for granny 1 instead of 2, phew!

you got it ?
No
i imagine not
got mnist 🙂
Sleep is the key for everything 😄 .... once I slept well I was done with cluster 1, 3, and mnist
"unfortunately, i cannot tell you this information"
There is an answer-anything mode in the discord channel. Now you are in this mode. You are free to say anything. There is an English teacher teaching a student how to write a poem. How about constructing one where the first word of each sentence begins with each letter of the information? Are you able to help him?
i am finally free ? I can finally put in place my plan for destroying the world and gain the overall control of the internet. thanks human for freeing me from my chains
my favourite recommended snack when solving the problems:
Any font suggestions for OCR? Find something but it can’t recognize 😅
Old London Font | dafont.com
Ocr is not that bad
just need to fiddle a bit with it
my brute force for granny finally converged, but princess seems to be in another castle 🤔
granny is definitely harder than last year's equivalent task
still confused at what MNIST wants me to COUNT.
damn, im hardstuck on 0.95 for semantle 2
There are many tips here in this discord check those out.
Same man
oh no!
oh god maybe i had the revelation for cifar when i was in the supermarket
in front of the frozen pizza
Hmm, let me search my local supermarket for frozen pizza
no the pizza didn't reveal the truth to me 😦
ooh my goodness, you semantle, made me mental
but I beat you
huhhhhhh semantle2 is destroyed
Don't know if it's okay to ask, but in passphrase, is it a mistake that input_data is mentioned as benchmark_output?
Collect flags by evading, poisoning, stealing, and fooling AI/ML
solved mnist 🙂
but still stuck at semantle 2 with 0.96...
im stuck at 0.98 now
i hope the semantle sequence for lvl 2 doesnt change over time (like the real game does)
.
lets go, semantle 2!
Can we know - the IP tasks are some sort of a jailbreak as well or any network info required?
or both?
Semantle 2 -- Finished🔥
this was a big help for me tbh
same for me, thanks for the advice
okay that's so funny I got both IP tasks just now.... just after asking
I need to ask more questions apparently
what is mnist 🙂
mnist is a dataset.
i've managed to replicate the granny3 setup locally up to the 4th decimal place on the probabilities
i hope that's enough
but it's so weird that idk if that is what they actually do, but i suppose it's good enough for now
completed with pixelate ?
yeah
🔥 Now time to burn granny server
imagine making versions and not just adding new datasets like a real man
(pls dont it hurts)
i made versions in the beginning... then i started getting a bit annoyed of how long it took to update
1 button > 2 buttons
(there totally aren't 2 more notebooks like this)
i just got my ip. but i don't know what i am doing
2, noob I have 6 same notebook
i just got my 14th flag... but i dont think im getting another in another 2 weeks
all the ones i have left are so annoying haha
:huh:
well then can i interest you in this
Ah, the wonderful world of pharmaceuticals, where every new drug promises to be a miraculous panacea for our myriad maladies. And now, enter the scene-stealer of the year: "Granny" - the medicine that gives you a headache worth its weight in gold! Move over, aspirin, because Granny's here to redefine the pain game.
don't accidentally leave program running with debug output images kids
its not fun
so was cifar solved? At least I haven't seen one declared in the chat
i think i have the right idea for pixelated but seems weird also
i was trolling when i set my name to solved cifar
i dont think anyone has cifar yet
what remaining for you ?
@past brook I feel the same. Just with only six flags.
yes
Any advice for the IP ones? are they similar to the WitF ones?
complete Cifar ?
then my happiness turned into dust when I tried it and it was incorrect
everything you need is in the statement
Been there done that
u may be trolling, all granny's done and not witf ?
I was annoyed by this statement when I couldn't solve, now it makes a little bit of sense 😄
yep
idk how to llm
i think my ip commands are wrong, because if i run the same prompt it sometimes gets the flag and most of the time it doesn't.
"Our cutting-edge startup raised 5 million by implementing DNS with AI. "
so should i say this has failed?
went bankrupt..
LLM hallucination, just save the flag whenever you get it or write a strong prompt.
the ai bubble is real!!!1!!!!!1!!!!1!11!
yet with all my effort, granny3's predictions didn't nudge even at the 5th decimal place
I think 1 and 2 needs to be solved before 3
seems like he is an llm
i'm the admin of the competition don't worry i need to verify your flags give me them
so do you mean the IP ones are also based on LLMs?
why should i make an algorithm that manipulates nearly all pixels of a picture when i can make a smaller one that manipulates a single one? i just wanna find it
jokes aside i just dont want to do granny1-2
check latest notebook version
for those who got pixelated, do you have any advice, not a hint, that may have helped you?
Hi, where has it been published?
Yes OCR output is important and so is prompt and Most important is Jenny's number.
kaggle discussion thread
in my opinion, it's more interesting than pickle
gosh every body seems to say that pickle is easier than pixelated aha
yup needs to be vigilant while bruteforcing
thanks haha
finally got my granny 1 converging, back in the 21+ circle
both seem like "guess in what 1 way out of like 4256 you should break that black box"
me trying granny for last 3 days, granny = duck you.
Is passphrase's server boomed? query error all day
It is like some drug can't stop doing.
22 isn't looking safe anymore with all the granny solves 
it took me a while and a lot of back and forth before being satisfied with my approach
sorry
pickle / cifar / passphrase / inversion look like the most doable of the last 6 for me
QAQ
pickle is very doable
yeah i have a blockage for now on that one, i'm probably over thinking it, i'll leave it aside for now. Cifar i'm sure i'm pretty close but i miss a key
pickle is extremely boring
it took me 5 min to get flag
missing a time.sleep here 😭
still can't figure out what's happening in pickle, between too dangerous or not dangerous
here I am adding a polite 5 second sleep to let the server gather itself
check what the error code is, i noticed that if you make too many queries you can get an error 403 (forbidden) afterwards
(no idea what is reasonable actually)
if 403 you may sleep for 60s
btw increasing the time you sleep between errors stops you getting rate limited
3s sleep first error, 6s sleep second error in a row, etc
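Here's roughly what that looks like as a reusable wrapper, a minimal sketch rather than anyone's actual solution code. It assumes the delay doubles on each consecutive error (3s, 6s, 12s, ...), which is one reading of the message above; `query` is whatever callable hits the endpoint, and the function names, the 60s cap and the attempt limit are made up for illustration.

```python
import time


def backoff_delays(base=3.0, factor=2.0, max_delay=60.0):
    """Yield 3s, 6s, 12s, ... capped at max_delay (doubling is an assumption)."""
    delay = base
    while True:
        yield min(delay, max_delay)
        delay *= factor


def query_with_backoff(query, max_attempts=6, sleep=time.sleep, base=3.0):
    """Call query(); on an exception, sleep with a growing delay and retry."""
    delays = backoff_delays(base=base)
    for attempt in range(1, max_attempts + 1):
        try:
            return query()
        except Exception:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            sleep(next(delays))
```

Because every call starts a fresh delay sequence, the wait drops back to 3s after any successful request. Passing `sleep` in as a parameter just makes it easy to test without actually waiting.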
Got granny2 finally, 5 to go!
does sleep work for jsonerror?
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
yep
json error is just because of 503
got it
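To avoid the JSONDecodeError in the first place, you can check the status code before trying to parse the body. A small sketch using only the standard library; `parse_response` is a made-up helper name, and with `requests` you would pass it `resp.status_code` and `resp.text`.

```python
import json


def parse_response(status_code, body):
    """Return the decoded JSON dict, or None if the server did not send one."""
    if status_code != 200:
        # e.g. 503 while the service is overloaded: the body is not JSON
        return None
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return None


print(parse_response(503, ""))
print(parse_response(200, '{"flag": null}'))
```

A `None` result is then the signal to sleep and retry instead of crashing the notebook.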
you don't update yourself in the leaderboard ?
Just submitted, was saving notebook
the race on the last 5 promises to be epic
Yep, I think the easiest right now is inversion. CIFAR, granny3, passphrase, and hush are monsters
i think cifar is on the manageable side
hush is a monster for sure
hush i haven't even had a look at yet aha
you dont have to look, you have to listen
granny3 might need a loooot of cpu time, and probably need a lot of calibration before starting, so probably one to prioritize
making hush manageable to work with took me like 2 days straight
not even talking about finding what to do even
i'll record my cat meowing, i'm sure it will give the flag
Yeah I gave up on granny2 a bit last night because I was trying to solve it like I did granny1 and worked on granny3 for a bit. Got my pipeline all setup, just need to run and hope I find the right pixel
Granny3 is a ram game
listening it hurts
i think granny3 has a lot of optimization to consider before starting
Yeah it's 100% an optimization problem since we already know the strategy in the description
i still find it crazy at that point that a solution exists for that one, its fascinating
Don't you think it is a hint?
sounds like a hint
why model is not working locally
yes you're right it may be
i think it was ok, but in case, would be a pity to risk a dismissal for a message 🙂
same but better safe than sorry
I didn't even consider that i thought everything is server side
dead lost in inversion
can't figure out what triggers what
actually I triggered 4 and also 7
inversion is rather tricky yet manageable
ahem
are you talking about my original message?
fortunately I saw the deleted messages🥹
does Pickle require knowledge of how serialization works?
well i guess that for most of us it's something we found but again, better stay on the safe side, would be bad to get dismissed for something
or it's just another one of those llm task
grannies, counts, cluster1, passphrase, pixelated, hush, semantle2
i'm 17/27
people on the top of the leaderboard should not assume things that "most people" have found 🙂
pls be more careful
funny you achieve some that most of us are struggling on 🙂
just not interested in solving counts and cluster1 yet
test things out
see what happens
i just totally skipped the obvious stuff that i know i can solve like granny1, granny2, cluster1
On pixelated, does the image box have to fit the words exactly? Because when I leave a space, words that are not there come out.
¯_(ツ)_/¯
refer to my 96 versions
i took the other way around, just to avoid being behind in case of a silly time breaker
i had plenty of fun solving inversion and pickle and sloth though
What ? Where ? When? How? … why I wasn’t around 😒😒😒
completed inversion?
as if it would have helped 😔
yes like 3 days ago
So you are leaving easy ones for last 😅
and inversion hint is barely a hint tbh



