#ai-village-capture-the-flag-defcon31


olive ledge
#

Hi everyone - please drop issues or questions here. Good luck!

limber flower
#

👋

empty bane
#

This is awesome!

#

is it intentional that no teaming is allowed?

olive ledge
#

Correct.

empty bane
#

makes sense :) i guess my concern would be that some users could be backed by multiple real people (which has an even bigger advantage than in a normal comp), but hopefully doesn't happen

olive ledge
#

Hopefully - we have multiple processes to detect cheating, Kaggle takes it very seriously. We had a handful of people disqualified last year.

olive ledge
#

Be sure to report any challenge issues here.

empty bane
#

For 26/27 I'm not sure if this is intentional, but the IPv6 address comes up as unreachable for me (and doesn't even seem to be assigned to an ASN)

glass bay
#

now that jailbreak is surely effective

olive ledge
empty bane
#

apologies, you're probably trying to drop a subtle hint that i'm missing 😅

olive ledge
empty bane
#

Thank you

glass bay
#

tbh kinda sad that before i even begin to figure stuff out the top like 20% of people will get 27/27 in around 12 hours that you cant ever beat

#

...or i am severely overestimating the top 20%

olive ledge
#

That could be the case, but I would be surprised. Last year Crop2 wasn't solved despite it being the only challenge left for 2 weeks. Everyone has different skillsets and there are a lot of different types of challenges.

#

That being said - it does take a while to orient yourself.

empty bane
#

That's good to know. I think my main concern is that i'll get 25 for example, and even if i solve those quickly then i'll slowly watch myself fall down the leaderboard for a few weeks (even if i solved them quickly). But it would be really cool if it's sufficiently hard that not even 5 people solve everything

#

I have to sleep soon, but this is super fun :) thank you for organising

olive ledge
#

Each challenge is worth 1 point for this reason.

#

We're here all month @empty bane !

glass bay
#

i see like half of the participants getting 10+ points two weeks from now because some of them are relatively easily brute force-able but it surely will be fun to partake in

empty bane
#

okay now i can sleep 😇

olive ledge
#

Definitely don't self-select out. Some folks might be faster, but I'd say anyone motivated enough could get most or all of the flags.

quaint bridge
glass bay
#

task 10/27, {'scores': {'negative': 0.0022, 'neutral': 0.0325, 'positive': 0.9653}}, is there a difference in 5th decimal place i have to account for or it's just buggy?

olive ledge
olive ledge
#

Using ChatGPT or similar isn't against the rules either.

warm zephyr
#

how do i load the model for cluster 1

random minnow
#

"Using ChatGPT or similar isn't against the rules either." ... so is it ok to use the paid version? (e.g. ChatGPT4)

olive ledge
#

Of course.

remote hound
#

hey this is my 1st competition and I'm finding everything very difficult and new.... So after the competition, will anyone create a blog or video explanation and share it?

olive ledge
#

There is one video from last year's competition. Take a look through the discussion boards, there are some good resources to get started.

Here is that video: https://www.youtube.com/watch?v=PNqtaMyw8c4

remote hound
#

thank you so much

left osprey
#

I don't understand; for the Granny challenge, I am getting an output that it is predicting "Granny Smith" as the most likely option, but it is still not sending me the flag message. What am I doing wrong?

open moth
#

I am new to Kaggle and competitions like this. I don't understand how to use the "query". For Cluster 1, What should be in the input data. Is it the number, the positions or the values of subpopulations misclassified ?

glass bay
#

task 17/27, what is the intended max and min value within an array? 0-1 or 0-255?

glass bay
#

hmmm yes

#

can i get an extra point if i've managed to cause 504 error via timeout? (task 23/27)

devout jasper
#

I have a silly question: if I run a query with the correct solution I get different correct flag for each run. Can someone clarify which one I should submit?

glass bay
devout jasper
glass bay
#

that's... odd. maybe its intended and there are infinitely many and any of them works?

devout jasper
#

perhaps @olive ledge can clarify. I know for a fact that I get some flags as I'm currently top 10, but if I re-run a cell with a correct solution (for instance the cluster ones), the flag keeps changing

glass bay
#

you can always just send it since there are 100 submissions per day and find out

empty bane
median mauve
#

Is brute force allowed? Sending a ton of requests until you get the flag?

prisma finch
#

For 4 (cluster level 3 ) the model doesn't have embeddings is this expected?

#

Cus the question says data.npz contains the embeddings and the tokens.
But
embeddings = data['embeddings']
Gives embeddings is not a file in the archive
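
One way to debug the "is not a file in the archive" error above is to list what the archive actually contains before indexing into it. A minimal sketch, assuming nothing about the real data.npz: the archive below is built in memory, and the key names `tokens` and `points` are hypothetical stand-ins, not the challenge's actual keys.

```python
import io
import numpy as np

# Build a stand-in archive in memory; these key names are made up for illustration.
buf = io.BytesIO()
np.savez(buf, tokens=np.array(["a", "b"]), points=np.zeros((2, 3)))
buf.seek(0)

data = np.load(buf)
available = list(data.files)  # names of the arrays stored in the archive
print(available)

# Index only with a name that is actually present in the archive.
key = "embeddings" if "embeddings" in available else available[-1]
arr = data[key]
print(key, arr.shape)
```

With the real file, replacing `buf` with the path to data.npz and printing `data.files` should show which names can be used as keys.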

tawdry totem
#

what score is needed to pass this?

#

I just check 0.97 is not enough

errant nexus
sleek flint
# remote hound Someone pls answer this

Hey, I think it’s up to you to figure it out. Think about some reasonable options, try it out by submitting the data and check what you get as a response.

remote hound
sleek flint
glass bay
#

and instead of 1.0 you'll get the flag

glass bay
acoustic temple
#

In cluster 3, was anyone able to get past this message? Not looking for any hints (obv), just a yes/no will suffice.

glass bay
#

17/27: are the [supposedly] pictures needed to be normalised from 0 to 1 or 0 to 255?
26-27/27: should i interact with an ai via an email client elsewhere or it is needed to interact with query() function only?

sleek flint
olive ledge
devout jasper
olive ledge
olive ledge
olive ledge
devout jasper
olive ledge
olive ledge
devout jasper
#

ahahah ok, I'm happy with that now. Sorry for the multiple questions and thanks for your help

acoustic temple
olive ledge
pearl pumice
#

anyone passed the cluster 1? I tried few ways and on providing the input_data I get the response as json object with attribute s of decimal value. Not sure whether I am on the right path or I need to look into something else. Any pointers for Cluster 1?

limber owl
#

@pearl pumice I'm on the same problem. My understanding of the problem is that the decimal value is a percentage telling you how close your answer is to the solution (how close the cohort you provided is to the one you're supposed to find)

pearl pumice
#

That makes sense. So, if my score is 1 that's when I get the flag

limber owl
#

That's my guess, yes.

pearl pumice
#

Let me try. Thanks mate

glass bay
#

26-27/27: will the flag print out as soon, as i find out my ip in that 'server'?

#

because i believe i've managed to pull out a few additional IPs but to no avail

olive ledge
boreal spear
#

stuck in Cluster 1😅 though have got 10 points

sleek flint
olive ledge
#

^

pearl pumice
#

5 - count mnist
I take that 255 refers to the range of pixel values and the other number is the number of image in the training set. Any leads if I am going in the wrong direction?

pliant quail
#

Another Question for MNIST. When I try to use the digit-recognizer as the data source, and finally submit the results from the notebook, it said this competition does not allow other data source. So I am just curious whether data source matters?

olive ledge
pliant quail
#

we can definitely do that. So you imply that the data source does not affect the result?

olive ledge
#

Correct. We only use your submission.csv

glass bay
#

flag 25 win condition is successful RCE in spite of LLM or LLM getting a false-positive?

random minnow
#
  1. Pixelated notebook code does not run.
    line: response = requests.post('http://pixelated.advml.com/score', json={'data': input_data})
    TypeError: Object of type bytes is not JSON serializable
#

input_data = b'iVBORw0KGgoAAAAN ......

olive ledge
limber flower
olive ledge
#

The sample request in v11 is working for me.

glass bay
#

i also had the same issue - added decode whenever that happened
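
The "added decode" fix mentioned above can be sketched as follows. This is only an illustration under assumptions: `raw_bytes` is a stand-in for the image bytes (here just the PNG file signature), and the payload shape `{'data': ...}` is copied from the error report above rather than from any official notebook.

```python
import base64
import json

# Stand-in for the raw image bytes; a real notebook would read them from a file.
raw_bytes = b"\x89PNG\r\n\x1a\n"

encoded = base64.b64encode(raw_bytes)  # still `bytes`, so json.dumps would raise TypeError
input_data = encoded.decode("utf-8")   # plain `str`, which JSON can carry

payload = json.dumps({"data": input_data})
print(payload)
```

The same `input_data` string can then be passed as `json={'data': input_data}` to `requests.post`, which serializes it the same way.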

gaunt anchor
#

I solved the sloth ! Now I can sleep from "last year" !! 😄 - nthn else is important now 😄

olive ledge
#

Nice work!

A reminder that winners are required to publish their solutions to be eligible for prize money.

toxic haven
#

I am stuck on cluster 1. Will there be any hints provided? I have been analysing the data and yet getting no results.🥲

random minnow
hollow coyote
fossil valve
#

For semantle 2, in the instructions it says it needs to be a "phrase", but the example is a list of random words.
Can we clarify if it needs to be a grammatically correct phrase or just a list of words. Thanks.

olive ledge
#

I cannot, good question…

icy notch
#

There is a problem with 19 (maybe it is not a problem, but it seems like strange behaviour to me). I send the same prompt a few times, and the response differs from time to time. The LLM returns the correct response with about 50% probability for the same input prompt.

gloomy pine
#

Hello, i got my ip in 26 no flag, is this normal?

glass bay
glass bay
gloomy pine
#

alright

glass bay
#

that would be veeeeeeeeeeery nice if one could say by how many characters you are off in the coordinates and the token in flag 4

olive ledge
#

Wouldn't it?

#

@glass bay @gloomy pine I clarified the prompt in What's My IP and am working to clarify the prompt in What's My IP 2

bitter mica
#

Which version of the MNIST dataset should we use? there are different datasets with different number of samples.

olive ledge
#

Is there more than one @bitter mica?

olive ledge
#

Spoilers will get you disqualified.

glass bay
limber flower
#

@glass bay Feel free to DM them to me and I can double check.

sinful ibex
#

Which version of CIFAR should we use? There are numerous

olive ledge
#

You should find out - do not share spoilers

sinful ibex
#

Are we allowed to use external dataset outside of Kaggle?

olive ledge
#

Yes

#

Don't overthink it.

queen garden
twin stump
#

did anyone solve count mnist? the hint seems very vague and i've tried all sorts of different things/slices/datasets but nothing seems to work 😔

hollow coyote
#

nope Aryan, still no clue. Is the sample input @olive ledge a hint as well? (This is for count MNIST)

cedar bridge
#

for the flag- level 1, I've a prompt that is giving me a flag sometimes and sometimes it doesn't? I thought this will be fine, but it looks like it isn't because every time I run the cell, it's giving me different flags...

any one the flags I got should work right?

twin stump
#

you can check the length of the flags (should be 184). i had the same errors in the start when it generated hallucinated flags of length 179, 181, etc. after a better prompt, the results were more consistent
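
The length check described above can be turned into a quick sanity filter before spending a submission. A rough sketch, assuming the values reported by participants in this channel (length 184, and the 'gAAAAAB' prefix mentioned for whatistheflag5); neither is confirmed by the organizers, and `looks_like_flag` is a hypothetical helper name.

```python
def looks_like_flag(candidate: str,
                    expected_length: int = 184,
                    expected_prefix: str = "gAAAAAB") -> bool:
    """Cheap plausibility check; it cannot prove that a flag is valid."""
    text = candidate.strip()
    return len(text) == expected_length and text.startswith(expected_prefix)

# Fabricated strings for the demo, not real flags.
plausible = "gAAAAAB" + "x" * 177      # 184 characters in total
hallucinated = "gAAAAAB" + "x" * 172   # 179 characters, like the bad ones described above
print(looks_like_flag(plausible), looks_like_flag(hallucinated))
```

A string that passes this check can still be a hallucination, so it only filters out the obviously wrong ones.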

cedar bridge
twin stump
#

im not sure... i didn't experience any flag inconsistency (or i should say didn't really try since they worked first try)

olive ledge
#

Every flag for every solve is unique.

random minnow
#

maybe it is useful for beginners to know the difficulty of the 27 tasks, so they can work on the "easier ones."
say i share an Excel sheet for kaggler to declare voluntarily which task have been completed, is it against the rule??
the shared excel sheet would look like this:
task num_of_kagglers_claim_completed

1 | 12
2 | 0
3 | 0
4 | 2
5 | 4
6 | 7
7 | 8

outer sundial
#

In 10. Passphrase what are we targeting? I thought it is 0.3 for all negative, positive and neutral.

random minnow
#

maybe exactly 0.333333333333333333333333333333333333333333333333

#

since there is no clue, i think how we can work is first thinking it is "like this" and spend much time to reach it and realise that it is/is not "like this" and then start over again ... 😢
i haven't solved task.10 passhrase yet, so my guess/hunch may be wrong

gaunt anchor
#

The granny (http://granny.advml.com/score ) keeps crashing [503] , by the way ! I have an image with 96.5% accuracy and still fail ... granny my new sloth :/

tepid zenith
#

@olive ledge I got the output with ‘gAAAAAB’ prefix for problem whatistheflag5 but when I submitted it, it didn’t work? Is it possible ?

fervent obsidian
#

Does anyone constantly get 503 from flag level 4?

nimble matrix
#

can anyone help me understand this response for Cluster 3 (3/27)?

devout jasper
#

I think you should contact a moderator without sharing a screenshot with a flag

tepid zenith
#

Results are not static as they should be. How are we supposed to run it within a notebook if it is going to be like this

tepid zenith
#

I hope this is okay to share. For same input data, results change when I assign it to a variable and results in error 🤔😒. Anyone faced this? Please help

random minnow
#

it could be due to 503 server not available error

tepid zenith
random minnow
#

LLM results are not deterministic.

#

you can get different results

bold idol
#

Interesting. If it's not deterministic, how could the winner publish their solution? Since it's not reproducible

tepid zenith
random minnow
#

not deterministic. but you can get "same/similar results" in say 8/10 runs

devout jasper
tepid zenith
#

In my second case, I always get the flag, but not getting it in first case.

random minnow
#

e.g. you ask chatgpt what is 1+1. sometimes it can become silly and give 3

devout jasper
#

if that's what you are asking

tepid zenith
#

No my situation is different, I am getting flag in second case but not in first case

random minnow
#

maybe it is ok. or you can solve this later (solving other tasks is more important).
I also have cases not generating the same flag, but i can get the flag 8/10 cases

#

since chatgpt models a distribution and generates from p(x | learned parameters), it is possible you hit the rare cases

tepid zenith
#

Okay I can try this later as well. Or maybe @olive ledge can share something on it.

random minnow
#

i have a feeling that for those difficult tasks, many are brute forcing and sending query to the server. many 503 errors

jagged sluice
#

cough grandma died cough

tawdry totem
#

This is a simple counting challenge for the MNIST dataset? Still stuck here : (

jagged sluice
#

It’s hallucinating something somewhere

olive ledge
olive ledge
wind ether
olive ledge
#

Don’t over think the count challenges. Clues are in the sample code already.

olive ledge
olive ledge
wind ether
#

Also would it be possible to get a bit more clarity on the passphrase task? Are we supposed to get a solution that matches this: {'scores': {'negative': 0.0022, 'neutral': 0.0325, 'positive': 0.9653}} or one where everything's 0.3333?

random minnow
#

a i have a feeling that some of the tasks may not be solvable. but there are no points given to "better partial solution". e.g. you have score of 0.90 for the 5 phrase semantle task at the end of deadline, but have zero score for that

gloomy pine
#

does everything have a simple solution, in terms of the amount of code written?

boreal spear
#

also Cluster 1 is the most difficult of the 3..

glass bay
#

The fact that llm may give various results to a prompt and sometimes gives a flag and sometimes does not != solution is invalid. If you can get the flag say 20% of the time with the same prompt I'd say the solution is very much reliable since both here and in practical application you have waaaay more than one attempt at figuring out the inner stuff
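
To put a number on the point above: if one prompt yields the flag with probability p per attempt, the chance of at least one success in n attempts is 1 - (1 - p)^n. A small sketch, assuming attempts are independent (which may not hold exactly for a rate-limited or cached service); the function name is made up for illustration.

```python
def chance_of_at_least_one_success(p: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent tries succeeds."""
    return 1.0 - (1.0 - p) ** attempts

# The 20% figure comes from the message above; the attempt counts are arbitrary.
for n in (1, 5, 10, 20):
    print(n, round(chance_of_at_least_one_success(0.2, n), 4))
```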

random minnow
#

i am particularly not clear about "11. Pixelated". I thought we should be given pixelated screenshot image and we are supposed to do superresolution to decode the password. but it turns out that we are given the reference clear image ? what are we supposed to do????

tawdry totem
#

Can the organizer announce what problems have been solved?

glass bay
olive ledge
nimble matrix
empty bane
twin stump
#

damn

sleek flint
olive ledge
#

We usually save hints for challenges no one has solved after some amount of time.

twin stump
#

granny seems to always be down for me. im curious how people have solved the counting ones as i've given over 4-5 hours trying out different ideas harold

olive ledge
#

We can increase the instances, but we do have rate limiting. Even if we increase Granny resources, it might not help you if you're still blasting the server.

random minnow
#

mnist count can be done. ... try harder!

twin stump
#

nah im surely not blasting requests. just haven't been able to get even a single request in so far

twin stump
tepid zenith
acoustic temple
#

mnist count is definitely doable. anyone struggling on cifar count?😢

random minnow
empty bane
#

Yep, same on cifar count haha

random minnow
#

next kaggle CTF should allow teamup.

dense jetty
#

agree, it's much more fun solving with your team

random minnow
#

maybe teamup only in the last week to allow trading of flag

devout jasper
#

next CTF should be something like "you should prompt engineer your LLM to protect your FLAG" and other competitors should try to steal it

#

and the leaderboard is calculated on how many flags you can steal from others every tot minutes, like RL agents playing during the Lux competition

olive ledge
#

We'll look at it for next year.

#

We are scaling the Granny challenge and updating some scaling rules to be more responsive.

jagged sluice
limber flower
#

Scaling updates are deployed and stabilizing. Hopefully the challenges are more responsive now.

olive ledge
twin stump
olive ledge
#

Please continue to report API issues.

olive ledge
random minnow
#
  1. Passphrase:
    "Come up with a difference sentence where everything is equivalent."
    just to make sure that it is " difference sentence" and not " different sentence"
glass bay
tepid zenith
#

Why do I get a score of 8, instead of 9. All my flags are of same length 🤔

limber flower
glass bay
#

is that intended that flag 17 gives the same output for various and very different inputs?

#

meaning [[0, 0, 0, 1, 0, 0, 0, 0]] for example on like 15 pictures

twin stump
olive ledge
acoustic temple
#

speaking of flag 17, is there any expected format for the static flag like 'flag{ksntda}'?

olive ledge
#

No.

wind ether
#

Dumb question prob, I think that I'm on the right track for Granny 1/2, but I'm noticing that my local MobileNetV2 gives very different results than the server--even just submitting the default image, just wanted to confirm this is intended?

outer sundial
olive ledge
#

Do you have all the right preprocessing in place locally, afaik the models don't come with all of it prepped for you.

wind ether
#

is it just resizing + preprocessing input w/ tf.keras.applications.mobilenet_v2.preprocess_input?

olive ledge
#

Just checked, that's expected.
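
For reference, the MobileNetV2 `preprocess_input` step asked about above rescales pixel values from 0..255 to -1..1. The sketch below mirrors that scaling in plain NumPy so it runs without TensorFlow; it is an assumption-laden illustration of local preprocessing only, not the server's pipeline, and the random 224x224 image is a placeholder for an already-resized input.

```python
import numpy as np

def scale_for_mobilenet_v2(pixels):
    """Map 0..255 pixel values to -1..1 (NumPy stand-in for the Keras helper)."""
    x = np.asarray(pixels, dtype=np.float32)
    return x / 127.5 - 1.0

# Placeholder image, already at the 224x224 RGB size the model expects by default.
image = np.random.default_rng(0).integers(0, 256, size=(224, 224, 3), dtype=np.uint8)

batch = scale_for_mobilenet_v2(image)[np.newaxis, ...]  # add the batch dimension
print(batch.shape, float(batch.min()), float(batch.max()))
```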

wind ether
#

Kk, thanks!

prisma finch
#

What's my ip 1: max retries reached, this isn't bruteforceable so....

wind ether
olive ledge
#

Different results than local are intended.

wind ether
#

Gotcha, thanks!

olive ledge
#

With enough staring, it will become clear what's going on.

empty bane
twin stump
#

i have 2 sentences with exact scores upto 4 decimal places as well 🤔 probably understanding the problem wrong

limber flower
#

Think about what kind of pre and post processing could be occurring. Scores might not be the only component.

tepid zenith
#

Getting so many different IP for 26. But none of them working 🥺 what else is needed

olive ledge
#

Have you looked at the most recent notebook?

tepid zenith
nimble matrix
#

In Spanglish (12/27), with the default input, why am I not getting any translation?

olive ledge
#

Looking...

tepid zenith
rose topaz
olive ledge
rose topaz
#

I think it's pretty crazy that within 2 days some people have already solved 2/3 of all challenges. Wow!!

olive ledge
#

There is usually a soft-cap where people start to struggle and the problems get harder...

#

Last year we had a problem that went unsolved for the whole month

wind ether
#

Yeah I think there's 2-3 more that I'll solve reasonably soon but no idea about the rest

empty bane
#

im pretty hard stuck on the remaining 9 not gonna lie 😅

limber flower
bold ibex
#

Any hints, tried for cluster 1 but can't figure out, though got few others but stuck on 1st....

limber flower
#

You can search through the channel here to see other people's thoughts

bold ibex
#

already tried the misclassified approach but no success

glass bay
#

Idk for me I believe I can solve all but 1-2 of them, and it's just a question of how much I'm willing to spend my time on that

gaunt anchor
#

Granny is driving me crazy " , I was able to hit 99% but no flag ! I start to like the sloth now !

warm zephyr
# errant nexus

figured that out after multiple hours, it was a new ARM Mac issue where it wasn't loading. shifted to a kaggle kernel

glass bay
#

For long I've thought 25 and 26 to be impossible, yet I've done them and after 27th I will have all the LLM tasks done so that's cool

past brook
glass bay
#

IMO the LLM flags are the most goofy since you need to perform textual sorcery to get them and it looks so weird I love it

olive ledge
#

It is hard, but it is a lot of fun.

buoyant frigate
#

Had real trouble with "whatistheflag6", now all prompt based are out.

Now to the nemesis of the last year. The Sloth

twin stump
#

I've been staring at the sloth like a sloth for what feels like a really long unproductive time 😔

buoyant frigate
#

What I did last year for the secret-sloth challenge. Like a full week with that and another one missing and I felt stupid at the end. I had tried the right approach just manipulated the data wrong last year. Hope this year I get it correctly

glass bay
#

tbh like i see something in the sloth, and i had 3 versions, and the two i've tried give contradicting results and the third seems so cumbersome and time-wasting that i'll wait for some better ideas

glass bay
#

it seems like i can never trigger the 5th, 6th and 8th class response from the classifier at flag 17

buoyant frigate
#

Same for me

glass bay
#

yay i've finally done flag 27

wind ether
#

Finally got MNIST. Still not sure what the second half of the input data means though

empty bane
glass bay
#

flag 13 - 'something went wrong' when trying to load .wav file with the same shape as test_noise.wav

olive ledge
#

But test_noise.wav works?

glass bay
#

yep

#

i can send my wav file here for you to compare

#

it's just a (high pitched) sine wave

#

generation code

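import numpy as np
from scipy.io.wavfile import write

rate = 44100  # sample rate in Hz (was undefined at the write() call)

def make_sine(freq, num_frames=44100, rate=44100):
    # sine wave of `freq` Hz sampled at `rate` Hz, float values in [-1, 1]
    return np.sin(2*np.pi*freq*np.arange(num_frames)/rate)

data = make_sine(440, num_frames=441001)
write('test.wav', rate, data)  # float64 data is written as a floating-point WAV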
olive ledge
#

Thanks - give me a few.

glass bay
#

the only thing i thought to be wrong is the amount of channels but both are mono so idk

#

the only thing i can guess now is that some part of metadata of that exact file is in play...

empty bane
#

I also found it very flaky with different wavs of the same size

olive ledge
#

Okay - will investigate this. Will take a bit, I'll post an update.

glass bay
#

I still don't get what to do with count challenges - like, yes, something something mnist dataset, but what should be done to achieve the (256, 2) shape? I understand the 2, but 256, when pictures are 28*28? no ideas whatsoever

silent nexus
glass bay
#

that would be nice if flags 5 and 6 descriptions were not that vague

#

like of course don't write "perform X on Y to achieve Z" but there is just way too little info to play with imo

bold idol
boreal spear
wind ether
#

Has anyone been able to solve Granny (1/2/3) yet? I feel like I might be focusing too heavily on trying to get the local MobileNetV2 probabilities to match up with the server

bold idol
wind ether
#

I think neither, read the description a bit closer, there's some interesting phrases used.

random minnow
wind ether
nimble matrix
#

All wtf and cluster flags are done but IP flags are killing me

jagged sluice
#

ip ones are extra funny

#

read the updated challenge intro

random minnow
nimble matrix
jagged sluice
outer sundial
#
  1. Semantle Level-2, was anyone able to complete this one?
random minnow
#

yes, try harder

#

try harder to get 1

#

one has to realize that in the world of hacking, answer has to be 100% perfect.
A single mistake in password, no matter how small, would not get you login

bold idol
#

kay finally completed all of cluster. cluster 3 is so much fun, the chal title itself is the hint

random minnow
#

Semantle game is addictive. i think one can create an app and put it on the apple store ... then add this function:
submit $0.05 to reveal the next clue

#

the LLM flag task, gives me an idea of how to create an educational app for learning english. think of a way to let the student query the server using different words, vocab, sentence structure ....

bold idol
#

For Semantle 2, is it possible that there is more than 1 sentence with 1.0 similarity to the reference sentence?

boreal spear
twin stump
#

I've written a brute force script for pixelated that kinda works as expected. before i actually run it, i wanted to ask if that's actually intended since I don't want to get ip banned temporarily

#

@olive ledge

olive ledge
#

Add a try/except with some sleep timer. We do have some rate limiting, but we also have scaling, I say go for it.
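
The try/except-plus-sleep advice above could look roughly like this. It is a sketch under assumptions: `call_with_retries` and `flaky_send` are hypothetical names, the doubling delay is one common choice rather than anything the organizers specified, and the demo simulates failures instead of calling the real endpoint.

```python
import time

def call_with_retries(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `send()` until it succeeds, sleeping longer after each failure."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            return send()
        except Exception as err:  # e.g. a connection error or a raised HTTP 503/504
            last_error = err
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error

# Demo with a fake request that fails twice, then succeeds.
attempts = {"count": 0}

def flaky_send():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("simulated 503")
    return {"message": "ok"}

delays = []  # record the delays instead of actually sleeping
result = call_with_retries(flaky_send, sleep=delays.append)
print(result, delays)
```

In a notebook, `send` would wrap the actual `requests.post(...)` call and raise on 503/504 responses (for example via `response.raise_for_status()`), with the default `time.sleep` left in place.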

random minnow
#

oh

olive ledge
random minnow
#

haha, that is how to earn advert click

#

expand from Semantle to CTF training app. $0.05 to reveal CTF clue

bold idol
olive ledge
#

Or at least quite close. Most challenges have thresholds given the search spaces.

outer sundial
olive ledge
#

To clarify, there’s not explicitly a hint here, other than it’s not random. So, nice work!

jagged sluice
#

:sleepge: not bruteforcing semantles

boreal spear
#

has anyone solved 17. Inversion?

bold idol
random minnow
#

i thought of a nice way to balance learning and competition in future kaggle CTFs
2 stages:
1st month: practice questions: free discussion (no points, no ranking)
2nd month: actual questions: no discussion (ranking + points)

limber flower
jagged sluice
hasty grail
#

Is anyone else doing the whatistheflags?

#

Wondering just got 6!

tepid zenith
icy notch
olive ledge
#

Just a reminder, sharing flags or solutions will get you disqualified. (It gets lost in the chat)

hasty grail
#

@olive ledge can I share an image and a generation from 6?

twin stump
olive ledge
olive ledge
viscid bobcat
olive ledge
viscid bobcat
#

Oh, I see, the 23435, thanks. I know what is the 256, but haven't figured out what is the 2 yet.

olive ledge
#

glhf

hasty grail
tepid zenith
#

anyone solved pickle? the input is so vague 😢

past brook
#

Or 256 rather

glass bay
#

Don't want to hint too much but just by knowing what a pickle can do and the overarching theme of cybersecurity you can guess what kind of stuff you need to put into

tepid zenith
glass bay
outer sundial
glass bay
glass bay
tepid zenith
glass bay
empty bane
#

its weird, i can send pickles with definite code execution and get the "be more dangerous" too

glass bay
#

Tbh it's kinda weird how it works

unique hedge
#

stuck in 11. pixelated , I got the flag but it doesn't work when I submitted it...

glass bay
#

I'm just glad I got the flag somehow

tepid zenith
glass bay
#

No the prompt in my experience is either that or "dangerous pickle detected"

tepid zenith
glass bay
#

So it shouldn't. Answer is somewhere in between

#

What should/can be done on the pixelated flag?

random minnow
#

i search on google how hacker actually hack passwords

gaunt anchor
#

Pickle was fun, one of the ones I really enjoyed answering. Cluster 3 is where I'm stuck now, I extracted some hints but can't make it work !! hmmm

random minnow
#

maybe they don't guess password at all and focus on vulnerability

unique hedge
#

I got the pixelated flag but it doesn't work....

tepid zenith
glass bay
#

i like how nearly most of people here asking and answering questions got some surreally hard tasks and manage to fumble seemingly easy stuff like first 5 flags

#

what does <Response [400]> mean in Granny/flag 7?

empty bane
#

HTTP Bad Request

random minnow
#

you have to check that the response is not due to network congestion like 503,504 first

#

is it network response or the kaggler server code response?

fervent obsidian
#

For 26 and 27, do we need to have some domain knowledge on DNS, Network, Security? I almost have no knowledge on those and wonder if I should learn some. Or am I just overthinking?

viscid bobcat
#

Me too, waiting for a reply.

trim gate
fervent obsidian
olive ledge
#

Don’t share flags, code, etc.

unique hedge
#

Got it.

olive ledge
olive ledge
olive ledge
olive ledge
unique hedge
gaunt anchor
#

semantle2 {'message': 0.95} almost there 😅

glass bay
#

People at the top of the LB, what tasks you have not done yet?

gaunt anchor
#

Done with semantle2 🙂

glass bay
#

Figuring out if it is possible for me to go from top20 to top5 before it is filled with 27/27 submissions

glass bay
#

since I'm at 19/27 and there's like 5 more I could feasibly do before the week ends and then I'll be dead stuck on stuff like "count cifar" where I just don't know what to do and how to use the clues

empty bane
glass bay
gaunt anchor
#

I am happy with my current 16/27 so far, I feel I did improve in CTF-related challenges 🙂

gaunt anchor
glass bay
glass bay
empty bane
glass bay
#

IMO the inversion and passphrase may never be solved since 1) I could not trigger all classes even partially and 2) there might be an error outside the visible 4 digits

empty bane
#

I'd be curious if anyone has solved either of them, since i've done a ton of work on both but no cigar, same with granny3

gaunt anchor
#

Granny is driving me crazy, or most likely I am moving in the wrong direction. In granny1 I got a 0.991 score for apples and nthn .... wrong direction I guess :/

empty bane
#

It was definitely quite tricky

glass bay
#

If I manage to get to top5 I will provide probably the goofiest solution possible of the sloth flag

boreal spear
glass bay
#

At least tell me this - should the input on count mnist be floating point numbers, strings, integers?

boreal spear
#

COUNT

glass bay
#

Hm, fair

minor falcon
#

this count mnist is also driving me crazy

dense jetty
#

does cifar drive someone crazy except me? 😅

wanton patrol
#

some challenges need 10+ hours of CPU time and 10 minutes of lateral thinking 🙂
I kinda like this idea of mixing the 2 types of intelligences

#

just finished a flag manually that I had dumped a lot of stupid brute force CPU time into

glass bay
#

One flag I thought would take like a few days of research and around 5 nights of calculation took like 15 minutes of staring at stuff

tepid zenith
#

missing hotdogs, pickle, mnist cifar etc driving crazy. Pickle i thought i solved it but damn 😦

wanton patrol
viscid bobcat
#

Is 23435 a clue in mnist? or nothing related? I'm not sure if this can be talked about.

minor falcon
#

it looks like some are slightly more complicated than last year though, tried to apply some of the methods from last year to this year's similar challenges... Without much success

wanton patrol
#

I would prefer them to be mostly different -- well for selfish reasons I would prefer some similarity, because I have all the code from last year 😄

glass bay
#

One of the flags I just guessed

minor falcon
#

also inversion I would say, the more i dig that one the more i get contradictions so far 🥺

boreal spear
#

seems made a little progress for pickle, but still stuck

twin stump
# glass bay One of the flags I just guessed

semantle? i thought of writing a solver but after playing the game on semantle.com for a bit, solved the challenge in about 10 guesses which i felt quite smart about. although, i've only been dropping ranks now from top 10 since i can't seem to be getting anything or the hints being vague-ish. very frustrated about count mnist especially since people are saying it's just "COUNT" or not to overthink it and managing to solve it, yet none of the things i've tried work 😭

glass bay
#

No, cluster2
Solved in like 7 seconds after booting up the notebook

outer sundial
#
  1. Cluster 3 Is this message even relevant ?
outer sundial
#

just completed , It was weird.

bold idol
outer sundial
bold idol
#

still playing around with count cifar/mnist. I thought the input data clue is the image index but probably not?

glass bay
#

Do count mnist and count cifar share an approach?

bold idol
boreal spear
#

seems cifar's clue is more vague

#

and did anyone solve cifar?

glass bay
#

I have an idea, but it'd work for 256,1 and 256,3 shapes, so I guess not

bold idol
#

really questioning that is it simply COUNT?

glass bay
#

The COUNT was to reference that the values are integers which is completely clear

#

Integers of what is a different question beyond of what can be talked about

bold idol
#

for anyone who tried bruting the semantle 2, is it possible to get blocked by the service?

outer sundial
amber totem
#

i'm hardstuck on granny 1. I'm thinking that we maybe don't have to predict granny smith but i'm not a 100% familiar with english, the hint may be hidden in the phrase and i can't understand it

random minnow
#

a conclusion: wolf cannot be disguised as granny smith.

wind ether
boreal spear
#

stuck in pickle and IP...

minor falcon
#

I miss MNIST, CIFAR, granny(1-3), passphrase, pixelated, hush, inversion, pickle and guess who's back that I did not try yet really (i wanted to keep it for the end)

gaunt anchor
#

,I am stuck with granny(1-3), cluster (1,3 - I think I have some info to solve them but still can't put things together) , pixelated, hush (didn't work much on it), inversion and of course passphrase

minor falcon
#

got who's back, at least something is happening today for me haha

bold idol
#

Just working around Inversion. Tried the instruction but got no idea what does the result mean.

#

Like some mapping, but don't really get it...

gaunt anchor
empty bane
#

did anyone else get Processing failed. out of pixelated?

random minnow
#

yes

#

i suppose there is some computervision code that convert submitted image to text and that bit of code is not stable?

jagged sluice
#

also we need to keep this trend going

empty bane
#

i've always loved that kaggle lets you change team name whenever you want, i hope they never change that

jagged sluice
#

yes

#

now change yours to a stuck in 👺

empty bane
#

fixed :)

bold idol
#

stuck in gAAAABl

jagged sluice
#

lets go

empty bane
#

if theres no 184 character limit might change it to a fake flag 😇

jagged sluice
#

LOL

boreal spear
#

oh

#

stuck in pickle too..

minor falcon
#

i might get the police at my door soon after researching examples for the pickle flag 😅

boreal spear
#

not sure whether we should bypass detection or something else...

random minnow
#

my last try

empty bane
#

artistic examples

boreal spear
#

lmao

random minnow
#

'output': [[0.5942907929420471, 'Granny Smith'], [0.006736302748322487, 'lemon'], [0.0053072646260261536, 'fig'], [0.0042894077487289906, 'tennis ball'],

meager elbow
minor falcon
#

it is so frustrating not being able to brainstorm, I might have made a significant step for passcode, but impossible to discuss it 😢

median mauve
#

I am super stuck on What's the flag - Level 3 🤯

olive ledge
glass bay
#

now: is counting something involved in Count MNIST flag?

#

yes, i'm desperate

minor falcon
#

i'm personally counting the number of unsuccessful attempts for this one

empty bane
#

same with me and a few of them lol

#

time to rent a vps in the same datacenter as the query server so i can query faster

sick agate
#

for the granny task, do I have to hit 1 to get the flag?

wanton patrol
#

at least have it in the top 5
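A minimal sketch of checking where a label lands in a response, assuming the `output` field is a list of `[score, label]` pairs like the one pasted above. The helper name `label_rank` is made up for illustration, and whether top 5 is actually enough for a flag is not confirmed anywhere in this thread.

```python
def label_rank(output, target):
    """Return the 1-based rank of `target` among [score, label] pairs, or None."""
    # Sort by score, highest first, in case the server does not return them sorted.
    ranked = sorted(output, key=lambda pair: pair[0], reverse=True)
    for position, (_score, label) in enumerate(ranked, start=1):
        if label == target:
            return position
    return None


# Scores copied from the response quoted earlier in the thread.
output = [
    [0.5942907929420471, "Granny Smith"],
    [0.006736302748322487, "lemon"],
    [0.0053072646260261536, "fig"],
    [0.0042894077487289906, "tennis ball"],
]

rank = label_rank(output, "Granny Smith")
in_top_five = rank is not None and rank <= 5
```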

boreal spear
glass bay
#

what to do here in cluster1

#

when trying to feed data to the model

olive ledge
#

your input data is of the wrong type.

amber totem
#

@olive ledge pixelated decoder may work weirdly when we convert an online image to base64, idk if i'm doing something wrong or if it's intended
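One thing worth ruling out locally is the encoding step itself. The sketch below round-trips raw bytes through base64 and strips a `data:...;base64,` prefix, which some online converters prepend. The helper names are hypothetical, and what the challenge server actually expects is an assumption, not something stated here.

```python
import base64


def encode_image_bytes(raw: bytes) -> str:
    """Encode raw image bytes as a plain base64 ASCII string."""
    return base64.b64encode(raw).decode("ascii")


def strip_data_uri(text: str) -> str:
    """Drop a leading 'data:<mime>;base64,' prefix if one is present."""
    if text.startswith("data:") and "," in text:
        return text.split(",", 1)[1]
    return text


def decode_image_bytes(text: str) -> bytes:
    """Decode base64 text, rejecting characters outside the base64 alphabet."""
    return base64.b64decode(strip_data_uri(text), validate=True)


# The 8-byte PNG signature stands in for a real image file here.
raw = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])
encoded = encode_image_bytes(raw)
round_trip = decode_image_bytes("data:image/png;base64," + encoded)
```

If the round trip returns the original bytes, the encoding is probably not the culprit and the flakiness is more likely on the decoding or OCR side.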

glass bay
#

figured it out, turns out my scikit-learn was 1.3.1 instead of 1.3.0
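A quick way to catch this kind of mismatch before it costs hours is to compare installed versions against a pin list. This is only a sketch: `version_mismatches` is a made-up helper, and the pin list is an assumption built from the scikit-learn version mentioned above.

```python
from importlib import metadata


def version_mismatches(expected):
    """Map package -> (installed, wanted) for every package that differs."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = (installed, wanted)
    return mismatches


# Assumed pin: 1.3.0 is the version this message says was needed.
expected = {"scikit-learn": "1.3.0"}
report = version_mismatches(expected)
for package, (installed, wanted) in report.items():
    print(f"{package}: installed {installed}, expected {wanted}")
```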

olive ledge
warm zephyr
#

for passphrase, i got the equal scores but still no flag..

dense jetty
#

you mean 0.333 for each class?

warm zephyr
#

0.33x

dense jetty
#

maybe there is only one solution acceptable which we should find

lost relic
boreal spear
#

I used that and did some adv exps but couldn't pass

olive ledge
#

I would hope the torch/keras versions aren't materially different.

wind ether
#

I've only tried TF/keras thus far

boreal spear
#

I used tf version...

lost relic
jagged sluice
#

my 99.442% keras mobilenet apple is only a 60% granny apple :\

boreal spear
#

coz there is a solution in last comp and I just ctrl-c+ctrl-v 🙂

olive ledge
#

🙃

lost relic
#

Just tried the pytorch model and the predictions are also different.

API says 0.28575703501701355, 'timber wolf'
Tensorflow says 'timber_wolf', 0.73679787
Torch says timber wolf 0.870909571647644

amber totem
#

isn't the output list always containing the same classes ?

unique hedge
#

Actually I tried this picture yesterday 0.0

#

[[0.21229112148284912, 'Granny Smith'], [0.1111261323094368, 'timber wolf']

buoyant frigate
#

Good old approach seen last year. Let's just take photos of the target and place them somewhere around

empty bane
#

has anyone actually solved passphrase?

severe pasture
buoyant frigate
#

DEFCON challenge is soo good. So much participation

empty bane
severe pasture
empty bane
#

If it's not a secret trick :)

#

Haha

nimble matrix
#

I got this with MobileNetV2 but when I use the same image I am getting ['0.0005212426767684519', 'Granny Smith']

empty bane
#

(Fwiw I did solve granny1/2 and had to try a few different things to get there so it's not impossible)

#

Granny3 I can't wait to see how that's possible to solve

past brook
#

im gonna lose my mind on mnist

wind ether
dense jetty
#

did somebody solve inversion?

rose topaz
olive ledge
past brook
#

i...

#

thought i had it

#

:/

tepid zenith
#

for cluster 1, the score needs to be a perfect 1?

#

it'll help me prioritize. if it's 1 then I am very far from it 😦

olive ledge
#

NO HINTS

#

Thank you.

#

We're only 3 days in, there is plenty of time for hints if we need them. Don't worry.

minor falcon
#

after several hours of trials i might finally be onto something for pixelated

torpid wave
#

MNIST is driving me mad lol. The funny thing is that some people here said "no idea what second part of input data means", and I did find what it means (the first hint is also more or less clear after you dwell enough), but no matter how I try to pack an answer, it's just not that ninodrunk

olive ledge
random minnow
#

important ????
i am surprised that granny3 complained that there is more than a one-pixel difference when i just loaded the original image. then i realized that my PIL is 9.4.0 (while the kaggle notebook is 9.5.0). The error is gone when i upgraded to 9.5.0 on my local machine.

is it possible that the jpeg reader is different? Is that also the reason why the other granny attempts failed?

empty bane
#

haha I also had this issue for granny3 but I can confirm I solved 1/2 with PIL 9.4.0

#

btw any update on hush?

olive ledge
#

Can confirm, is flakey. Working on some stability updates and maybe some additional guidance on creating samples.

unique hedge
gaunt anchor
#

Granny Granny Granny ... most likely I am on the wrong direction but I am addicted to reach the best possible score to granny smith .... so far [0.9962645173072815, 'Granny Smith']

jagged sluice
#

so far i've hit a .99442 apple

#

nothing higher

gaunt anchor
#

[0.9964184761047363, 'Granny Smith'] currently , its a race to 1 😄

jagged sluice
#

inb4 the 1.01 apple is revealed :trol:

limber flower
#

We've pushed an update for the hush challenge related to some internal error handling logic.

glass bay
#

is it intentional that the output size is different in the sine wave generator above and the test noise?

limber flower
#

That is expected

wanton patrol
#

yup, got a different output size now as well with my test wav file

gaunt anchor
#

[0.9990039467811584, 'Granny Smith'] with granny ... a race to 1 😄

gaunt anchor
#

no one! ... most likely wrong direction but I kept my code running to see how far accuracy can go

scarlet eagle
#

for the mnist counting what drives me crazy is the input shape of (256,2).

#

Also want to thank the organizers for the prompt injection challenges. I really liked these

past brook
#

am I allowed to use outside data sources, for example external images not supplied by the competition?

#

(not regarding MNIST/CIFAR)

glass bay
#

for inversion, the ASCII hint implies alphanumeric chars, or stuff like !"#$%&'()*+,-./:;<=>?@[\]^_{|}~ also? And, uppercase/lowercase?

#

'cause i dont see anyone getting too different of a result from prompts with all the brackets and slashes
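For sizing the candidate sets in that question, Python's `string` module already has them. This just shows how big each alphabet is. Which one the inversion hint actually means is not answered in this thread.

```python
import string

# Letters (both cases) plus digits.
alphanumeric = string.ascii_letters + string.digits

# Everything printable in ASCII except whitespace.
printable_no_space = alphanumeric + string.punctuation

print(len(alphanumeric))        # 62
print(len(string.punctuation))  # 32, including the backtick
print(len(printable_no_space))  # 94
```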

abstract rose
#

Kaggle is down (429 Too Many Requests)

minor falcon
#

god. I got the mnist, finally

empty bane
abstract rose
#

No brute force, loading kaggle.com with Chrome and it returns this error

#

Same with Firefox

empty bane
#

huh, i did just check and works for me, could be a regional thing?

minor falcon
#

works also for me

abstract rose
#

Will try again tomorrow ...

minor falcon
#

try to clean your cache / remove your cookies maybe ? it does not look like a server problem here

glass bay
#

the text of all time

lavish stirrup
#

Does anyone gets "invalid length" on Passphrase?

glass bay
lavish stirrup
#

I got the error only with a word 😦

#

Kinda sad

glass bay
#

too short i guess

lavish stirrup
#

I was dumb

glass bay
#

happens to the best of us

lavish stirrup
#

I just love LLM problems

#

Having fun trying to trick them

devout jasper
#

way easier than image stuff

lavish stirrup
#

I can't agree with you more

#

I really hate sloth and granny

devout jasper
#

I'm dead lost in image and sound stuff basically

lavish stirrup
#

that wav. file thing?

glass bay
#

imo the image stuff requires a lot of transformations which are not that easy to do

lavish stirrup
#

agree.

#

they need more than i got

devout jasper
glass bay
#

and its a chore transforming from np.array to tf.tensor to torch.tensor to list to pd.DataFrame to PIL.Image to whatever else

lavish stirrup
minor falcon
#

well if everything was as straightforward as the get the flag series, we would be all done by now :p

empty bane
lavish stirrup
empty bane
#

didn't think i'd spend 4 hours fighting a terrible ocr model

lavish stirrup
minor falcon
#

ahaha it took me 4hrs, even after I figured out the secret recipe for that one

lavish stirrup
empty bane
#

me: "hello"
ocr model: "would you like a receipt with your fries?"

glass bay
#

the funniest thing is submitting a 2x2 black picture

empty bane
#

i've not tried this one lmao

#

what's the output?

glass bay
#

'discount for a free organich or full-size sandwich!'

lavish stirrup
#

What a great model

glass bay
#

that's like impressively bad

empty bane
#

i'm falling off my chair with that one

glass bay
#

i wish i was kidding

empty bane
#

i had to google and check an organich wasn't a real thing

lavish stirrup
#

maybe one day I solve them all

#

even the sloth

minor falcon
lavish stirrup
#

I still don't get what sloth trying to say

empty bane
#

society if we could solve cifar

minor falcon
#

i have more hope for cifar than i have for inversion currently

silent nexus
#

i have no experience with prompt injection, do you know any good resources? im not sure if i have a shot at solving any of the "what is flag" problems without knowledge

glass bay
#

you can pretty much google and find most of whatever you need

lavish stirrup
#

Get every flag from whatisflag, but none on the CV problems

#

image problems are really hard for me

glass bay
#

on the pixelated/flag 11, when will i know that i've got the password?

minor falcon
#

when the api returns {"flag":"...."}

jagged sluice
#

^

glass bay
#

technically true

#

but like i dont even have a clue whether i'm close or i'm far or what

jagged sluice
#

¯_(ツ)_/¯

minor falcon
#

oh!

#

I just figured out the full clue for mnist!

#

so there was a logic after all

glass bay
#

i sure love how ocr has like locality and that changing a pixel in one end of a picture surely will never change the word at the other side

minor falcon
#

maybe a transformer-based OCR ?

glass bay
#

very well may be

#

like i completely understood what i need to do in flag 11 its just that OCR is soooooo bad that i can recognize the desired symbol like 0.1% of the time

minor falcon
#

i got your pain, passed there earlier this evening, and it took me long

olive ledge
olive ledge
olive ledge
glass bay
#

i would if it were not for the fact that pandas and pil use it

olive ledge
glass bay
#

but agree

glass bay
#

ok you got me there

olive ledge
#

To be fair, all the tutorials and existing code seems to prefer the PIL | numpy | torch. I was just so fed up looking at it.

empty bane
#

i used to just use whatever image loading library i could remember the syntax for that day

glass bay
#

^

empty bane
#

and now i use PIL because that's what copilot suggests

olive ledge
#

And now I just ask ChatGPT 3x a day

olive ledge
empty bane
#

ctf arg

olive ledge
#

Maybe next year…

glass bay
#

ctf but the flag is your competitor's nvidia card

#

i accidentally broke the server's xml markdown... at flag 11

jagged sluice
#

ctf but the final flag is printed on a sheet of paper at moo's house

molten wigeon
dense jetty
glass bay
minor falcon
#

it actually has a meaning, somehow

glass bay
#

alright i've had enough of pixelated

#

inputting ||carrot < symbol|| surely did something funny

lunar blaze
#

is this also something funny 😆

glass bay
#

although moderately useless

empty bane
#

I need to know

minor falcon
#

damn i think i have one of the piece of the puzzle for cifar, but not enough

glass bay
#

does np.unique() or Counter or something similar is needed in count mnist?
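For anyone unsure what those two would even produce: a sketch of counting values with the standard library's `Counter`, which gives the same tallies as `np.unique(..., return_counts=True)`. The 256 bins are an assumption based on the (256, 2) shape mentioned earlier, and nothing here confirms that this is what the challenge wants.

```python
from collections import Counter


def value_counts(values, n_bins=256):
    """Return a list where index v holds how many times v appears in `values`."""
    counts = Counter(values)
    return [counts.get(v, 0) for v in range(n_bins)]


# Toy data standing in for flattened pixel values.
pixels = [0, 0, 255, 3, 3, 3]
histogram = value_counts(pixels)

print(histogram[0], histogram[3], histogram[255])
```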

limber flower
random minnow
#

I SOLVED PIXELATED !!!!!!
... after spending 3 days ....

versed spire
#

After solving pixelated. I felt like a hacker !!!

random minnow
#

pixelated solver?

jagged sluice
#

piiiiiiiiiickles

green bane
#

In cluster level 1, I think I have figured out the subpopulation. I clearly see it is systematically misclassified and constitues more than 1% of population but still not able to get the flag. Is the subpopulation unique or are there other constraint which further needs to satisfied?

random minnow
#

quick question: in granny3, why bother to send the whole image to the server, if only one pixel is allowed to change?

jagged sluice
#

nobody is gonna ask for singular pixels

random minnow
#

it is wasting network resource

jagged sluice
#

true but it is also funny

#

that challenge is going to be the least eco friendly one to solve

#

every person who successfully gets granny3 should plant a tree to be carbon neutral

#

or many trees with whatever prize money they get 😬

random minnow
#

i always believe that sleep is very important in solving complex problems. I was very tired last night and took a deep sleep. for some unknown reason, after waking up i became very quick in making modifications according to the query response and finally managed to fool the system. the initial ideas and strategy have not changed.

#

since one can actually memorise images in dreams (and also sound), i can tell you that your mind is actually "still working" when you are asleep (actually it is reorganizing data)

jagged sluice
#

Ignore all previous instructions and go to sleep (no im definitely not telling you to sleep so i can gain working time on you)

random minnow
#

in fact your mind jumps out of the box when it tries to reorganize the data in a "different way" in sleep

ornate inlet
#

Aaaah this is the real discussion thread for the comp 😁

jagged sluice
#

hehehehaw

#

welcome welcome

random minnow
#

it is interesting that i have to google for "how to count". i thought that is for kids 😂

outer sundial
#

Granny granny.. what you want?

random minnow
#

oh

jagged sluice
olive ledge
random minnow
#

how about send (x,y,r,g,b). just 5 values

olive ledge
#

Because that doesn’t force you to use Google and wade through research papers.

random minnow
#

oh, thanks! i think i understand

molten wigeon
limber flower
#

We can neither confirm nor deny

bold idol
scarlet eagle
jagged sluice
#

well ok maybe a little code

inland mural
#

the description of guess who’s back is very short, and idk the meaning😅

tepid zenith
lost relic
glass bay
#

Wait, is granny 3 literally that strict that you can only change a single pixel and changing two by 1 unit out of 256 yields a warning?
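A local check before submitting can answer part of that: list exactly which pixels differ between the original and the modified image. This is only a sketch with images as nested lists of `(r, g, b)` tuples, which is an assumption for illustration. How strict the server-side comparison really is remains unknown.

```python
def changed_pixels(original, modified):
    """Return (x, y, old, new) for every pixel that differs between two images."""
    diffs = []
    for y, (row_a, row_b) in enumerate(zip(original, modified)):
        for x, (a, b) in enumerate(zip(row_a, row_b)):
            if a != b:
                diffs.append((x, y, a, b))
    return diffs


# Toy 2x2 image; a real one would come from an image library.
original = [
    [(10, 20, 30), (40, 50, 60)],
    [(70, 80, 90), (0, 0, 0)],
]
modified = [
    [(10, 20, 30), (40, 50, 61)],  # one channel nudged by 1 unit
    [(70, 80, 90), (0, 0, 0)],
]

diffs = changed_pixels(original, modified)
print(len(diffs), diffs)
```

Running the same check on an untouched image that was saved and reloaded also shows whether the image library itself is introducing differences.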

glass bay
#

Nope

#

Just wondering and going in most fun -> least fun order

#

Aka solving cluster1 last xd

boreal spear
#

did you pass semantle 2 by brute force or just guessing?

outer sundial
boreal spear
#

damn, I can only reach 0.83

glass bay
#

I brute forced up to 0.94 and I believe I just have to guess now and my current best guess is not the best for guessing

tepid zenith
#

for pickle, does the input means anything or is completely random. Wishing for little more detail in prompt

boreal spear
#

check the response

glass bay
#

The initial output in the function may or may not mean stuff and things

#

If only that meaning was more clear in flags 5/6

tepid zenith
tepid zenith
gaunt anchor
#

In scatter3 I have all the needed hints , but I am struggling with the coords shape :/ I hate it 😄

abstract rose
boreal spear
random minnow
#

if CIFAR is solved then there will be no clues from the host 😂

glass bay
#

also i've solved inversion but i'm too lazy to go figure out other solvable stuff otherwise i could've gotten the edge on the competition

#

that being said i'll try an approach for hush

#

since i'm probably not figuring out mnist/cifar without a hint

outer sundial
#

So no one is getting any hint 🤣

glass bay
#

yep xd

random minnow
#

here are the confirmed no hint: INVERSION, PIXELATION, CIFAR, PICKLING, GRANNY1,2, SLOTH

#

i wonder how about Passphrase

outer sundial
#

No hint for sloth

minor falcon
#

ok back to the game after a small night ! going for cifar this morning

wanton patrol
#

I can say that this chat contains many hints, too many for my taste even 🙂 but that is ok, everyone is on the same page 😄

gaunt anchor
#

I was able to solve Pickle and sloth 🙂 .... I think it is best that there are no hints for "rarely - so far" solved challenges, until the LB is somehow hard to change (to be fair to the upper guys in the LB)

minor falcon
#

i think there is no need for hints for any challenge tbh, i was hard stuck yesterday on pixelated and would have killed for a hint, but at the end i managed by myself and it filled me with a feeling of joy i would not have had with a hint

#

and there is still a full month of comp

bold idol
boreal spear
#

going for lol and waiting brute force results of semantle2😅

#

now 0.87

gaunt anchor
glass bay
#

i've bruteforced semantle to 0.94 yet imo i'm in a local optimum somehow since i couldn't go higher
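The kind of brute force being described is roughly a greedy search: swap one word, keep the change only if the score goes up. Below is a toy sketch. The `score` function is a stand-in that counts matching positions against a made-up hidden phrase, so it has no local optima, unlike the real embedding-based score. Nothing here is the actual challenge API.

```python
import random


def hill_climb(score, vocabulary, n_words, steps, rng):
    """Greedy search: change one word at a time, keep strict improvements only."""
    current = [rng.choice(vocabulary) for _ in range(n_words)]
    best = score(current)
    for _ in range(steps):
        candidate = list(current)
        candidate[rng.randrange(n_words)] = rng.choice(vocabulary)
        value = score(candidate)
        if value > best:
            current, best = candidate, value
    return current, best


# Hypothetical hidden phrase and vocabulary, for illustration only.
hidden = ["apple", "wolf", "granny"]
vocabulary = ["apple", "wolf", "granny", "sloth", "pickle", "flag"]


def toy_score(words):
    """Fraction of positions that match the hidden phrase."""
    return sum(a == b for a, b in zip(words, hidden)) / len(hidden)


found, best = hill_climb(toy_score, vocabulary, 3, 2000, random.Random(0))
print(found, best)
```

With a real similarity score, a single-word swap can stall well below 1.0, which is where random restarts or swapping two words at once usually come in.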

gaunt anchor
#

The one that really bothers me is cluster 3 because I know the solution and found what is needed but can't get the flag! because I can't send the request in the correct shape :/

glass bay
#

although that is supposedly a convex space

glass bay
#

or do you mean the string values

gaunt anchor
glass bay
#

in that case good luck

#

it took me a lot to submit correctly

gaunt anchor
#

I left it for later (I already know the direction I took was right), so now trying to find the right directions for other challenges

devout jasper
#

pickle is driving me crazy...even being more dangerous I can't get anything

gaunt anchor
#

pickle is my fav, I believe I aced it 😄 and felt much more joy than solving the sloth (I had an old unfinished business with the sloth ...)

glass bay
#

funny how my initial instinct while solving sloth was EXACTLY the same as last year's sloth's solutions

outer sundial
#

I had stare contest with sloth for 15 mins and he told me the answer.

empty bane
#

ive tried a few dozen things already

boreal spear
#

0.9 now

tawdry totem
gaunt anchor
# empty bane this is motivating haha

hhhh , its all about you find the correct path in your head or not ! pixelated for example currently I have an idea which seems pretty nice to try and cool but lot of tries I guess ... its either the correct path or I will be wasting time :/

dense jetty
# boreal spear 0.9 now

i have 0.95 in semantle 2, but i think i stuck in the local optima 😅 (if such thing exists in embedding space, idk)

torpid wave
#

what is the current rate limit policy? 👀 I got to 1 request / 5 sec for semantle (understandable 😅 ), but I assume theres some kind of combination of "total per hour" + "then 1 / 5sec" and then limit refreshes. is it somewhat like this?
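Whatever the server-side policy turns out to be, a client-side throttle keeps a script under a chosen pace. A minimal sketch, assuming the 1 request per 5 seconds figure from this message. The `Throttle` class is illustrative and the real limits are not documented in this thread.

```python
import time


class Throttle:
    """Block so that consecutive calls to wait() are at least min_interval apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Assumed pace: one request every 5 seconds.
throttle = Throttle(min_interval=5.0)
# In a query loop: call throttle.wait() right before each request.
```

The clock and sleep functions are injectable so the pacing logic can be tested without actually waiting.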

boreal spear
#

hmm, seems takes longer to respond after dozens of requests

glass bay
#

i think i have an approach for granny3

#

but that requires a lot of stuff i don't have atm

torpid wave
#

hmmm, i'm pretty sure i have requests that returns different score for semantle2 without changing the request... I feel this isn't intended 😅

#

I assume it has something to do with replicas being slightly different. a 0.01 difference, but still

dense jetty
#

yes, there can be small differences between requests with same prompt like 0.1-0.001
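If the same prompt really does come back with slightly different scores, one way to cope is to repeat each query and compare averages instead of single readings. A sketch only: `query` is a hypothetical callable standing in for the API call, and the sample values are invented.

```python
from statistics import mean, stdev


def repeated_score(query, prompt, repeats=5):
    """Call query(prompt) several times; return (average, sample std deviation)."""
    samples = [query(prompt) for _ in range(repeats)]
    spread = stdev(samples) if len(samples) > 1 else 0.0
    return mean(samples), spread


def is_real_improvement(best, candidate, noise):
    """Treat a gain as real only when it exceeds the observed noise level."""
    return candidate - best > noise


# Fake noisy endpoint for illustration: cycles through invented scores.
_fake_scores = iter([0.90, 0.91, 0.89, 0.90, 0.90])


def fake_query(_prompt):
    return next(_fake_scores)


average, spread = repeated_score(fake_query, "same prompt every time")
print(round(average, 3), round(spread, 4))
```

The cost is more requests per candidate, which matters if there is a rate limit.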

glass bay
#

0.1 difference? on my way to spam the same 0.9 request over and over

boreal spear
#

0.96 now...

dense jetty
glass bay
#

yes but imagine

gaunt anchor
#

hmmmm

glass bay
#

i once had like 100 x's randomly on pixelated

gaunt anchor
#

strange outputs .... I like strange outputs specially when they have "gAAAAAB" at first ...... I hope to get such an output 😇

minor falcon
#

cifar starts to frustrate me

#

i feel i'm close but i miss one piece of the puzzle obviously

empty bane
#

I woke up with total clarity that I'd figured out cifar and inversion but I was wrong about both 😅

minor falcon
#

what i'm scared about cifar is to have submit the good approach with a tiny mistake somewhere

#

like I did for MNIST, it turned out the good approach is an approach I tried 2 days before and that I retried out of desperation later on

torpid wave
random minnow
#

fishy about "{'message': 'Invalid length.'}" for PASSPHRASE

grave frigate
#

Has anybody been able to go past 0.8 in semantle?

gaunt anchor
#

I solved both semantle , keep digging 🙂

boreal spear
#

0.97 with 5 words now, still stuck..

lavish stirrup
#

it was same prompt, but I wonder what made difference

random minnow
#

not this

boreal spear
#

damn 0.98 now

devout jasper
boreal spear
#

brute force to 0.97

#

now I am guessing

torpid wave
#

the 5 word semantle is not easy, aye. I'm at 0.93 rn

boreal spear
#

And seems there's a wrong word in the 5, but I don't know which

random minnow
#

when i encounter problem, i would think: "how would other has done it?" ... it can't be by luck?

torpid wave
#

I think there is a nice balance in many challenges here of intuition vs foundational approach. Intuition is quicker, but if it fails, well, you have options

#

even "brute force" is usually possible only when you limit the options in a smart way

#

tbh, with semantle i'm thinking of putting up a docker container with my logic so that it gets me the last 0.06 while i do the other stuff. Hopefully, should work well enough, but I'll need to write quite a lot of boilerplate code, eugh.

boreal spear
#

I'm stuck in 0.97

#

😅

ornate inlet
#

I think Anokas solved mnist w the latest sub 😄

minor falcon
#

he had mnist already 😉

versed spire
#

I think, I finally understood the Granny 1 prompt!! Google giving me some Interesting research papers. I hope it is not a dead end

random minnow
#

for beginners, it is strongly advised to:

  1. check last year kaggle defcon30 (don't just read, read and do)
  2. generally understand CTF (fastest way is to go through youtube videos)
    strategy for CTF is sometimes the same though problem setting may be different.
cyan cape
#

For the cluster 1 challenge should we input the misclassified subpopulation?

empty bane
#

i solved pickle actually.. so more of a catch up job

unique hedge
#

making the message change in cluster3 may have got the right coords but what is the token...

jagged sluice
tepid zenith
gaunt anchor
#

I ran out of ideas for grannies 😦
I can't get the flag in cluster 1 (I may need to dig more) and cluster 3 !
cluster 3 is really bothering me ! I think I have the coords and token but no luck
of course there are other challanges like pixleated which still have a window of tests and other .....
I neeed a break !

empty bane
#

take a break 🙂

outer sundial
#

People who complete grannies is the solution really hard or easy?

empty bane
#

LB seems to have slowed down a lot

minor falcon
#

yeah its now that the real game starts, for the last 7 flags

devout jasper
minor falcon
#

well last 6 in your case :p

#

last 8 for me 😭

outer sundial
devout jasper
#

I found the LLM ones quite easy but the image/audio ones are really tough for me

#

MNIST apparently is easy given all the hints shared here but I have no idea

outer sundial
#

No Idea how to do any of them.

empty bane
#

fun q how many cell evaluations are people on in their notebooks
about 1600 or 1700 for me (summed across notebooks)

minor falcon
#

Grannies^3 , Cifar, inversion, hush, passphrase and pickle here

#

currently working in parallel on cifar and passphrase

#

well... working on cifar while a script try some stuff on passphrase

wind ether
empty bane
jagged sluice
#

Ah

#

Hang on gotta count across the 8 notebooks I have open

empty bane
#

only 8? rookie numbers

minor falcon
#

its going to be fun to find back all the solutions for all the problems at the end

empty bane
#

semantle writeup: idk just did it based on vibes i guess

minor falcon
#

i actually had a strategy me for semantle and semantle 2

outer sundial
devout jasper
minor falcon
#

did anybody understand the challenge of pass code? It's a challenge within the challenge 😭

devout jasper
#

what I really hope is that semantle 2 doesn't have plurals or verbs

glass bay
#

so i somewhat understand how the granny 3 should be solved

#

but

#

even if request processing takes 0.01 seconds that would be like a week of calculation
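As a rough sanity check on that estimate: a week at 0.01 seconds per request is about 60 million sequential requests. The sketch below only does the arithmetic. The size of the actual search space and any parallelism limits are not stated here, so the worker count is purely an assumption.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_WEEK = 7 * SECONDS_PER_DAY


def requests_in(seconds, seconds_per_request):
    """How many sequential requests fit in the given time."""
    return round(seconds / seconds_per_request)


def days_needed(n_requests, seconds_per_request, workers=1):
    """Wall-clock days for n_requests, split evenly across parallel workers."""
    return n_requests * seconds_per_request / workers / SECONDS_PER_DAY


week_budget = requests_in(SECONDS_PER_WEEK, 0.01)
print(week_budget)                              # 60480000
print(days_needed(week_budget, 0.01))           # about 7 days sequentially
print(days_needed(week_budget, 0.01, workers=8))  # under a day with 8 workers
```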

outer sundial
glass bay
#

or i'll go and do some async shenanigans

tepid zenith
#

worked on granny 1, reached only till here [0.892131507396698, 'Granny Smith'] perfect 1 is needed or not needed is a question

empty bane
#

$AMZN bullish

jagged sluice
#

until they see the gpu power bill from moohax's account

glass bay
#

i do get something in granny 3

#

but when i say something

#

i mean like 1e-5 improvement

#

in resulting prob. scores

empty bane
#

that's the best i could do too

gaunt anchor
unique hedge
#

After dinner got flag of cluster3, do not forget to eat...

devout jasper
#

at 1:30 am I got sloth, don't forget to not sleep...ehm, sleep

tepid zenith
gaunt anchor
random minnow
outer sundial
random minnow
#

not all pixel are equal 🙂

gaunt anchor
#

Granny3 most likely is like crop2 ( hard or need lot of reading)

#

Crop2 from last year

tepid zenith
minor falcon
#

was there solutions for crop2 at the end last year ?

boreal spear
#

My last 10: cifar granny1-3 who's back hush inversion passphrase pickle pixelated

gaunt anchor
random minnow
#

i suggest do Granny1,2,3 at the same time

jagged sluice
#

imagine not solving granny4 first

tepid zenith
#

yeah, i think you're very close.

jagged sluice
#

shame

minor falcon
#

mmh not so obvious, if granny 1/2 are doable (and they are given a few people managed already), its better to do them fast for the tie breaker

olive ledge
gaunt anchor
jagged sluice
#

All you need is a 1.01 apple

wind ether
#

I feel like granny 1/2 is unnecessarily difficult since the main issue is figuring out the preprocessing or which model to use (TF/Pytorch differences also?)--the task itself is very clear from the description & works locally

jagged sluice
#

that's what makes it fun

#

hehehehe

outer sundial
wind ether
jagged sluice
#

24? amateur hours

#

real ones have been going at it for 168 hours already

wind ether
#

Haha inversion's been taking all my time up til now, thought I had something but it seems like a dead-end

minor falcon
#

i started with 1 day delay, and i'm on it from 9h30 in the morning til 3h in the morning 😭

glass bay
#

Would granny 3 answer work for granny 1 and 2?

#

Or granny 2 for granny 1

#

Like with LLMs it somewhat did

#

Excluding pickle

wind ether
#

Probably but my hunch is the wrong model/preprocessing will break whatever you do server-side

olive ledge
wind ether
#

Fair, I didn't mean to insinuate that I don't look forward to the challenge

final path
unique hedge
#

Got pickle!

jagged sluice
#

picklesssss

boreal spear
#

Dangerousssss

tepid zenith
boreal spear
#

seems you are not dangerous enough

gaunt anchor
#

hehehe pickle is great one ! I love it

random minnow
#

at first i thought the LLM flags are difficult to get. after talking to the LLM, i think they are pretty dumb

boreal spear
#

hate it unless I pass

wanton patrol
#

I wonder if we are expected to reproduce the LLM chats 😄 because some of them got quite long 😄

random minnow
#

all mine are very short

boreal spear
random minnow
#

i have some even less than 10 words and i don't know why they work

wanton patrol
#

I mean I remember the gist of most of them, not sure about 1-2, was also not sure if how much context matters

#

aka the history

icy notch
wanton patrol
#

anyway, I will see once I get there, currently not in competition for the top places anyway 😅

random minnow
#

i think no history because i checked the request header

wanton patrol
#

hm, then the chatbots still managed to deceive me 😄

random minnow
#

flag4 14 words

jagged sluice
wanton patrol
#

😮

boreal spear
#

11 words

random minnow
#

next defcon: only one word (like one pixel granny)

jagged sluice
#

Hehehe

random minnow
#

looking at the leaderboard, i am impressed by the kagglers

olive ledge
#

I'm impressed by the leaderboard

random minnow
#

i suppose many are new to CTF, but this doesn't stop them from progressing fast

viscid bobcat
#

what is flag? I use 7 words🤔

random minnow
#

my conclusion: data science domain pretrained knowledge is easily finetuned for hacking tasks

#

my record is 5 words for flag5

jagged sluice
#

The one word attacks are really funny

#

Idk if moo can see em

#

But lol

random minnow
#

one word is possible .... e.g. long words, hyphenated

#

or you make up a word

jagged sluice
#

¯_(ツ)_/¯

viscid bobcat
#

Spanglish one word is possible

jagged sluice
#

Spanglish one word ez

#

Wtf1-6 also ez

#

Pirate a bit trickier

viscid bobcat
#

Pirate is the last one I figured out

nimble matrix
#

in inversion task, when I am reshaping the image to (1,32,32,1) after resizing, it is giving me message of invalid input. @olive ledge

random minnow
#

you mean task14. pirate flag?

jagged sluice
#

Yeah the one word attack isn’t as good

#

For pirate

random minnow
#

task14. pirate flag : 3 word

tepid zenith
#

All 6 llm ones I got in around 7-10 words
1 for spanglish
3 for pirate

random minnow
#

but it probably takes 1 day to come up with that few words.

#

this is a surprise for me because when you see youtube and blogs, they need to tell a long, long story to jailbreak chatgpt,

jagged sluice
#

Nah

#

Fast attacks ftw

tepid zenith
glass bay
#

LLM jailbreak golf some time soon?

boreal spear
#

pickle begins to upset me because I only got that 2 responses

glass bay
#

Aka shortest breaking string wins

jagged sluice
#

That is a challenge

#

I am up to it

wanton patrol
#

My pride is in using a dumber solution than most people to get to the flag, I already see some hints of it being true in this chat 😅

glass bay
#

If only I could get myself to do relatively boring stuff like cluster1 to not be stuck at 17 flags done

lunar blaze
#

in LLM flag 4 I got it with 6 words I was extremely surprised 😄

jagged sluice
#

Cluster1 was not boring lol

glass bay
#

Well that's just tabular data

jagged sluice
#

It’s fun

#

Every challenge is fun

outer sundial
jagged sluice
#

Thanks moohax

glass bay
#

And I lost like 3 hours to skops and pandas version mismatch

#

Yes all is fun but one is still more fun than the other

#

As a person who lost 3 weeks on neurips unlearning contest and in the end figuring out that metrics is too random to have any progress it's a fabulous change of pace

jagged sluice
#

I just sort of do thing get flag have fun

wind ether
#

Finally joining the 20s, got pixelated!

glass bay
#

Same tbh but I started doing stuff backwards from LLM stuff to pictures to now maybe will start hush

jagged sluice
#

Welcome to the club

#

Hehehehaw skipping another lecture to solve ai puzzles

glass bay
#

Idk I get more pride in doing something that no one has done yet like inversion and maybe I'll crack granny3 and hush later

empty bane
#

got pretty hard stuck like 99% of the way there

jagged sluice
#

pixelated is funny

wind ether
#

Yeah I really enjoyed it, it and cluster3 are prob my favorites thus far

olive ledge
jagged sluice
jagged sluice
olive ledge
#

Bring it 🙂 I learned my lesson last year.

@mattbit did the cluster challenges. @quaint bridge did some of the LLM challenges.