#mabe-mouse-behavior-detection | Kaggle | Page 1

rough grail Sep 21, 2025, 12:44 PM

#

Hi everyone, I have a few questions regarding the data for this competition.

1. In the data section, we're told that a folder like train_tracking contains things like video_frame, mouse_id, etc. But the actual folder (train_tracking) just holds more folders with parquets in them. This has thrown me off a bit because I don't see any video_frame, mouse_id, etc. in those folders.
2. I've taken a look at other people's code, and from train.csv they're somehow able to get a parquet's path by "/input/kaggle...train_csv/{lab_id}/{video_id}.parquet" which then shows another dataframe-esque thing. How does that work?

orchid reef Sep 22, 2025, 12:50 AM

#

rough grail Hi everyone, I have a few questions regarding the data for this competition. 1....

Each parquet file can be loaded into a dataframe just like train.csv can. The parquet files all have the columns like video_frame, mouse_id, etc that you're looking for. Most code you see does indeed start with reading train.csv since this functions as an index containing metadata about all the videos represented in the parquet files. For some tasks you might just as well simply get a recursive folder listing to find the individual files, but sooner or later you're going to want to take advantage of other things that train.csv tells you about the file so it makes sense to start there.
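
As a rough sketch of the recursive-listing route mentioned above, assuming the layout <track_dir>/<lab_id>/<video_id>.parquet discussed in this thread. The function name list_tracking_files is hypothetical and only the standard library is used; each returned path could then be loaded into a dataframe with a parquet reader.

```python
from pathlib import Path


def list_tracking_files(track_dir):
    """Return (lab_id, video_id, path) for every parquet file under track_dir.

    Assumption: files are laid out as <track_dir>/<lab_id>/<video_id>.parquet.
    """
    found = []
    for path in sorted(Path(track_dir).rglob("*.parquet")):
        lab_id = path.parent.name  # the folder the file sits in
        video_id = path.stem       # the file name without ".parquet"
        found.append((lab_id, video_id, path))
    return found
```

This finds the files without train.csv, but it only recovers the two ids encoded in the path; any other per-video metadata still has to come from train.csv.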

rough grail Sep 22, 2025, 1:04 AM

#

orchid reef Each parquet file can be loaded into a dataframe just like `train.csv` can. The ...

thanks for the reply will, but i still dont quite understand your explanation. with the way everyone is making their code, it seems like they can access a parquet and its data from train.csv even though train.csv and train_tracking/train_annotation are completely separate folders (and filepaths) in the input data for the competition.

i do wish, though, to reiterate my thanks for your reply, for some reason this subchannel of the discord is completely inactive, and i thought no one would respond 😅

orchid reef Sep 22, 2025, 1:46 AM

#

rough grail thanks for the reply will, but i still dont quite understand your explanation. w...

I guess it would help if you had a specific example you're looking at. The top voted notebook at present, for example, composes the path to each parquet file with the line `path = track_dir / lab_id / f"{video_id}.parquet"` and then reads it into a dataframe with `trk = pl.read_parquet(path)`. `lab_id` and `video_id` in that context come from looping through the train.csv dataframe.
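
A minimal sketch of that idea, using only the standard library. The function name tracking_paths is hypothetical; the lab_id and video_id column names are the ones mentioned in this thread. train.csv does not contain the parquet files. It contains the names needed to build each file's path as a string.

```python
import csv
from pathlib import Path


def tracking_paths(train_csv, track_dir):
    """Yield (lab_id, video_id, path) for each row of train.csv.

    Assumption: train.csv has lab_id and video_id columns, and tracking
    files live at <track_dir>/<lab_id>/<video_id>.parquet.
    """
    track_dir = Path(track_dir)
    with open(train_csv, newline="") as f:
        for row in csv.DictReader(f):
            lab_id, video_id = row["lab_id"], row["video_id"]
            # Same composition as the notebook line quoted above.
            yield lab_id, video_id, track_dir / lab_id / f"{video_id}.parquet"
```

Each yielded path can then be handed to a parquet reader, for example `pl.read_parquet(path)` as in the notebook. The two folders are linked only by this naming convention.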

rough grail Sep 22, 2025, 11:45 AM

#

orchid reef I guess it would help if you had a specific example you're looking at. The top v...

Yeah, that line is perfect for my question. I'm confused by how lab_id and video_id come from looping through train.csv when train.csv doesn't seem to have lab_id or video_id columns in it. Clearly it does, because they can access them, but the fact that they're able to do that is what confuses me.

orchid reef Sep 22, 2025, 1:31 PM

#

rough grail Yeah, that line is perfect for my question. Im confused by how lab_id and video_...

I'm confused why you're not seeing those columns! They're the first columns listed in the dataset description for train/test.csv and the first columns present in the actual file too...

rough grail Sep 22, 2025, 5:25 PM

#

orchid reef I'm confused why you're not seeing those columns! They're the first columns list...

I can see them, it’s just the fact that they can get parquet files in train_tracking FROM train.csv which confuses me. Train.csv and train_tracking are completely separate filepaths but somehow they can search for things in train_tracking by train.csv

candid crypt Sep 22, 2025, 7:37 PM

#

Hi, kinda rookie question but just wanted to confirm: can I use a pre-trained model, or do we have to train one from scratch for this challenge?

orchid reef Sep 22, 2025, 11:19 PM

#

you can use a pretrained model, so long as your submission notebook accesses it without internet access (i.e. it's stored in a kaggle dataset or model) and the model's license permits use in the competition context

mellow tree Sep 25, 2025, 4:54 PM

#

Since the dataset is in video format, what basics should I know to preprocess it properly, make labeling correct, and improve model accuracy?

orchid reef Sep 26, 2025, 2:30 AM

#

mellow tree Since the dataset is in video format, what basics should I know to preprocess it...

the dataset is not in video format.

glacial storm Sep 26, 2025, 12:30 PM

#

anyone looking for a team as well who would like to team up with me? i'm a 19 year old beginner in kaggle competitions willing to learn

mellow tree Sep 26, 2025, 11:34 PM

#

orchid reef the dataset is not in video format.

ss its in video frame

swift rapids Sep 27, 2025, 12:23 PM

#

Not sure if this is the place to ask, but here goes

1. Can there be new labs present in hidden test data?
2. MABe22_keypoints and MABe22_movies are a large part of the given tracking files but none of their videos are annotated. Is this by design or am I missing something?

old flower Sep 30, 2025, 8:13 PM

#

swift rapids Not sure if this is the place to ask, but here goes 1. Can there be new labs pre...

According to the host, there are no new labs in the test set: https://www.kaggle.com/competitions/MABe-mouse-behavior-detection/discussion/608621#3292419

narrow moon Nov 12, 2025, 3:26 AM

#

Hi

azure spade Nov 12, 2025, 7:10 PM

#

!rank

torn finch Dec 1, 2025, 5:31 PM

#

https://media.discordapp.net/attachments/1444971360047726605/1445085758598938824/image1.gif?ex=692f107d&is=692dbefd&hm=94f18cd6e7350e7cc612826beb5d11a9fd125485a58ee1e39a16a03b6f9e2426&=&width=237&height=315
https://media.discordapp.net/attachments/1444971360047726605/1445085766937088000/image2.gif?ex=692f107f&is=692dbeff&hm=51e8429e6818b166e21485a613e8f0c706d64c765aefc93f65a7bcefa10907c2&=&width=864&height=1152
https://media.discordapp.net/attachments/1444971360047726605/1445085774562197535/image3.gif?ex=692f1081&is=692dbf01&hm=e520e8e4edd4eea02e82168a7059a868ea59c19d9b90c7c34402f7bb3616c76f&=&width=864&height=1152
https://media.discordapp.net/attachments/1444971360047726605/1445085781801566319/image4.gif?ex=692f1082&is=692dbf02&hm=bdc0715977fdcda4b7804916e5bfb36af1d3132f535d1b4327894a067fbfc769&=&width=725&height=907

heady plover Dec 4, 2025, 6:18 AM

#

I have now submitted 3 times, but it always fails saying wrong csv file format. I think the file is good and I am using their sample submission. E.g. my csv file rows:
row_id,video_id,agent_id,target_id,action,start_frame,stop_frame
0,438887472,mouse1,mouse2,approach,846,848
1,438887472,mouse1,mouse2,approach,880,883

can someone guide me, what's wrong here
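
One way to narrow this down is a local sanity check before submitting. The sketch below is not the competition's validator. The function name check_submission is hypothetical, the expected header is taken from the rows quoted above, and the rule that start_frame must be smaller than stop_frame is an assumption.

```python
import csv

# Column order taken from the rows quoted in this message.
EXPECTED = ["row_id", "video_id", "agent_id", "target_id",
            "action", "start_frame", "stop_frame"]


def check_submission(path):
    """Return a list of human-readable problems found in a submission CSV."""
    problems = []
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header != EXPECTED:
            problems.append(f"header is {header}, expected {EXPECTED}")
            return problems
        for line_no, row in enumerate(reader, start=2):
            if len(row) != len(EXPECTED):
                problems.append(f"line {line_no}: {len(row)} fields, expected {len(EXPECTED)}")
                continue
            rec = dict(zip(EXPECTED, row))
            try:
                start, stop = int(rec["start_frame"]), int(rec["stop_frame"])
            except ValueError:
                problems.append(f"line {line_no}: non-integer frame value")
                continue
            # Assumption: an interval must have start_frame < stop_frame.
            if start >= stop:
                problems.append(f"line {line_no}: start_frame {start} >= stop_frame {stop}")
    return problems
```

An empty list means the file passed these particular checks only. Things like a stray index column, a byte-order mark, or one value per line show up as a header mismatch.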

narrow fractal Dec 4, 2025, 7:59 AM

#

Connect to 846

heady plover Dec 4, 2025, 12:02 PM

#

narrow fractal Connect to 846

Didn't understand your advice, please elaborate

heady plover Dec 5, 2025, 5:07 AM

#

Still not working, can anyone advise me a little?

raven hawk Dec 6, 2025, 1:05 PM

#

hello everyone, i have some questions about competitions in kaggle,
Submissions to this competition must be made through Notebooks. In order for the "Submit" button to be active after a commit, the following conditions must be met:

CPU Notebook <= 9 hours run-time
GPU Notebook <= 9 hours run-time
Internet access disabled
Freely & publicly available external data is allowed, including pre-trained models
Submission file must be named submission.csv

will these be assessed based on the notebook you use to train your (pre-trained) models? what if i use another source like colab pro and just save the best models, then upload them to kaggle for inference, does that count as cheating or anything?