solid tree Aug 11, 2023, 7:38 PM

#

Tweet data is always fun (or is it "X" data now?). Disaster vs non-disaster... so much data yet so much noise to filter out. How did this comp go for you?

blissful citrus Aug 17, 2023, 6:28 AM

#

I am currently working on the Kaggle NLP Twitter disaster project. I have hit rock bottom with my code and I am not sure what I am doing wrong. Please, I need help.

#

https://colab.research.google.com/drive/13onC89vY3XeIHLv4uBes5dA79BHlULL4#scrollTo=M_u2-cHwrmrw

Google Colaboratory

odd acorn Aug 17, 2023, 8:14 AM

#

blissful citrus I am Currently working on the kaggle nlp twitter disaster project and I have rea...

The error seems to be clear: "NameError: name 'preprocess' is not defined"

blissful citrus Aug 17, 2023, 1:25 PM

#

odd acorn The error seems to be clear: "NameError: name 'preprocess' is not defined"

Thank you for this reply. I had to use finalprocessing(string), but it threw a 'RecursionError: maximum recursion depth exceeded'.

#

Sorry, I meant I had used 'return lemmatizer(stopword(finalpreprocess(string)))' and it threw the RecursionError.
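
A RecursionError like this usually means a function ends up calling itself with no stopping condition, which is what happens if the return line above sits inside finalpreprocess itself. A minimal sketch, assuming the notebook has helpers named preprocess, stopword and lemmatizer; the bodies below are hypothetical stand-ins, not the notebook's code:

```python
import re

STOPWORDS = {"the", "a", "an", "is", "in", "of", "and", "to"}  # toy list (assumption)


def preprocess(text):
    """Lowercase, drop URLs and non-letters, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def stopword(text):
    """Remove words that appear in the toy stopword list."""
    return " ".join(w for w in text.split() if w not in STOPWORDS)


def lemmatizer(text):
    """Stand-in for a real lemmatizer: only trims a trailing 's'."""
    return " ".join(
        w[:-1] if len(w) > 3 and w.endswith("s") else w for w in text.split()
    )


def broken_finalpreprocess(text):
    # Calls itself before doing any work, so it never returns.
    return lemmatizer(stopword(broken_finalpreprocess(text)))


def finalpreprocess(text):
    # Calls preprocess(), not finalpreprocess(), so the chain terminates.
    return lemmatizer(stopword(preprocess(text)))


result = finalpreprocess("The fires in California are spreading http://t.co/x")
print(result)
```

The only difference between the two versions is which function the innermost call refers to.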

elder dew Aug 28, 2023, 8:49 PM

#

I have an error message that goes "keras_nlp does not have attribute version"

#

odd acorn Aug 28, 2023, 10:15 PM

#

elder dew

"keras_nlp does not have attribute version" - what is unclear in that message? Unlike most Python packages, you can't find out the version of this one by using that line. You can either dig deeper to find out where the package stores its version, or simply delete that line, as it doesn't do anything the script truly needs in order to run.

elder dew Aug 28, 2023, 10:20 PM

#

okay thanks

#

just commented it out
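
If the version is still wanted, one option that does not depend on the package exposing a version attribute is to ask the installed distribution's metadata through the standard library. A small sketch; the helper name installed_version is made up for illustration, and "keras-nlp" is the PyPI name behind the keras_nlp import:

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the version if keras-nlp is installed, otherwise None.
print("keras-nlp:", installed_version("keras-nlp"))
```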

ember dew Aug 30, 2023, 5:52 AM

#

Since there is no longer free/cheap access to tweets, what do people usually use instead as a source of real-time text streams?
https://twitter.com/XDevelopers/status/1641222782594990080

Developers (@XDevelopers)

Today we are launching our new Twitter API access tiers! We’re excited to share more details about our self-serve access. 🧵


silver hedge Jan 14, 2024, 6:43 PM

#

import keras_nlp

# Load a DistilBERT model.
preset = "distil_bert_base_en_uncased"

# Use a shorter sequence length.
preprocessor = keras_nlp.models.DistilBertPreprocessor.from_preset(
    preset,
    sequence_length=160,
    name="preprocessor_4_tweets",
)

# Pretrained classifier.
classifier = keras_nlp.models.DistilBertClassifier.from_preset(
    preset,
    preprocessor=preprocessor,
    num_classes=2,
)

classifier.summary()

#

Can anyone help me with this error msg?

bold hollow Jan 28, 2024, 11:36 PM

#

Hello, I am training a model with X_train.shape = (60, 40) and y_train.shape = (60,). My model code is as follows:

model = Sequential()

### first layer
model.add(Dense(32, input_shape=(60,)))
model.add(Activation('relu'))

### second layer
model.add(Dense(10))
model.add(Activation('relu'))

### third layer
model.add(Dense(10))
model.add(Activation('relu'))

### final layer
model.add(Dense(num_labels))
model.add(Activation('softmax'))

but it is giving this error

#

what should I do. Pls help

glass valve Jan 31, 2024, 2:48 AM

#

bold hollow what should I do. Pls help

read the error. it's expecting a different number of inputs than you're giving it

#

you need to update the layer architecture to fix the bug
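
In Keras, input_shape describes a single sample, not the whole training array. With X_train.shape = (60, 40) there are 60 samples of 40 features each, so the likely fix is Dense(32, input_shape=(40,)) in the first layer. A NumPy sketch of the same arithmetic, with plain matrix multiplication standing in for the Dense layer and random placeholder weights:

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(size=(60, 40))  # 60 samples, 40 features, as in the question

n_features = X_train.shape[1]          # 40: what input_shape should describe
W = rng.normal(size=(n_features, 32))  # Dense(32) weights have shape (inputs, units)
b = np.zeros(32)

hidden = np.maximum(X_train @ W + b, 0.0)  # Dense(32) followed by ReLU
print(hidden.shape)  # (60, 32)

# Sizing the weights from the number of samples instead reproduces a mismatch.
W_wrong = rng.normal(size=(X_train.shape[0], 32))  # shape (60, 32)
try:
    X_train @ W_wrong
except ValueError as err:
    print("shape mismatch:", err)
```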

bold hollow Feb 22, 2024, 7:19 PM

#

I am extracting features from audio signals and now want to compare them with the real voice input. But I am unable to figure out how to do it. If anyone can guide me for this I will be really grateful.

bold hollow Feb 22, 2024, 8:26 PM

#

glass valve you need to update the layer architecture to fix the bug

issue resolved

bold hollow Feb 22, 2024, 8:26 PM

#

bold hollow I am extracting features from audio signals and now want to compare them with th...

somebody guide with this

glass valve Feb 22, 2024, 8:28 PM

#

Voice sounds unrelated to disaster tweets, which is what this channel is for

daring zinc Mar 1, 2024, 4:13 AM

#

Hi, I want to join this competition

livid mesa Apr 30, 2024, 9:24 PM

#

Hello! I was wondering, in the getting started notebook for this competition, what are the three numbers within the array when finding your score?

#

neon flint May 1, 2024, 8:01 AM

#

livid mesa

It's the individual score for each cross-validation run. Since you chose cv = 3, it returns 3 outputs.
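
A small scikit-learn sketch of where the three numbers come from. The twelve tweets and their labels are made up, and the vectorizer, the RidgeClassifier and the F1 scoring are assumptions about a typical starter setup, not a copy of the notebook:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import cross_val_score

# Toy data: 1 = disaster, 0 = not a disaster.
texts = [
    "Forest fire spreading near the town",
    "Earthquake shakes the city centre",
    "Flood waters rising after the storm",
    "Wildfire evacuation ordered for residents",
    "Tornado destroys homes in the county",
    "Explosion reported at the chemical plant",
    "I love this sunny weather",
    "This new song is fire",
    "My exam was a total disaster lol",
    "Watching a movie with friends tonight",
    "The concert last night was a blast",
    "Coffee and a good book this morning",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Turn each tweet into a vector of word counts.
train_vectors = CountVectorizer().fit_transform(texts)

# clf is the estimator that gets trained and scored once per fold.
clf = RidgeClassifier()
scores = cross_val_score(clf, train_vectors, labels, cv=3, scoring="f1")

print(scores)         # one F1 score per fold, so three numbers for cv=3
print(scores.mean())  # a single summary figure
```

Changing cv to 5 would give five numbers instead of three.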

livid mesa May 1, 2024, 7:14 PM

#

Why does this only show 5 of the predicted data points? How do we see the submission file? I can’t find the viewer it mentions.

drifting dew May 1, 2024, 10:22 PM

#

livid mesa Why does this only show 5 of the predicted data points? How do we see the submis...

The function "head" returns the first 5 rows of a DataFrame by default, which is why it only shows the first 5.

#

You are currently looking at the code editor. Once you click "save version" in the top right you can look at the notebook in the viewer and then under the files tab can submit the output to the competition.

Alternatively, you should see options on the right hand side of the screen in the editor to submit to the competition.
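
On the "head" point, a tiny pandas sketch with a made-up ten-row frame standing in for the real submission file:

```python
import pandas as pd

# Hypothetical stand-in for the submission DataFrame.
submission = pd.DataFrame({"id": range(10), "target": [0, 1] * 5})

first_five = submission.head()    # default: first 5 rows
first_eight = submission.head(8)  # pass n to see more rows

print(first_five)
print(len(submission))  # the full frame still has every row
```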

livid mesa May 2, 2024, 4:10 AM

#

Thank you! What is clf? For example: scores = model_selection.cross_val_score(clf, train_vectors, train_ …)
Is it the model being used for the estimator? What does it stand for?

tender sentinel May 3, 2024, 5:04 PM

#

anyone interested in doing this project together

whole wolf May 17, 2024, 12:32 PM

#

livid mesa Thank you! What is clf? For example: scores = model_selection.cross_val_score(cl...

Yes, it's usually the instance of the model. It stands for something like 'CLassiFier' I guess

wooden plover May 21, 2024, 9:01 AM

#

Anyone interested to do this project together?

sour furnace May 28, 2024, 10:37 AM

#

wooden plover Anyone interested to do this project together?

I am looking for a partner!

sour furnace May 28, 2024, 10:37 AM

#

tender sentinel anyone interested in doing this project together

Me

sour furnace May 28, 2024, 5:31 PM

#

anyone up for discussion? I am at EDA

odd gazelle May 30, 2024, 10:50 PM

#

Hey all, this is my first NLP project and I am very excited to join you! I have already spent a week on this competition and I'm trying to improve my score, so I'll be checking this constantly ☺️

knotty pivot Jun 2, 2024, 7:02 AM

#

Hi, I wanted some help with the DistilBERT model. I was referring to a notebook https://www.kaggle.com/code/alexia/kerasnlp-starter-notebook-disaster-tweets/notebook#Load-a-DistilBERT-model-from-Keras-NLP. But I don't understand the reason for the parameters used in:

preprocessor = keras_nlp.models.DistilBertPreprocessor.from_preset(
    preset,
    sequence_length=160,
    name="preprocessor_4_tweets",
)

Why did they take the sequence length as 160?
Do we have to figure that out ourselves, or is any reference given?

KerasNLP starter notebook Disaster Tweets

Explore and run machine learning code with Kaggle Notebooks | Using data from Natural Language Processing with Disaster Tweets

#

It is in the Load a DistilBERT model from Keras NLP section of the notebook
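
The thread does not say how 160 was chosen. One way to sanity-check such a value is to look at how long the tweets are in tokens and pick a length that covers nearly all of them. A rough sketch, assuming whitespace splitting as a stand-in for DistilBERT's WordPiece tokenizer (which typically yields more tokens per tweet) and a few made-up tweets instead of train.csv:

```python
import math

# Illustrative tweets only; in practice this would be the "text" column.
tweets = [
    "Forest fire near La Ronge Sask. Canada",
    "I love fruits",
    "Summer is lovely",
    "13,000 people receive wildfires evacuation orders in California",
    "What a goooooal!",
]


def token_lengths(texts):
    """Number of whitespace-separated tokens per text."""
    return [len(t.split()) for t in texts]


def percentile(values, q):
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


lengths = token_lengths(tweets)
longest = max(lengths)
p95 = percentile(lengths, 95)

print("longest:", longest, "95th percentile:", p95)
```

A sequence_length comfortably above the high percentile avoids truncating most tweets, while a smaller value means less padding and faster training.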

sterile plank Jun 11, 2024, 10:21 AM

#

anyone interested in learning nlp through competition? PyTorch person please

azure ingot Jun 30, 2024, 6:29 AM

#

hey, I'm a beginner in NLP. Should I go with an LSTM model, or are there any suggestions for alternative models?

azure ingot Jun 30, 2024, 6:45 AM

#

also, while im using pytorch, i wanna know some tokenization techniques because torchtext doesnt seem stable, any suggestions ?

vague oyster Jul 22, 2024, 10:29 AM

#

https://www.kaggle.com/code/lordtenson/disaster-tweets-classification

Disaster Tweets Classification

Explore and run machine learning code with Kaggle Notebooks | Using data from multiple data sources

#

If anyone is interested in improving the accuracy, I got 0.82

rough breach Aug 16, 2024, 6:42 AM

#

Hi guys, When does the competition start and end?

vapid charm Aug 16, 2024, 2:27 PM

#

rough breach Hi guys, When does the competition start and end?

It's always active

#

The Getting Started competitions don't have a start or end time; they are always open

rough breach Aug 16, 2024, 2:34 PM

#

And is there an evaluation?

empty crypt Aug 18, 2024, 2:03 AM

#

FOR BEGINNERS:
https://www.kaggle.com/code/vishalyginny/natural-language-processing
Here is my code for anyone interested in getting started with this competition. It is quite simple and easy to understand.

Natural Language Processing

Explore and run machine learning code with Kaggle Notebooks | Using data from Natural Language Processing with Disaster Tweets

rancid wolf Aug 25, 2024, 6:41 PM

#

empty crypt FOR BEGINNERS: https://www.kaggle.com/code/vishalyginny/natural-language-process...

hey vishaly, thanx for the code!! I went through it but why are we imputing keyword with fatality? it has only 61 missing values so isnt it better to drop them? coz considering every missing keyword as fatality might cause a bias? Im new to this so im not sure but i would love to discuss about it

empty crypt Aug 26, 2024, 7:52 AM

#

rancid wolf hey vishaly, thanx for the code!! I went through it but why are we imputing keyw...

Hey! Just like you said, it only has 61 missing values. We can't always drop columns; if the number of missing values isn't large, it's better to impute them with the mean or mode. This way, if the column is relevant to the target column, its values are kept and can still be used.
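
The options from this exchange can be compared side by side. A minimal pandas sketch on a made-up six-row frame rather than the competition's train.csv; the placeholder label "no_keyword" is an arbitrary choice:

```python
import pandas as pd

train = pd.DataFrame(
    {
        "keyword": ["fire", None, "flood", "fire", None, "earthquake"],
        "target": [1, 0, 1, 1, 0, 1],
    }
)

# Option A: impute with the most frequent keyword (the mode).
mode_keyword = train["keyword"].mode()[0]
imputed_mode = train["keyword"].fillna(mode_keyword)

# Option B: keep the missingness visible as its own category.
imputed_flag = train["keyword"].fillna("no_keyword")

# Option C: drop the rows that have no keyword.
dropped = train.dropna(subset=["keyword"])

print(imputed_mode.value_counts())
print(imputed_flag.value_counts())
print(len(train), "->", len(dropped))
```

Option A makes every missing row look like the most common keyword, which is the bias concern raised above. Option B avoids that while keeping all rows.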

rancid wolf Aug 26, 2024, 9:08 PM

#

empty crypt Hey! Just like you said, it only has 61 missing values, we can't always drop col...

Oh okay vishaly!! Will keep this in mind!!

vapid charm Sep 1, 2024, 10:30 PM

#

Hey Guys, any youtube video you recommend about this competition? I'm really struggling on this one

wintry bison Sep 8, 2024, 10:15 PM

#

hello guys, i have just started this competition, any one that want to collaborate???

rigid wolf Oct 15, 2024, 10:28 PM

#

wintry bison hello guys, i have just started this competition, any one that want to collabora...

Are you still interested?

last sandal Oct 16, 2024, 4:46 AM

#

link to the comptn?

potent island Oct 16, 2024, 6:18 AM

#

rigid wolf Are you still interested?

Hi Amit, I am interested in a collab. But I am kinda novice.

cedar turret Oct 19, 2024, 7:14 AM

#

Hi everyone. I’m looking for suggestions on tackling this problem. I have about 100,000 unlabeled job descriptions that I’m trying to use to determine the job category. For example, from a job description text I want to know if it’s in IT, Software, Admin/Clerical etc. I tried using pretrained models from Hugging Face Transformers but it didn’t work well. I have thought about labeling the data, but it would take time to do that for 100,000 records.

rancid kestrel Nov 8, 2024, 6:02 PM

#

Has anybody tried to include the "keyword" and "location" columns into your model? All the notebooks I looked at so far didn't include these columns. Anyway, if you did include them, how did you encode them? The "keyword" column has ~222 unique values and the "location" column has ~3341 unique values. I don't think one-hot-encoding makes sense in this case. Any thoughts?

reef hull Nov 17, 2024, 7:13 AM

#

@tawny shore please don't share irrelevant content

potent island Nov 18, 2024, 2:35 AM

#

rancid kestrel Has anybody tried to include the "keyword" and "location" columns into your mode...

you can try target encoding
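
Target encoding replaces each category with the average target value seen for it, so a column with hundreds of unique values becomes one numeric column. A plain-Python sketch with smoothing toward the global mean; the function names, the smoothing value and the toy keywords are all assumptions for illustration:

```python
from collections import defaultdict


def fit_target_encoding(categories, targets, smoothing=2.0):
    """Map each category to a smoothed mean of the target."""
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    # Rare categories are pulled toward the global mean.
    encoding = {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }
    return encoding, global_mean


def transform(categories, encoding, global_mean):
    """Encode categories; unseen ones fall back to the global mean."""
    return [encoding.get(cat, global_mean) for cat in categories]


keywords = ["ablaze", "ablaze", "accident", "accident", "accident", "aftershock"]
targets = [1, 1, 0, 1, 0, 0]

encoding, global_mean = fit_target_encoding(keywords, targets)
encoded = transform(["ablaze", "accident", "unseen"], encoding, global_mean)
print(encoded)
```

To avoid leaking the label into the feature, the encoding should be fitted on the training folds only and then applied to the validation or test rows.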

bitter birch Nov 23, 2024, 12:01 PM

#

is there anyone doing this project right now? I wanna join.

graceful sleet Nov 9, 2025, 7:25 AM

#

Hii

dull otter Feb 8, 2026, 12:59 PM

#

https://www.kaggle.com/code/suhanigupta04/disaster-tweet-classification-using-roberta

#🚑┊nlp-with-disaster-tweets
