#🚑┊nlp-with-disaster-tweets

solid tree
#

Tweet data is always fun (or is it "X" data now?). Disaster vs non-disaster... so much data yet so much noise to filter out. How did this comp go for you?

blissful citrus
#

I am currently working on the Kaggle NLP Twitter disaster project. I have hit rock bottom with my code and I am not sure what I am doing wrong. Please, I need help.

blissful citrus
#

Sorry, I meant: I had used "return lemmatizer(stopword(finalpreprocess(string)))" and it threw a RecursionError.
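A RecursionError from a line like this often means the wrapper function shares its name with one of the functions it calls, so it keeps calling itself. The original code isn't shown, so that is only a guess. A minimal sketch of a non-recursive pipeline, assuming hypothetical helpers `clean_text`, `remove_stopwords` and `lemmatize` (the toy stopword set and suffix stripping stand in for whatever the real code uses):

```python
import re

# Hypothetical toy stopword list; real code would likely use a library list.
STOPWORDS = {"the", "a", "an", "is", "in", "on", "of", "and"}


def clean_text(text):
    """Lowercase, strip URLs and punctuation, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def remove_stopwords(text):
    """Drop every word found in STOPWORDS."""
    return " ".join(w for w in text.split() if w not in STOPWORDS)


def lemmatize(text):
    """Crude stand-in for a real lemmatizer: strips a trailing 's'."""
    return " ".join(
        w[:-1] if w.endswith("s") and len(w) > 3 else w
        for w in text.split()
    )


def final_preprocess(text):
    # Each step calls a *different* function, so nothing calls itself.
    return lemmatize(remove_stopwords(clean_text(text)))


print(final_preprocess("The fires in the hills! http://t.co/x"))
```

The key point is only the naming: the outer function (`final_preprocess` here) must not appear inside its own return expression.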

elder dew
#

I have an error message that goes "keras_nlp does not have attribute version"

odd acorn
# elder dew

"keras_nlp does not have attribute version" - what is unclear in that message? Unlike most Python packages, this one doesn't let you look up the version with that line. You can either dig deeper to find out where the package stores its version, or simply delete that line, since it doesn't contain anything that is truly necessary for the script to run.

elder dew
#

okay thanks

#

just commented it out
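For anyone who does want the version rather than dropping the line: the standard library's `importlib.metadata` can look up an installed package's version without relying on a version attribute on the module. A minimal sketch, assuming the package is installed under the distribution name `keras-nlp` (the helper name `installed_version` is made up for this example):

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "keras-nlp" is assumed to be the distribution name used by pip.
print(installed_version("keras-nlp"))
```

This prints `None` instead of raising when the package is missing, so it is safe to leave in a notebook.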

silver hedge
#

import keras_nlp

# Load a DistilBERT model.
preset = "distil_bert_base_en_uncased"

# Use a shorter sequence length.
preprocessor = keras_nlp.models.DistilBertPreprocessor.from_preset(
    preset,
    sequence_length=160,
    name="preprocessor_4_tweets",
)

# Pretrained classifier.
classifier = keras_nlp.models.DistilBertClassifier.from_preset(
    preset,
    preprocessor=preprocessor,
    num_classes=2,
)

classifier.summary()

#

Can anyone help me with this error msg?

bold hollow
#

Hello, I am training a model with X_train.shape = (60, 40) and y_train.shape = (60,). My model code is as follows:
model = Sequential()
### first layer
model.add(Dense(32, input_shape=(60,)))
model.add(Activation('relu'))

### second layer
model.add(Dense(10))
model.add(Activation('relu'))

### third layer
model.add(Dense(10))
model.add(Activation('relu'))

### final layer
model.add(Dense(num_labels))
model.add(Activation('softmax'))
but it is giving this error

#

What should I do? Please help.

glass valve
#

you need to update the layer architecture to fix the bug
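The error itself isn't shown, but one likely culprit is the first layer's `input_shape=(60,)`. In Keras, `input_shape` describes a single sample and leaves out the sample (batch) axis, so for `X_train.shape = (60, 40)` it would be the 40 features, not the 60 rows. A small sketch of that rule in plain Python (the helper names are made up for this example):

```python
def per_sample_shape(array_shape):
    """Drop the leading sample axis: (60, 40) -> (40,)."""
    return tuple(array_shape[1:])


def check_input_shape(declared, array_shape):
    """Raise if a declared input_shape doesn't match the data."""
    expected = per_sample_shape(array_shape)
    if tuple(declared) != expected:
        raise ValueError(
            f"input_shape={tuple(declared)} but each sample has shape {expected}"
        )
    return expected


X_train_shape = (60, 40)  # 60 samples, 40 features each (from the question)
print(per_sample_shape(X_train_shape))  # (40,)
```

Under that assumption the first layer would become `Dense(32, input_shape=(40,))`, or equivalently `input_shape=X_train.shape[1:]`.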

bold hollow
#

I am extracting features from audio signals and now want to compare them with the real voice input. But I am unable to figure out how to do it. If anyone can guide me for this I will be really grateful.

glass valve
#

Voice sounds unrelated to disaster tweets, which is what this channel is about.

daring zinc
#

Hi, I want to join this competition

livid mesa
#

Hello! I was wondering, in the getting started notebook for this competition, what are the three numbers within the array when finding your score?

neon flint
# livid mesa

It's the individual score for each cross-validation run. Since you chose cv = 3, it returns 3 outputs.
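To make the "one score per run" idea concrete, here is a hand-rolled sketch of 3-fold cross-validation in plain Python. It is not how scikit-learn implements it: it assumes contiguous folds, a majority-class "model", and accuracy as the metric (the notebook's metric may differ), and the labels are made up.

```python
def k_fold_scores(labels, k):
    """Split into k contiguous folds; score a majority-class baseline
    on each held-out fold and return one accuracy per fold."""
    n = len(labels)
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    scores, start = [], 0
    for size in fold_sizes:
        held_out = labels[start:start + size]
        train = labels[:start] + labels[start + size:]
        # "Train": pick the most common label in the training part.
        majority = max(set(train), key=train.count)
        # "Evaluate": accuracy on the held-out fold.
        correct = sum(1 for y in held_out if y == majority)
        scores.append(correct / size)
        start += size
    return scores


toy_labels = [0, 0, 1, 0, 1, 0, 0, 1, 0]  # made-up targets
scores = k_fold_scores(toy_labels, k=3)
print(scores)                     # three numbers, one per fold
print(sum(scores) / len(scores))  # their mean is the usual summary
```

Each fold is held out once while the rest is used for fitting, which is why `cv = 3` gives an array of three numbers.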

livid mesa
#

Why does this only show 5 of the predicted data points? How do we see the submission file? I can’t find the viewer it mentions.

drifting dew
#

You are currently looking at the code editor. Once you click "save version" in the top right you can look at the notebook in the viewer and then under the files tab can submit the output to the competition.

Alternatively, you should see options on the right hand side of the screen in the editor to submit to the competition.

livid mesa
#

Thank you! What is clf? For example: scores = model_selection.cross_val_score(clf, train_vectors, train_ …)
Is it the model being used for the estimator? What does it stand for?

tender sentinel
#

anyone interested in doing this project together

wooden plover
#

Anyone interested to do this project together?

sour furnace
#

anyone up for discussion? I am at EDA

odd gazelle
#

Hey all, this is my first NLP project and I am very excited to join you! I have already spent a week on this competition and I'm trying to improve my score, so I'll be checking this constantly ☺️

knotty pivot
#

Hi, I wanted some help with the DistilBERT model. I was referring to this notebook: https://www.kaggle.com/code/alexia/kerasnlp-starter-notebook-disaster-tweets/notebook#Load-a-DistilBERT-model-from-Keras-NLP
But I don't understand the reason for the parameters used in:
preprocessor = keras_nlp.models.DistilBertPreprocessor.from_preset(
    preset,
    sequence_length=160,
    name="preprocessor_4_tweets",
)

Why did they set the sequence length to 160?
Do we have to figure that out ourselves, or is a reference given?

#

It is in the Load a DistilBERT model from Keras NLP section of the notebook
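One way to sanity-check a value like 160 yourself is to look at how long the tweets actually are and pick a length that covers nearly all of them. A rough sketch, assuming whitespace-split words as a stand-in for the model's subword tokens (real subword counts are usually somewhat higher) and a few illustrative tweets rather than the real dataset:

```python
import math


def length_percentile(texts, pct):
    """Nearest-rank percentile of whitespace token counts."""
    lengths = sorted(len(t.split()) for t in texts)
    rank = max(1, math.ceil(pct / 100 * len(lengths)))
    return lengths[rank - 1]


# Illustrative examples only; in practice pass the full text column.
sample_tweets = [
    "Forest fire near La Ronge Sask. Canada",
    "Just happened a terrible car crash",
    "I love fruits",
    "Residents asked to shelter in place, no evacuation orders expected",
]
print(length_percentile(sample_tweets, 95))
print(max(len(t.split()) for t in sample_tweets))
```

If the high percentiles sit well below the chosen sequence length, very little text gets truncated, and a shorter length means less padding to process.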

sterile plank
#

anyone interested in learning nlp through competition? PyTorch person please

azure ingot
#

Hey, I'm a beginner in NLP. Should I go with an LSTM model, or are there any suggestions for alternative models?

azure ingot
#

Also, since I'm using PyTorch, I'd like to know some tokenization techniques, because torchtext doesn't seem stable. Any suggestions?
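One dependency-free option is a small regex tokenizer plus a hand-built vocabulary. The sketch below is only one possible approach, with made-up helper names (`tokenize`, `build_vocab`, `encode`); a pretrained subword tokenizer from a library such as Hugging Face `transformers` is another common choice.

```python
import re

# Order matters: URLs, mentions and hashtags are tried before plain words.
TOKEN_RE = re.compile(
    r"https?://\S+"    # URLs
    r"|@\w+"           # @mentions
    r"|#\w+"           # hashtags
    r"|\w+(?:'\w+)?"   # words, with an optional apostrophe part
    r"|[^\w\s]"        # any single punctuation mark or symbol
)


def tokenize(text):
    """Lowercase the text and split it into tweet-aware tokens."""
    return TOKEN_RE.findall(text.lower())


def build_vocab(texts, specials=("<pad>", "<unk>")):
    """Assign an integer id to every token seen in `texts`."""
    vocab = {tok: i for i, tok in enumerate(specials)}
    for text in texts:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(text, vocab):
    """Map tokens to ids, falling back to <unk> for unseen tokens."""
    unk = vocab["<unk>"]
    return [vocab.get(tok, unk) for tok in tokenize(text)]


vocab = build_vocab(["fire in town"])
print(encode("fire in city", vocab))
```

The resulting id lists can then be padded to a common length and turned into tensors for an embedding layer.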

vague oyster
#

If anyone is interested in improving the accuracy, I got 0.82

rough breach
#

Hi guys, When does the competition start and end?

vapid charm
#

The Getting Started competitions don't have a start or end time; they are always open.

rough breach
#

And is there an evaluation?

vapid charm
#

Hey guys, is there any YouTube video you recommend about this competition? I'm really struggling with this one

wintry bison
#

Hello guys, I have just started this competition. Does anyone want to collaborate?

last sandal
#

Link to the competition?

cedar turret
#

Hi everyone. I’m looking for suggestions on tackling this problem. I have about 100,000 unlabeled job descriptions that I’m trying to use to determine the job category. For example, from a job description text I want to know if it’s in IT, Software, Admin/Clerical, etc. I tried using pretrained models from Hugging Face Transformers, but they didn’t work well. I have thought about labeling the data, but that would take a long time for 100,000 records.

rancid kestrel
#

Has anybody tried to include the "keyword" and "location" columns into your model? All the notebooks I looked at so far didn't include these columns. Anyway, if you did include them, how did you encode them? The "keyword" column has ~222 unique values and the "location" column has ~3341 unique values. I don't think one-hot-encoding makes sense in this case. Any thoughts?
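Two options that avoid thousands of one-hot columns are folding the keyword into the tweet text, so the text model sees it directly, and replacing each location with how often it occurs. A minimal sketch of both, with made-up rows and hypothetical helper names; `unquote` is there in case the keywords contain URL-style escapes such as `%20`:

```python
from collections import Counter
from urllib.parse import unquote


def add_keyword_to_text(rows):
    """Prefix each tweet with its keyword so a text model can see it."""
    out = []
    for row in rows:
        keyword = unquote(row.get("keyword") or "").strip()
        text = row["text"]
        out.append(f"{keyword}: {text}" if keyword else text)
    return out


def frequency_encode(values, min_count=2):
    """Map each category to its relative frequency; rare or missing -> 0.0."""
    counts = Counter(v for v in values if v)
    total = sum(counts.values())
    return [
        counts[v] / total if v and counts[v] >= min_count else 0.0
        for v in values
    ]


rows = [
    {"keyword": "forest%20fire", "text": "Smoke everywhere"},
    {"keyword": None, "text": "Nice day"},
]
print(add_keyword_to_text(rows))
print(frequency_encode(["USA", "USA", "London", None]))
```

Whether either helps the score is an empirical question; free-text locations in particular are noisy, so it is worth comparing against a text-only baseline.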

reef hull
#

@tawny shore please don't share irrelevant content

bitter birch
#

is there anyone doing this project right now? I wanna join.

graceful sleet
#

Hii