#pii-detection-removal-from-educational-data
Gonna do soon
yes me
ok
Any idea how we will do this project?
Yes, me. I have started. You have to customize your own token classification model.
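For anyone wondering what "token classification" means in practice, here is a minimal sketch of the data side: each token gets a BIO tag, and the tags are mapped to integer ids for training. The label names below are illustrative assumptions, not the competition's official label set.

```python
# Illustrative label set (assumed for this sketch, not taken from the competition data).
LABELS = ["O", "B-NAME", "I-NAME", "B-EMAIL"]
label2id = {label: i for i, label in enumerate(LABELS)}
id2label = {i: label for label, i in label2id.items()}


def encode_labels(tags):
    """Convert string BIO tags to the integer ids a model trains on."""
    return [label2id[tag] for tag in tags]


# One toy sentence: "B-" marks the first token of an entity, "I-" a continuation.
tokens = ["My", "name", "is", "Jane", "Doe", "."]
tags = ["O", "O", "O", "B-NAME", "I-NAME", "O"]
label_ids = encode_labels(tags)
```

A pretrained model is then fine-tuned to predict one of these ids per token.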
Folks, I am new to Kaggle and I am looking at pii-detection-removal-from-educational-data and a notebook posted by Qamar Math. I am trying to see how performance is measured. Any idea how this is done?
Check this out and see if this helps: https://huggingface.co/docs/transformers/en/tasks/token_classification#evaluate
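That page uses seqeval for precision, recall and F1. To see what those numbers mean at the token level, here is a rough sketch of a micro-averaged F-beta over non-"O" tags. The function name and the choice of beta are assumptions for illustration; the competition's Evaluation page defines the official metric.

```python
def micro_fbeta(true_tags, pred_tags, beta=1.0):
    """Micro-averaged F-beta over token tags, ignoring correct "O" predictions."""
    tp = fp = fn = 0
    for true, pred in zip(true_tags, pred_tags):
        if pred != "O" and pred == true:
            tp += 1  # PII token predicted with the right tag
        else:
            if pred != "O":
                fp += 1  # predicted PII that is wrong or not PII at all
            if true != "O":
                fn += 1  # real PII token that was missed or mislabeled
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


true_tags = ["O", "B-NAME", "I-NAME", "O", "B-EMAIL"]
pred_tags = ["O", "B-NAME", "O", "B-NAME", "B-EMAIL"]
score = micro_fbeta(true_tags, pred_tags)
```

A beta above 1 weights recall more heavily than precision, so missed PII costs more than a false alarm.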
I am baffled by the data description. Why is the majority of the essays (70%) reserved for the test set? Where can I get more publicly available external datasets to bolster the training data? Can I use the provided train data (train.json - 109.5 MB) and test data (test.json - 155.69 kB) for training and testing purposes? I would very much appreciate it if someone could give me some guidance, as I am a beginner. Thank you.
A question: is creating a model a hard requirement, or can I rely on other methods to achieve the desired outcome?
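Whether the rules allow it is something to confirm on the competition page. As a sketch of what a non-model method could look like, assuming simple regex rules: the patterns and the `find_pii` helper below are hypothetical and deliberately loose, and they only cover PII with a regular shape such as emails and URLs, not names.

```python
import re

# Deliberately simple, illustrative patterns; real emails and URLs are messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"https?://\S+")


def find_pii(text):
    """Return (label, start, end, matched_text) tuples sorted by position."""
    hits = []
    for label, pattern in (("EMAIL", EMAIL_RE), ("URL", URL_RE)):
        for match in pattern.finditer(text):
            hits.append((label, match.start(), match.end(), match.group()))
    return sorted(hits, key=lambda hit: hit[1])


sample = "Contact me at jane.doe@example.com or see https://example.com/profile today."
hits = find_pii(sample)
```

Rules like these can also be combined with a model's predictions rather than replacing them.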