Hi everyone! @here
This is Sushmit, one of the hosts of this competition. It's really nice to see so many people working on this challenge. My teammates from Bengali.AI and I are available here and in the Kaggle discussion threads to answer any of your questions or concerns. Feel free to reach out!
Some pointers:
Bengali orthography is complex due to the use of diacritics and connectors. We hosted another Kaggle competition in 2019 on a dataset based on this issue. Ref: https://arxiv.org/ftp/arxiv/papers/2010/2010.00170.pdf, https://www.kaggle.com/c/bengaliai-cv19
There's also the issue of Unicode ambiguities in Bengali (there are multiple ways of writing the same thing). We provide a Python module to deal with this: https://arxiv.org/pdf/2306.01743.pdf
Insights regarding some out-of-distribution domains: https://arxiv.org/ftp/arxiv/papers/2305/2305.09688.pdf
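To make the Unicode ambiguity point concrete, here is a minimal sketch using only Python's standard `unicodedata` module. It is not the module from the paper linked above, and standard NFC normalization only covers some of the ambiguities, so treat it as an illustration of the problem rather than a full fix. The helper name `same_text` is made up for this example.

```python
import unicodedata

# Two encodings of the same Bengali syllable (KA with vowel sign O).
composed = "\u0995\u09cb"      # KA + VOWEL SIGN O (single code point for the vowel)
split = "\u0995\u09c7\u09be"   # KA + VOWEL SIGN E + VOWEL SIGN AA (two code points)


def same_text(a: str, b: str) -> bool:
    """Compare two strings after NFC normalization (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The raw strings differ even though they are meant to be the same text.
print(composed == split)            # False
# After normalization they compare equal.
print(same_text(composed, split))   # True

# Same idea for the letter YYA: precomposed vs. YA + NUKTA.
print(same_text("\u09df", "\u09af\u09bc"))  # True
```

If you compare predictions to labels as raw strings, mismatches like these can count as errors even when the text is the same, so it is worth normalizing both sides the same way.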