This addresses the model's unnatural gestures, which look like seizures because keywords trigger movements too frequently.
Please adapt BEAT: A Large-Scale Semantic and Emotional Multi-Modal Dataset for Conversational Gestures Synthesis. Everything is already available at pantomatrix.github.io/BEAT/
This works out of the box with an FBX 3D model; for Live 2D, you just need to use some of its output values to drive the slider values.
https://www.youtube.com/watch?v=F6nXVTUY0KQ
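As a rough illustration of driving a slider from motion values, here is a minimal sketch. It assumes the motion data gives a per-frame joint rotation in degrees and that the slider accepts a value in a fixed range. The function names, the 30-degree limit, and the [-1, 1] range are all hypothetical and not taken from BEAT or Live 2D.

```python
def angle_to_slider(angle_deg, max_angle_deg=30.0, lo=-1.0, hi=1.0):
    """Linearly map a joint rotation (degrees) to a slider range, clamped.

    max_angle_deg, lo and hi are illustrative defaults, not values
    defined by BEAT or Live 2D.
    """
    # Normalize [-max_angle, +max_angle] to [0, 1], then clamp.
    t = (angle_deg + max_angle_deg) / (2.0 * max_angle_deg)
    t = min(1.0, max(0.0, t))
    return lo + t * (hi - lo)


def smooth(values, alpha=0.3):
    """Exponential moving average to damp frame-to-frame jitter."""
    out = []
    prev = None
    for v in values:
        prev = v if prev is None else alpha * v + (1.0 - alpha) * prev
        out.append(prev)
    return out


if __name__ == "__main__":
    # Hypothetical head-yaw angles for a few frames.
    frames = [0.0, 10.0, 45.0, -60.0]
    sliders = smooth([angle_to_slider(a) for a in frames])
    print(sliders)
```

Smoothing the mapped values before sending them to the slider is one way to keep the motion from looking jerky; the right `alpha` would need tuning by eye.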
Also, for song performances, reduce the number of movements. I understand it is trying to follow the BPM, but it looks like seizures if it does this for the entire song. Instead, fine-tune which slider to trigger based on [low, mid, high] beats (for example, use low beats 1:3 at a time for rhythmic taps, then high beats 1:10 at a time for eye expressions). You can find the code here:
github.com/maxemitchell/beat-detection-python (the dev talks about it here: https://www.youtube.com/watch?v=frDQFCRFMB8 ). Other audio visualizers on GitHub are also a good source of code.
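To make the throttling idea concrete, here is a minimal sketch. It assumes a beat detector has already labelled each detected beat as "low", "mid", or "high", and it reads "1:3" and "1:10" as "fire once every 3 (or 10) beats in that band", which is only one possible interpretation. The slider names are hypothetical placeholders, not real Live 2D parameters, and none of this comes from beat-detection-python.

```python
# Hypothetical slider names, for illustration only.
DEFAULT_SLIDERS = {"low": "ParamRhythmTap", "high": "ParamEyeExpression"}

# Fire once per N beats in each band (assumed reading of 1:3 and 1:10).
DEFAULT_EVERY_N = {"low": 3, "high": 10}


def plan_triggers(beats, every_n=None, sliders=None):
    """Return (beat_index, slider_name) pairs for the beats that should fire.

    beats: sequence of band labels ("low", "mid", "high"), one per
           detected beat, in time order.
    Bands without an entry in every_n (here "mid") never fire.
    """
    every_n = DEFAULT_EVERY_N if every_n is None else every_n
    sliders = DEFAULT_SLIDERS if sliders is None else sliders
    counts = {band: 0 for band in every_n}
    fired = []
    for i, band in enumerate(beats):
        if band not in counts:
            continue  # band is not mapped to any slider
        counts[band] += 1
        if counts[band] >= every_n[band]:
            counts[band] = 0
            fired.append((i, sliders[band]))
    return fired


if __name__ == "__main__":
    beats = ["low"] * 6 + ["mid"] * 2 + ["high"] * 10
    for index, slider in plan_triggers(beats):
        print(f"beat {index}: trigger {slider}")
```

With this kind of gate, most detected beats produce no movement at all, which is the point: the model taps along occasionally instead of reacting to every beat of the song.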
More code:
- https://github.com/ai4r/Gesture-Generation-from-Trimodal-Context
- https://github.com/TheTempAccount/Co-Speech-Motion-Generation
-- WIP --
Paper (ECCV 2022): https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136670605.pdf
Project Page: https://pantomatrix.github.io/BEAT/
Institutions: Huawei Technologies, The University of Tokyo, Keio University, JAIST.
We present a new conversational gestures dataset (BEAT) with cascaded motion network (CaMN) model as a baseline for synthe...