The error KeyError: 'input_features' in your STT (speech-to-text) notebook occurs because the data collator reads an "input_features" key from every example in a batch, but at least one example in your dataset does not contain it. This usually means the preprocessing step did not generate "input_features" for all examples, or the formatting function was never applied to the dataset before training.
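To confirm which examples are affected, a quick check like the one below may help. It is only a sketch: the helper name find_missing_keys is hypothetical, and it assumes your dataset can be iterated as a sequence of dictionaries.

```python
from typing import Any, Dict, List, Mapping, Sequence


def find_missing_keys(
    examples: Sequence[Mapping[str, Any]],
    required: Sequence[str] = ("input_features", "labels"),
) -> Dict[int, List[str]]:
    """Map each incomplete example's index to the required keys it lacks."""
    missing: Dict[int, List[str]] = {}
    for index, example in enumerate(examples):
        absent = [key for key in required if key not in example]
        if absent:
            missing[index] = absent
    return missing


# Toy data: the second example was never preprocessed.
sample = [
    {"input_features": [[0.0]], "labels": [1, 2]},
    {"audio": {"array": [0.0], "sampling_rate": 16000}, "text": "hi"},
]
report = find_missing_keys(sample)
print(report)
```

An empty result means every example has both keys, so the cause is likely elsewhere (for example, the trainer was given the raw dataset instead of the formatted one).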
To fix this, ensure you preprocess your dataset with a function that extracts audio features and creates the "input_features" and "labels" keys for each example, as shown in the official Unsloth Whisper notebook. See the relevant code and explanation in Kaggle-Whisper.py and Kaggle-Whisper.ipynb. Would you like a step-by-step guide on how to apply this preprocessing?
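As a rough illustration of that preprocessing step, the sketch below builds both keys for every example. It is not the code from the Unsloth notebook: the function name format_example, the "audio" and "text" column names, and the two stand-in callables are assumptions made for the example. With a Whisper processor from transformers, the callables would typically wrap the processor's feature extractor and tokenizer.

```python
from typing import Any, Callable, Dict, List, Sequence


def format_example(
    example: Dict[str, Any],
    extract_features: Callable[[Sequence[float], int], Any],
    tokenize: Callable[[str], List[int]],
) -> Dict[str, Any]:
    """Build the two keys the data collator reads from each example."""
    audio = example["audio"]  # assumed column name
    return {
        "input_features": extract_features(audio["array"], audio["sampling_rate"]),
        "labels": tokenize(example["text"]),  # assumed column name
    }


# Stand-ins so the sketch runs without downloading a model.
def fake_extract_features(array: Sequence[float], sampling_rate: int) -> List[List[float]]:
    return [[float(len(array))]]


def fake_tokenize(text: str) -> List[int]:
    return [ord(ch) for ch in text]


raw = [
    {"audio": {"array": [0.0, 0.1], "sampling_rate": 16000}, "text": "hi"},
    {"audio": {"array": [0.2], "sampling_rate": 16000}, "text": "ok"},
]

# Apply the function to EVERY example before handing the data to the trainer.
formatted = [
    format_example(example, fake_extract_features, fake_tokenize)
    for example in raw
]
```

The important point is that the trainer receives formatted, not raw; passing the unformatted dataset would reproduce the same KeyError.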