Hey. I'm ruzin, the lead maintainer of StenoAI, a privacy-focused AI meeting notetaker. It keeps your data fully private by using locally hosted small language models for transcription and summarisation. https://stenoai.co
Anyways, one of the issues I have really struggled with is getting more performance from my locally hosted DeepSeek 7B-parameter model. I have hit the limits of prompt engineering. Atm, my summaries are getting to 60%, maybe 70%, of the quality of a large language model, which is crazy, but there are still reliability and accuracy issues!
So if anyone has any ideas or experience, please let me know.