It’s quite unusual for that to happen unless you are using the Multilingual v2 model with Style set above 0 (0 turns the Style effect off entirely) or have Stability set very low.
Which Style and Stability values, if any, introduce issues depends on the particular voice. I find that a Stability of 40% or higher is generally safe for long-form content with most voices. Many voices can handle a low Style value without much issue, and some can handle high Style values.
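If you set these through the API rather than the sliders, a rough sketch of the conservative starting point described above might look like the following. The helper names are made up for illustration, and the field names (`voice_settings`, `stability`, `style`, `model_id`) and the model ID `eleven_multilingual_v2` are from my memory of the ElevenLabs text-to-speech API, so treat them as assumptions and check the current API reference. The sketch only builds the request body and does not send anything.

```python
# Hypothetical helpers: they only build a settings payload, no network calls.
# Field names and the model ID are assumptions; verify against the API docs.

def conservative_voice_settings(stability_pct=40, style_pct=0):
    """Convert slider-style percentages into 0.0-1.0 fractions."""
    for name, value in (("stability", stability_pct), ("style", style_pct)):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    return {
        "stability": stability_pct / 100,  # 40% or higher: usually safe for long form
        "style": style_pct / 100,          # 0 turns the Style effect off entirely
    }


def build_request_body(text, settings, model_id="eleven_multilingual_v2"):
    """Assemble the JSON body for a text-to-speech request (assumed layout)."""
    return {"text": text, "model_id": model_id, "voice_settings": settings}


if __name__ == "__main__":
    body = build_request_body("A long chapter of narration.", conservative_voice_settings())
    print(body)
```

From there you can raise Style or lower Stability a little at a time for a given voice and regenerate a short sample to see where it starts to misbehave.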