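This model handles both punctuation and multi-speaker separation, but unfortunately it's Python-only and its RTF is just 0.5, whereas your C++ solution already gets an RTF close to 0.1: https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn/summary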