"If you can’t measure it, you can’t improve it." -- Peter Drucker
The SpeechIO leaderboard serves as an ASR benchmarking platform by providing three components:
- TestSet Zoo: a collection of test sets covering a wide range of speech recognition tasks & scenarios
- Model Zoo: a collection of models, including commercial APIs & open-source models
- Benchmarking Pipeline: a simple & well-specified pipeline that takes care of data preparation / recognition / post-processing / error rate evaluation

With these, people should be able to easily benchmark, reproduce, and examine each other's ASR systems.
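To make the error rate evaluation step concrete, here is a minimal Python sketch of character error rate (CER), the metric reported in the ZH leaderboard tables. It is an illustration under stated assumptions, not the leaderboard's actual pipeline: the function names (`normalize`, `edit_distance`, `corpus_cer`) and the toy normalization (NFKC folding, upper-casing, dropping whitespace and punctuation) are hypothetical, and the real post-processing rules may differ.

```python
import unicodedata
from typing import Iterable, List, Sequence, Tuple


def normalize(text: str) -> List[str]:
    """Toy normalization (an assumption, not the leaderboard's rules):
    NFKC-fold, upper-case, drop whitespace and punctuation, split into characters."""
    text = unicodedata.normalize("NFKC", text).upper()
    return [
        ch for ch in text
        if not ch.isspace() and not unicodedata.category(ch).startswith("P")
    ]


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance (substitutions + deletions + insertions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference character
                curr[j - 1] + 1,         # insertion of a hypothesis character
                prev[j - 1] + (r != h),  # substitution, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def corpus_cer(pairs: Iterable[Tuple[str, str]]) -> float:
    """CER in percent: total edit errors divided by total reference characters."""
    errors = tokens = 0
    for ref, hyp in pairs:
        r, h = normalize(ref), normalize(hyp)
        errors += edit_distance(r, h)
        tokens += len(r)
    return 100.0 * errors / tokens if tokens else 0.0


if __name__ == "__main__":
    # One substituted character out of six reference characters.
    print(round(corpus_cer([("今天天气很好", "今天天汽很好")]), 2))  # prints 16.67
```

Errors and reference lengths are summed over the whole set before dividing, so long utterances weigh more than short ones; whether the leaderboard aggregates this way is also an assumption here.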
Academic Test Sets (EN & ZH)
| 已公开 UNLOCKED | 编号 DATASET_ID | 说明 DESCRIPTION | 语言 LANGUAGE |
|---|---|---|---|
| ✓ | AISHELL1_TEST | test set of AISHELL-1 | zh |
| ✓ | AISHELL2_IOS_TEST | test set of AISHELL-2 (iOS channel) | zh |
| ✓ | AISHELL2_ANDROID_TEST | test set of AISHELL-2 (Android channel) | zh |
| ✓ | AISHELL2_MIC_TEST | test set of AISHELL-2 (Microphone channel) | zh |
| ✓ | ALIMEETING_EVAL_NEAR_FIELD | AliMeeting | zh |
| ✓ | ALIMEETING_TEST_NEAR_FIELD | AliMeeting | zh |
| ✓ | ALIMEETING_EVAL_FAR_FIELD | AliMeeting | zh |
| ✓ | ALIMEETING_TEST_FAR_FIELD | AliMeeting | zh |
| ✓ | LIBRISPEECH_TEST_CLEAN | "test_clean" set of LibriSpeech | en |
| ✓ | LIBRISPEECH_TEST_OTHER | "test_other" set of LibriSpeech | en |
| ✓ | TEDLIUM_RELEASE3_LEGACY_DEV | tedlium release 3, legacy dir dev set TEDLium3 | en |
| ✓ | TEDLIUM_RELEASE3_LEGACY_TEST | tedlium release 3, legacy dir test set TEDLium3 | en |
| ✓ | GIGASPEECH_V1.0.0_DEV | dev set of GigaSpeech | en |
| ✓ | GIGASPEECH_V1.0.0_TEST | test set of GigaSpeech | en |
| ✓ | VOXPOPULI_V1.0_EN_DEV | dev set of VoxPopuli | en |
| ✓ | VOXPOPULI_V1.0_EN_TEST | test set of VoxPopuli | en |
| ✓ | VOXPOPULI_V1.0_EN_ACCENTED_TEST | accented test set of VoxPopuli | en |
| ✓ | COMMON_VOICE_V11.0_DEV | dev set of Common Voice | en |
| ✓ | COMMON_VOICE_V11.0_TEST | test set of Common Voice | en |
SpeechIO Test Sets (ZH)
SpeechIO test sets are carefully curated by the SpeechIO authors. They are crawled from publicly available sources (YouTube, TV programs, podcasts, etc.), cover various well-known scenarios and topics, and are transcribed by paid professional annotators.

| 已公开 UNLOCKED | 编号 DATASET_ID | 名称 NAME | 场景 SCENARIO | 内容领域 TOPIC | 有效时长 DURATION (HOURS) | 难度(1-5) DIFFICULTY |
|---|---|---|---|---|---|---|
| ✓ | SPEECHIO_ASR_ZH00000 | 调试集 for debugging | 视频会议、论坛演讲 conference & speech | 经济、货币、金融 economy, currency, finance | 1.0 | ★★☆ |
| ✓ | SPEECHIO_ASR_ZH00001 | 新闻联播 | 新闻播报 TV News | 时政 news & politics | 9 | ★ |
| ✓ | SPEECHIO_ASR_ZH00002 | 鲁豫有约 | 访谈电视节目 TV interview | 名人工作/生活 celebrity & film & music & daily | 3 | ★★☆ |
| ✓ | SPEECHIO_ASR_ZH00003 | 天下足球 | 专题电视节目 TV program | 足球 Sports & Football & Worldcup | 2.7 | ★★☆ |
| ✓ | SPEECHIO_ASR_ZH00004 | 罗振宇跨年演讲 | 会场演讲 Stadium Public Speech | 社会、人文、商业 Society & Culture & Business Trend | 2.7 | ★★ |
| ✓ | SPEECHIO_ASR_ZH00005 | 李永乐讲堂 | 在线教育 Online Education | 科普 Popular Science | 4.4 | ★★★ |
| ✓ | SPEECHIO_ASR_ZH00006 | 王者荣耀 张大仙 & 骚白 | 直播 Live Broadcasting | 游戏 Game | 1.6 | ★★★☆ |
| ✓ | SPEECHIO_ASR_ZH00007 | 直播带货 李佳琪 & 薇娅 | 直播 Live Broadcasting | 电商、美妆 Makeup & Online shopping/advertising | 0.9 | ★★★★☆ |
| ✓ | SPEECHIO_ASR_ZH00008 | 老罗语录 | 线下培训 Offline lecture | 段子、做人 Life & Purpose & Ethics | 1.3 | ★★★★☆ |
| ✓ | SPEECHIO_ASR_ZH00009 | 故事FM | 播客 Podcast | 人生故事、见闻 Ordinary Life Story Telling | 4.5 | ★★☆ |
| ✓ | SPEECHIO_ASR_ZH00010 | 创业内幕 | 播客 Podcast | 创业、产品、投资 Startup & Entrepreneur & Product & Investment | 4.2 | ★★☆ |
| ✓ | SPEECHIO_ASR_ZH00011 | 罗翔刑法法考 | 在线教育 Online Education | 法律 法考 Law & Lawyer Qualification Exams | 3.4 | ★★☆ |
| ✓ | SPEECHIO_ASR_ZH00012 | 张雪峰考研 | 在线教育 Online Education | 考研 高校报考 University & Graduate School Entrance Exams | 3.4 | ★★★☆ |
| ✓ | SPEECHIO_ASR_ZH00013 | 谷阿莫 牛叔说电影 | 短视频 VLog | 电影剪辑 Movie Cuts | 1.8 | ★★★ |
| ✓ | SPEECHIO_ASR_ZH00014 | 贫穷料理 琼斯爱生活 | 短视频 VLog | 美食、烹饪 Food & Cooking & Gourmet | 1 | ★★★☆ |
| ✓ | SPEECHIO_ASR_ZH00015 | 单田芳 白眉大侠 | 评书 Traditional Podcast | 江湖、武侠 Kung Fu Fiction | 2.2 | ★★☆ |
| ✓ | SPEECHIO_ASR_ZH00016 | 德云社演出 | 剧场相声 Theater Crosstalk Show | 包袱段子 Funny Stories | 1 | ★★★ |
| ✓ | SPEECHIO_ASR_ZH00017 | 吐槽大会 | 脱口秀电视节目 Standup Comedy | 明星糗事 Celebrity Jokes | 1.8 | ★★☆ |
| ✓ | SPEECHIO_ASR_ZH00018 | 小猪佩奇 熊出没 | 少儿动画 Children Cartoon | 童话故事、日常 Fairy Tale | 0.9 | ★☆ |
| ✓ | SPEECHIO_ASR_ZH00019 | CCTV5 NBA 转播 | 体育赛事解说 Sports Game Live | 篮球、NBA NBA Game | 0.7 | ★★★ |
| ✓ | SPEECHIO_ASR_ZH00020 | 篮球人物 | 纪录片 Documentary | 篮球明星、成长 NBA Super Stars' Life & History | 2.2 | ★★ |
| ✓ | SPEECHIO_ASR_ZH00021 | 汽车之家评测 | 短视频 VLog | 汽车测评 Car benchmarks, Road driving test | 1.7 | ★★★☆ |
| ✓ | SPEECHIO_ASR_ZH00022 | 小艾大叔 豪宅带看 | 短视频 VLog | 房地产、豪宅 Real estate, Mansion tour | 1.7 | ★★★ |
| ✓ | SPEECHIO_ASR_ZH00023 | 无聊开箱 Zealer评测 | 短视频 VLog | 产品开箱评测 Unboxing | 2 | ★★★ |
| ✓ | SPEECHIO_ASR_ZH00024 | 付老师种植技术 | 短视频 VLog | 农业、种植 Agriculture, Planting | 2.7 | ★★★☆ |
| ✓ | SPEECHIO_ASR_ZH00025 | 石国鹏讲历史 | 线下培训 Offline lecture | 历史,古希腊哲学 History, Greek philosophy | 1.3 | ★★☆ |
| ✓ | SPEECHIO_ASR_ZH00026 | 张震鬼故事 | 广播节目 Broadcasting Program | 鬼故事 Horror Stories | 2.4 | ★★★ |
| ✗ | SPEECHIO_ASR_ZH00027 | 华语辩论世界杯 | 辩论赛 Debate Contest | 兴趣、技能、成长 Hobby, Skill, Growth | 1.4 | ★★★ |
| ✗ | SPEECHIO_ASR_ZH00028 | 时政现场同传 | 同声传译 Simultaneous Translation | 时政、社会公共治理 News & Events on Public Governance | 2.1 | ★★★☆ |
| ✗ | SPEECHIO_ASR_ZH00029 | 港台明星访谈 周杰伦,曾志伟 张家辉,陈小春 周星驰 | 口音(港台) HongKong/Taiwan Accents | 娱乐、生活、演艺 Entertainment, Acting, Music | 1.5 | ★★★☆ |
| ✗ | SPEECHIO_ASR_ZH00030 | 世界青年说 | 口音(老外) Foreigner Accents | 异国文化比较 Cultural Difference | 2 | ★★★☆ |
| ✗ | SPEECHIO_ASR_ZH00031 | 东方甄选 | 直播 broadcast | 带货,英语教学 Online advertising & English Education | 2.4 | ★★★☆ |
| ✗ | SPEECHIO_ASR_ZH00032 | 郎朗钢琴课 | 长视频 long-form video | 音乐乐理,钢琴 Music & piano | 1.7 | ★★☆ |
| ✗ | SPEECHIO_ASR_ZH00033 | 老石谈芯 | 短视频 VLog | 芯片 chips | 2.8 | ★★★ |
| ✗ | SPEECHIO_ASR_ZH00034 | 电丸科技AK | 短视频 VLog | 网络 IT Internet tech, IT | 1.4 | ★★★☆ |
| ✗ | SPEECHIO_ASR_ZH00035 | 新氧医美 | 短视频 VLog | 医疗美容 Medical Cosmetology | 1.4 | ★★ |
| ✗ | SPEECHIO_ASR_ZH00036 | 交通广播 | 交通广播 traffic radio | 路况,娱乐 Traffic & Entertainment | 1.2 | ★★★☆ |
| ✗ | SPEECHIO_ASR_ZH00037 | 老俞闲聊 | 在线会议 Online meeting | 闲聊 chat | 2.4 | ★★★ |
| ✗ | SPEECHIO_ASR_ZH00038 | 电影:疯狂石头+疯狂赛车 | 电影 Film | 重庆话、山东青岛、四川成都话、河北唐山话、粤语、天津话、河南话、陕西话、闽南话,武汉话等 multiple accents | 1.3 | ★★★★☆ |
| ✗ | SPEECHIO_ASR_ZH00039 | 电影:1942 | 电影 Film | 河南话 HeNan Accent | 0.9 | ★★★★ |
| ✗ | SPEECHIO_ASR_ZH00040 | 电影:白鹿原 | 电影 Film | 陕西话 ShaanXi Accent | 1.1 | ★★★★★ |
| ✗ | SPEECHIO_ASR_ZH00041 | 电影:让子弹飞 | 电影 Film | 四川话 SiChuan Accent | 1.1 | ★★★★☆ |
| ✗ | SPEECHIO_ASR_ZH00042 | 电影:人生大事 | 电影 Film | 武汉话 WuHan Accent | 0.8 | ★★★★ |
| ✗ | SPEECHIO_ASR_ZH00043 | 听障 | 听障语音识别 Hearing-Impaired Speaker | 新闻脚本 News Prompts | 0.6 | ★★★★★ |
| ✗ | SPEECHIO_ASR_ZH00044 | 唐诗宋词 | 诗词朗诵 Poems Reading | 唐诗宋词 Chinese Poems | 1.1 | ★★★☆ |
| ✗ | SPEECHIO_ASR_ZH00045 | 文言文 | 文言文朗诵 Classical Chinese Reading | 论语,老子,诗经,孙子兵法 Analects, Laozi, Book of Songs, The Art of War | 0.5 | ★★★★★ |
| ✗ | SPEECHIO_ASR_ZH00046 | 音乐歌词识别 | 演唱 Singing | 歌词 Lyrics | 1.2 | ★★★★☆ |
EN Models
| 编号 MODEL_ID | 类型 TYPE | 厂商/作者 PROVIDER/AUTHOR | 简介 DESCRIPTION | 链接 URL |
|---|---|---|---|---|
| aliyun_api_en | Cloud | Alibaba | | link |
| amazon_api_en | Cloud | Amazon AWS | | link |
| baidu_api_en | Cloud | Baidu | | link |
| google_api_en | Cloud | Google | | link |
| google_USM_en | Cloud | Google | | request access |
| microsoft_sdk_en | Cloud | Microsoft Azure | | link |
| tencent_api_en | Cloud | Tencent | | link |
| coqui_model_en | Local | coqui | | link |
| deepspeech_model_en | Local | deepspeech | | link |
| k2_gigaspeech | Local | k2-fsa | | link |
| nemo_conformer_ctc_large_en | Local | NVidia NeMo | | link |
| nemo_conformer_transducer_xlarge_en | Local | NVidia NeMo | | link |
| vosk_model_en | Local | alphacephei | | link |
| vosk_model_en_large | Local | alphacephei | | link |
| whisper_large | Local | OpenAI | | link |
| whisper_large_v2 | Local | OpenAI | | link |
| data2vec_audio_large_ft_libri_960h | Local | Facebook AI | | link |
| hubert_xlarge_ft_libri_960h | Local | Facebook AI | | link |
| wav2vec2_large_robust_ft_libri_960h | Local | Facebook AI | | link |
| wavlm_base_plus_ft_libri_clean_100h | Local | Microsoft patrickvonplaten | | link |
ZH Models
Cloud Models
| 编号 MODEL_ID | 类型 TYPE | 厂商 PROVIDER | 简介 DESCRIPTION | 链接 URL |
|---|---|---|---|---|
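| aispeech_api_zh | Cloud | 思必驰 AISpeech | 思必驰开放平台 AISpeech open platform | link |
| aliyun_api_zh | Cloud | 阿里巴巴 Alibaba | 阿里云 - 一句话识别 Alibaba Cloud, short-utterance recognition | link |
| aliyun_ftasr_api_zh | Cloud | 阿里巴巴 Alibaba | 阿里云 - 文件识别(非流式) Alibaba Cloud, file transcription (non-streaming) | link |
| baidu_pro_api_zh | Cloud | 百度 Baidu | 百度智能云 (极速版) Baidu AI Cloud (fast edition) | link |
| bilibili_api_zh | Cloud | 哔哩哔哩 bilibili | 哔哩哔哩AI开放平台 bilibili AI open platform | not available yet |
| ximalaya_api_zh | Cloud | 喜马拉雅 ximalaya | 喜马拉雅AI开放平台 (转写,非流式) ximalaya AI open platform (transcription, non-streaming) | link |
| iflytek_lfasr_api_zh | Cloud | 讯飞 IFlyTek | 讯飞开放平台 (转写,非流式) IFlyTek open platform (transcription, non-streaming) | link |
| microsoft_sdk_zh | Cloud | 微软 Microsoft | Azure (流式 streaming) | link |
| microsoft_batch_zh | Cloud | 微软 Microsoft | Azure (离线转写 offline transcription) | link |
| tencent_api_zh | Cloud | 腾讯 Tencent | 腾讯云 Tencent Cloud | link |
| yitu_api_zh | Cloud | 依图 YituTech | 依图语音开放平台 YituTech speech open platform | link |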
Local Models
| 编号 MODEL_ID | 类型 TYPE | 作者 AUTHOR | 简介 DESCRIPTION |
|---|---|---|---|
| speechio_kaldi_multicn | Local | Xingyu NA(那兴宇) | Kaldi multi_cn recipe |
| vosk_model_cn | Local | alphacephei | Chinese engine of Vosk |
| paraformer_large_offline_zh | Local | modelscope | Paraformer, default Chinese 16k model, offline, support long-form audio recognition |
Follow this specification. Existing models are good references as well.
| Rank 排名 | Model 模型 | CER 字错误率 | Date 时间 |
|---|---|---|---|
| 1 | ximalaya_api_zh | 1.72% | 2025.01 |
| 2 | aliyun_ftasr_api_zh | 1.80% | 2025.01 |
| 3 | microsoft_batch_zh | 1.95% | 2025.01 |
| 4 | iflytek_lfasr_api_zh | 3.01% | 2025.01 |
| 5 | tencent_api_zh | 3.20% | 2025.01 |
| 6 | aispeech_api_zh | 3.61% | 2025.01 |
| 7 | baidu_pro_api_zh | 7.30% | 2025.01 |
| Rank 排名 | Model 模型 | CER 字错误率 | Date 时间 |
|---|---|---|---|
| 1 | microsoft_batch_zh | 5.26% | 2025.01 |
| 2 | ximalaya_api_zh | 6.89% | 2025.01 |
| 3 | aliyun_ftasr_api_zh | 6.92% | 2025.01 |
| 4 | tencent_api_zh | 7.81% | 2025.01 |
| 5 | iflytek_lfasr_api_zh | 8.70% | 2025.01 |
| 6 | aispeech_api_zh | 10.42% | 2025.01 |
| 7 | baidu_pro_api_zh | 16.23% | 2025.01 |
| Rank 排名 | Model 模型 | CER 字错误率 | Date 时间 |
|---|---|---|---|
| 1 | microsoft_batch_zh | 2.99% | 2025.01 |
| 2 | ximalaya_api_zh | 3.35% | 2025.01 |
| 3 | aliyun_ftasr_api_zh | 3.40% | 2025.01 |
| 4 | tencent_api_zh | 4.64% | 2025.01 |
| 5 | iflytek_lfasr_api_zh | 4.80% | 2025.01 |
| 6 | aispeech_api_zh | 5.75% | 2025.01 |
| 7 | baidu_pro_api_zh | 10.10% | 2025.01 |
| Model 模型 | CER 字错误率 | Date 时间 |
|---|---|---|
| bilibili_api_zh(*) | 2.49% | 2025.01 |
| Model 模型 | CER 字错误率 | Date 时间 |
|---|---|---|
| bilibili_api_zh(*) | 5.56% | 2025.01 |
| Model 模型 | CER 字错误率 | Date 时间 |
|---|---|---|
| bilibili_api_zh(*) | 3.45% | 2025.01 |
Detailed results on all test sets (字错误率 CER %)
| Test Set ID | 测试场景&内容领域 Scenario & Topic | bilibili_api_zh | Date 时间 |
|---|---|---|---|
| SPEECHIO_ASR_ZH00001 | 新闻联播 TV news | 0.53 | 2025.01 |
| SPEECHIO_ASR_ZH00002 | 访谈 TV interview | 2.83 | 2025.01 |
| SPEECHIO_ASR_ZH00003 | 电视节目 TV program | 0.97 | 2025.01 |
| SPEECHIO_ASR_ZH00004 | 场馆演讲 Stadium public speech | 1.59 | 2025.01 |
| SPEECHIO_ASR_ZH00005 | 在线教育 Online education | 1.45 | 2025.01 |
| SPEECHIO_ASR_ZH00006 | 直播 Live broadcasting | 5.76 | 2025.01 |
| SPEECHIO_ASR_ZH00007 | 直播 Live broadcasting | 6.40 | 2025.01 |
| SPEECHIO_ASR_ZH00008 | 线下培训 Offline lecture | 3.69 | 2025.01 |
| SPEECHIO_ASR_ZH00009 | 播客 Podcast | 3.18 | 2025.01 |
| SPEECHIO_ASR_ZH00010 | 播客 Podcast | 3.48 | 2025.01 |
| SPEECHIO_ASR_ZH00011 | 在线教育 Online education | 1.78 | 2025.01 |
| SPEECHIO_ASR_ZH00012 | 在线教育 Online education | 2.13 | 2025.01 |
| SPEECHIO_ASR_ZH00013 | 短视频 VLog | 3.03 | 2025.01 |
| SPEECHIO_ASR_ZH00014 | 短视频 VLog | 3.47 | 2025.01 |
| SPEECHIO_ASR_ZH00015 | 评书 Traditional storytelling | 4.83 | 2025.01 |
| SPEECHIO_ASR_ZH00016 | 相声 Crosstalk | 3.04 | 2025.01 |
| SPEECHIO_ASR_ZH00017 | 脱口秀 Standup comedy | 2.82 | 2025.01 |
| SPEECHIO_ASR_ZH00018 | 少儿卡通 Children's cartoon | 1.96 | 2025.01 |
| SPEECHIO_ASR_ZH00019 | 体育赛事解说 Sports commentary | 2.29 | 2025.01 |
| SPEECHIO_ASR_ZH00020 | 纪录片 Documentary | 1.55 | 2025.01 |
| SPEECHIO_ASR_ZH00021 | 短视频 VLog | 1.69 | 2025.01 |
| SPEECHIO_ASR_ZH00022 | 短视频 VLog | 3.47 | 2025.01 |
| SPEECHIO_ASR_ZH00023 | 短视频 VLog | 2.14 | 2025.01 |
| SPEECHIO_ASR_ZH00024 | 短视频 VLog | 4.70 | 2025.01 |
| SPEECHIO_ASR_ZH00025 | 线下课堂 Offline lecture | 3.14 | 2025.01 |
| SPEECHIO_ASR_ZH00026 | 广播电台节目 Radio program | 3.63 | 2025.01 |
| SPEECHIO_ASR_ZH00027 | 华语大学生辩论赛 Chinese-language university debate contest | 2.03 | 2025.01 |
| SPEECHIO_ASR_ZH00028 | 同声传译:时政&社会公共治理 Simultaneous interpretation: politics & public governance | 2.04 | 2025.01 |
| SPEECHIO_ASR_ZH00029 | 港台口音:港台明星访谈 Hong Kong/Taiwan accents: celebrity interviews | 3.87 | 2025.01 |
| SPEECHIO_ASR_ZH00030 | 老外口音:《世界青年说》 Foreigner accents | 3.86 | 2025.01 |
| SPEECHIO_ASR_ZH00031 | 直播带货 Live-stream shopping | 3.74 | 2025.01 |
| SPEECHIO_ASR_ZH00032 | 音乐 Music | 3.86 | 2025.01 |
| SPEECHIO_ASR_ZH00033 | 芯片 Chips | 2.45 | 2025.01 |
| SPEECHIO_ASR_ZH00034 | 网络IT Internet & IT | 5.10 | 2025.01 |
| SPEECHIO_ASR_ZH00035 | 新氧医美 Medical cosmetology | 1.13 | 2025.01 |
| SPEECHIO_ASR_ZH00036 | 交通广播 Traffic radio | 6.01 | 2025.01 |
| SPEECHIO_ASR_ZH00037 | 在线会议聊天 Online meeting chat | 3.02 | 2025.01 |
| SPEECHIO_ASR_ZH00038 | 电影:疯狂石头+疯狂赛车(方言杂烩) Film: multiple dialects | 18.36 | 2025.01 |
| SPEECHIO_ASR_ZH00039 | 电影:1942(河南话) Film: Henan dialect | 13.92 | 2025.01 |
| SPEECHIO_ASR_ZH00040 | 电影:白鹿原(陕西话) Film: Shaanxi dialect | 25.80 | 2025.01 |
| SPEECHIO_ASR_ZH00041 | 电影:让子弹飞(四川话) Film: Sichuan dialect | 11.37 | 2025.01 |
| SPEECHIO_ASR_ZH00042 | 电影:人生大事(武汉话) Film: Wuhan dialect | 18.24 | 2025.01 |
| SPEECHIO_ASR_ZH00043 | 听障 Hearing-impaired speakers | 23.34 | 2025.01 |
| SPEECHIO_ASR_ZH00044 | 诗词 Poetry | 1.64 | 2025.01 |
| SPEECHIO_ASR_ZH00045 | 文言文 Classical Chinese | 4.22 | 2025.01 |
| SPEECHIO_ASR_ZH00046 | 歌词 Lyrics | 9.60 | 2025.01 |
Note: models marked with (*) are listed in the model zoo but are not yet available to the general public.
Email: leaderboard@speechio.ai





