Results
Results on ASR Tasks
| Models | MDRM-test | SPGISpeech-test | Earning-21 | Earning-22 |
|---|---|---|---|---|
| Whisper-v3 | 2.14 | 2.88 | 11.85 | 15.93 |
| Qwen2-Audio-7B | 3.97 | 4.42 | 26.06 | 42.76 |
| Qwen2-Audio-7B-Instruct | 4.68 | 5.74 | 29.58 | 33.65 |
| SALMONN-7B | 51.52 | 39.51 | 83.20 | 88.50 |
| SALMONN-13B | 49.17 | 41.17 | 80.54 | 86.23 |
| Gemini-1.5-flash | 4.850 | 5.802 | 18.58 | 27.13 |
| Gemini-2.0-flash | 4.321 | 5.143 | 19.17 | 28.12 |
| GPT-4o-audio-transcribe | 4.23 | 4.66 | 15.78 | 21.37 |
Results on Summarization Task
| Metrics | Whisper-v3 | Qwen2-Audio-7B-Instruct | Gemini-1.5-flash | GPT-4o-audio |
|---|---|---|---|---|
| Rouge-L | 0.053 | 0.048 | 0.072 | 0.063 |
| BertScore | 0.514 | 0.467 | 0.553 | 0.508 |
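
For readers unfamiliar with the Rouge-L metric reported above, the following is a minimal sketch of how an LCS-based Rouge-L F1 score can be computed between a generated summary and a reference. It is an illustration only: the function names `lcs_length` and `rouge_l_f1`, the whitespace tokenization, and the choice of the F1 variant are assumptions, and the scores in the table may have been produced with a different implementation or configuration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    # Row-by-row dynamic programme; prev holds the previous row of the table.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Rouge-L F1 on lowercased, whitespace-split tokens (assumed tokenization)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical example strings, not taken from the benchmark data.
    print(rouge_l_f1("revenue grew in the third quarter",
                     "the company said revenue grew strongly in the third quarter"))
```

A score near 0 means the summary shares almost no in-order tokens with the reference, while 1.0 means the two token sequences are identical under this tokenization.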