Results

Results on ASR Tasks

| Models | MDRM-test | SPGISpeech-test | Earnings-21 | Earnings-22 |
|---|---|---|---|---|
| Whisper-v3 | 2.14 | 2.88 | 11.85 | 15.93 |
| Qwen2-Audio-7B | 3.97 | 4.42 | 26.06 | 42.76 |
| Qwen2-Audio-7B-Instruct | 4.68 | 5.74 | 29.58 | 33.65 |
| SALMONN-7B | 51.52 | 39.51 | 83.20 | 88.50 |
| SALMONN-13B | 49.17 | 41.17 | 80.54 | 86.23 |
| Gemini-1.5-flash | 4.850 | 5.802 | 18.58 | 27.13 |
| Gemini-2.0-flash | 4.321 | 5.143 | 19.17 | 28.12 |
| GPT-4o-audio-transcribe | 4.23 | 4.66 | 15.78 | 21.37 |
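
The ASR scores above are not labeled with a metric in this section. A common choice for results of this kind is word error rate (WER) in percent, where lower is better. The sketch below assumes that reading and shows how such a score could be computed for a single utterance. The function name `word_error_rate` and the whitespace tokenization are illustrative assumptions, not the evaluation pipeline used for these results.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length.

    Assumption: plain whitespace tokenization with no text normalization.
    Real evaluations usually normalize case, punctuation, and numerals first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words consumed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("revenue grew ten percent", "revenue grew two percent"))
```

Because insertions count as errors, the score under this definition can exceed 100 when the hypothesis is much longer than the reference.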

Results on Summarization Task

| Metrics | Whisper-v3 | Qwen2-Audio-7B-Instruct | Gemini-1.5-flash | GPT-4o-audio |
|---|---|---|---|---|
| ROUGE-L | 0.053 | 0.048 | 0.072 | 0.063 |
| BERTScore | 0.514 | 0.467 | 0.553 | 0.508 |
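
For readers unfamiliar with ROUGE-L, the following is a minimal sketch of the idea behind it: an F-measure built on the longest common subsequence (LCS) between a reference summary and a candidate summary. The helper names `lcs_length` and `rouge_l_f1`, the lowercasing, and the whitespace tokenization are illustrative assumptions. Published ROUGE implementations differ in tokenization, stemming, and multi-sentence handling, so this is not the scoring code behind the numbers above.

```python
def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-L F1 between one reference and one candidate.

    Assumption: lowercase + whitespace tokens, balanced F1 (beta = 1).
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0

    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0

    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge_l_f1("the company raised guidance", "the company lowered guidance"))
```

Since the score depends only on in-order token overlap, a summary that paraphrases the reference with different wording scores low even when it is faithful, which is one reason an embedding-based metric such as BERTScore is often reported alongside it.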