📈 Performance Dashboard

Overview

To make the performance of the SpecBundle draft models easier to explore, we have built an interactive dashboard for browsing the evaluation results. We evaluate the draft models under different speculative decoding configurations (i.e., steps, topk, and num_draft_tokens) on the following benchmarks:

  • Conversation

    • MTBench

  • General Knowledge

    • GPQA

    • FinanceQA

  • Math

    • GSM8K

    • Math500

  • Coding

    • HumanEval

    • LiveCodeBench
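
The evaluation space described above is the cross product of benchmarks and speculative decoding configurations. The following is a minimal sketch of that grid, not the project's actual evaluation code: the `SpecConfig` class, the `evaluation_grid` helper, and the numeric configuration values are hypothetical and chosen only for illustration. The benchmark names and the three parameter names come from this page.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SpecConfig:
    """One speculative decoding configuration (hypothetical container)."""
    steps: int
    topk: int
    num_draft_tokens: int


# Benchmarks grouped by category, as listed on this page.
BENCHMARKS = {
    "Conversation": ["MTBench"],
    "General Knowledge": ["GPQA", "FinanceQA"],
    "Math": ["GSM8K", "Math500"],
    "Coding": ["HumanEval", "LiveCodeBench"],
}

# Illustrative values only; the configurations actually evaluated
# are the ones shown on the dashboard.
CONFIGS = [
    SpecConfig(steps=3, topk=1, num_draft_tokens=4),
    SpecConfig(steps=5, topk=4, num_draft_tokens=8),
]


def evaluation_grid(configs, benchmarks):
    """Return every (benchmark name, configuration) pair to evaluate."""
    names = [name for group in benchmarks.values() for name in group]
    return list(product(names, configs))


grid = evaluation_grid(CONFIGS, BENCHMARKS)
for benchmark, config in grid:
    print(benchmark, config)
```

With seven benchmarks and two configurations, this grid contains 14 pairs; each pair corresponds to one evaluation run of the kind the dashboard reports.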

Check out the Performance Dashboard for more details.