AISBench Benchmark Tool

Detailed Parameter Description

  • User Configuration Parameters
    • Command Line Parameters
      • Common Parameters
      • Accuracy Evaluation Parameters
      • Performance Evaluation Parameters
    • Configuration Constant File Parameters
  • Model Configuration Instructions
    • Service-Oriented Inference Backend
    • Local Model Backend
  • Supported Result Summary Tasks
  • Explanation of Running Modes
    • Accuracy Evaluation Scenarios
    • Performance Evaluation Scenarios

© Copyright 2025, AISBench AI System Performance Evaluation Benchmark Committee.