Welcome to AISBench Benchmark Tool English Tutorial ✨

🌏 Introduction

AISBench Benchmark is a model evaluation tool built on OpenCompass. It is compatible with OpenCompass’s configuration system, dataset structure, and model backend implementation, and it adds support for service-based models.

Currently, AISBench supports two main types of evaluation for inference tasks:

🔍 Accuracy Evaluation: Supports accuracy verification of service-based and local models on various question-answering and reasoning benchmark datasets.

🚀 Performance Evaluation: Supports latency and throughput assessment of service-based models, as well as peak-performance testing under stress-test scenarios.