Inferencer Overview
Inferencer is the core component in AISBench responsible for executing model inference. It connects datasets, retrievers (Retriever), and models: it sends processed prompts to models for inference, then collects and manages the inference results.
Core Functions
The inferencer has the following core responsibilities in AISBench's evaluation workflow:
Data Preparation: Get the data list from the retriever (Retriever), including input prompts, ground truth, and other information
Model Invocation: Call the model in the way appropriate to its type (API model or local model)
API Models: Call service inference interfaces through async HTTP requests
Local Models: Directly call locally loaded models for batch inference
Result Management: Collect, process, and save inference results, including:
Model-generated text content
Inference status (success/failure)
Performance metrics (such as latency, throughput, etc., in performance mode)
Error information (if inference fails)
Status Tracking: In performance evaluation mode, track and count request statuses, including:
Number of sent requests (post)
Number of received responses (rev)
Number of failed requests (failed)
Number of completed requests (finish)
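The result fields and status counters listed above can be sketched as follows. This is a minimal illustration, not AISBench's implementation: `RequestStatus`, `fake_request`, `infer_one`, and `run` are hypothetical names, the HTTP call is replaced by a stand-in coroutine, and `finish` is assumed to count every request that has concluded, whether it succeeded or failed.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class RequestStatus:
    """Hypothetical counters mirroring the four tracked states."""
    post: int = 0    # requests sent
    rev: int = 0     # responses received
    failed: int = 0  # requests that ended in an error
    finish: int = 0  # requests that have concluded (assumption: success or failure)


async def fake_request(prompt: str) -> str:
    """Stand-in for an async HTTP call; fails on an empty prompt."""
    await asyncio.sleep(0)
    if not prompt:
        raise ValueError("empty prompt")
    return prompt.upper()


async def infer_one(prompt: str, status: RequestStatus) -> dict:
    """Send one request and record its outcome in the shared counters."""
    status.post += 1
    try:
        text = await fake_request(prompt)
        status.rev += 1
        return {"success": True, "prediction": text, "error": None}
    except ValueError as exc:
        status.failed += 1
        return {"success": False, "prediction": None, "error": str(exc)}
    finally:
        status.finish += 1


async def run(prompts: list[str]) -> tuple[list[dict], RequestStatus]:
    """Issue all requests concurrently and return results plus counters."""
    status = RequestStatus()
    results = await asyncio.gather(*(infer_one(p, status) for p in prompts))
    return list(results), status


results, status = asyncio.run(run(["hello", "", "world"]))
print(status)
```

Each result record carries the generated text, a success flag, and the error message, which matches the items under Result Management; performance metrics such as latency are left out to keep the sketch short.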
Architecture Design
Inferencer adopts a layered design built on the following base classes:
BaseInferencer: Base class for all inferencers, providing common functions such as model building and output processing
BaseApiInferencer: Base class for API model inferencers, providing async request processing, status tracking, and other functions
BaseLocalInferencer: Base class for local model inferencers, providing batch inference, data loading, and other functions
Inferencers can inherit from both BaseApiInferencer and BaseLocalInferencer as needed to support both API models and local models.
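The layering can be pictured with a small sketch. Only the three base class names come from the text above; the method names (`process_output`, `infer_api`, `infer_local`, `inference`), the `DemoInferencer` class, and the assumption that both specialised bases derive from `BaseInferencer` are illustrative and may differ from the real code.

```python
class BaseInferencer:
    """Common functions shared by every inferencer (hypothetical stand-in)."""

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name

    def process_output(self, text: str) -> str:
        # Shared output post-processing; here it only trims whitespace.
        return text.strip()


class BaseApiInferencer(BaseInferencer):
    """API path: a real implementation would issue async HTTP requests."""

    def infer_api(self, prompt: str) -> str:
        return self.process_output(f" [api:{self.model_name}] {prompt} ")


class BaseLocalInferencer(BaseInferencer):
    """Local path: a real implementation would run batches on a loaded model."""

    def infer_local(self, prompts: list[str]) -> list[str]:
        return [
            self.process_output(f" [local:{self.model_name}] {p} ")
            for p in prompts
        ]


class DemoInferencer(BaseApiInferencer, BaseLocalInferencer):
    """Inherits from both bases, so it can serve either kind of model."""

    def inference(self, prompts: list[str], use_api: bool) -> list[str]:
        if use_api:
            return [self.infer_api(p) for p in prompts]
        return self.infer_local(prompts)


demo = DemoInferencer("toy-model")
api_out = demo.inference(["hi"], use_api=True)
local_out = demo.inference(["hi", "there"], use_api=False)
```

With multiple inheritance the shared `BaseInferencer` appears only once in the method resolution order, so common helpers such as output processing are defined in a single place.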
Currently Supported Inferencer Types
AISBench currently supports the following inferencer types:
1. GenInferencer (Generative Inferencer)
Function: Inferencer for generative tasks, supporting text generation, question answering, and other tasks.
Features:
Supports both API models and local models
Supports streaming and non-streaming inference
Supports performance evaluation mode
Supports custom stopping criteria (stopping_criteria)
Use Cases:
Text generation tasks
Question answering tasks
Code generation tasks
Mathematical reasoning tasks
Implementation File: icl_gen_inferencer.py
2. MultiTurnGenInferencer (Multi-turn Dialogue Inferencer)
Function: Inferencer for multi-turn dialogue tasks, supporting multi-turn interactive dialogue scenarios.
Features:
Supports both API models and local models
Supports multiple inference modes:
every: Round-by-round inference, using the model's previous round output as the next round's input
last: Only infer the last round
every_with_gt: Round-by-round inference, but using ground truth instead of model output as context
Supports performance evaluation mode
Use Cases:
Multi-turn dialogue tasks
Tasks requiring contextual interaction
Conversational question answering tasks
Implementation File: icl_multiturn_inferencer.py
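The three inference modes differ only in what is fed back into the dialogue history. The sketch below illustrates that difference under stated assumptions: `run_multiturn` and `toy_model` are hypothetical, a dialogue is modelled as a list of (user message, reference reply) pairs, and in `last` mode the earlier rounds are assumed to be filled with the reference replies.

```python
from typing import Callable

Turn = tuple[str, str]  # (user message, ground-truth assistant reply)


def run_multiturn(
    turns: list[Turn],
    model: Callable[[list[str]], str],
    mode: str,
) -> list[str]:
    """Return the model outputs produced under the given mode."""
    if mode not in {"every", "last", "every_with_gt"}:
        raise ValueError(f"unknown mode: {mode}")

    history: list[str] = []
    outputs: list[str] = []
    for i, (user_msg, gt_reply) in enumerate(turns):
        history.append(user_msg)
        is_last = i == len(turns) - 1

        if mode == "last" and not is_last:
            # Earlier rounds are not inferred; the reference reply is
            # used as context (an assumption of this sketch).
            history.append(gt_reply)
            continue

        reply = model(history)
        outputs.append(reply)
        # "every" feeds the model's own reply forward;
        # "every_with_gt" feeds the ground truth forward instead.
        history.append(gt_reply if mode == "every_with_gt" else reply)
    return outputs


def toy_model(history: list[str]) -> str:
    """Toy model that echoes the context it received."""
    return "|".join(history)


dialogue = [("q1", "a1"), ("q2", "a2")]
out_every = run_multiturn(dialogue, toy_model, "every")
out_last = run_multiturn(dialogue, toy_model, "last")
out_gt = run_multiturn(dialogue, toy_model, "every_with_gt")
```

Because the toy model echoes its context, the second-round output makes the difference visible: under `every` it contains the model's first reply, while under `every_with_gt` it contains `a1`.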
3. PPLInferencer (Perplexity Inferencer)
Function: Inferencer for perplexity-based evaluation. It selects an answer by calculating the perplexity of each option and is mainly used for multiple choice question (MCQ) tasks.
Features:
Only supports API models (does not support local models)
Does not support streaming inference
Does not support performance evaluation mode
Calculates the perplexity of each candidate answer and selects the option with the lowest perplexity as the prediction
Use Cases:
Multiple choice question (MCQ) tasks
Classification tasks requiring selection based on perplexity
Implementation File: ppl_inferencer.py
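The selection rule can be illustrated with the standard definition of perplexity, the exponential of the negative mean token log-probability. The helper names `perplexity` and `pick_lowest_ppl` and the sample log-probabilities are hypothetical; how AISBench obtains per-token log-probabilities from the model service is not shown here.

```python
import math


def perplexity(token_logprobs: list[float]) -> float:
    """Perplexity = exp(-mean(log p(token))) over the scored tokens."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def pick_lowest_ppl(option_logprobs: dict[str, list[float]]) -> str:
    """Return the option label whose text has the lowest perplexity."""
    scores = {label: perplexity(lp) for label, lp in option_logprobs.items()}
    return min(scores, key=scores.get)


# Made-up natural-log probabilities for three candidate answers.
options = {
    "A": [-2.0, -1.0],
    "B": [-0.5, -0.5],
    "C": [-3.0],
}
prediction = pick_lowest_ppl(options)
```

Averaging over tokens before exponentiating keeps options of different lengths comparable, which is why perplexity rather than the raw summed log-probability is used for ranking.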
4. BFCLV3FunctionCallInferencer (Function Call Inferencer)
Function: Inferencer for function call tasks, supporting scenarios where models call external functions or tools.
Features:
Only supports API models
Supports multi-turn function calls
Supports holdout function mechanism
Supports result processing and feedback for function calls
Use Cases:
Function call tasks
Tool usage tasks
Tasks requiring models to call external APIs
Implementation File: icl_bfcl_v3_inferencer.py
Inferencer Selection Guide
Select the appropriate inferencer according to the task type and model type:
| Task Type | Model Type | Recommended Inferencer |
|---|---|---|
| Text generation, Question answering | API models | GenInferencer |
| Text generation, Question answering | Local models | GenInferencer |
| Multi-turn dialogue | API models | MultiTurnGenInferencer |
| Multi-turn dialogue | Local models | MultiTurnGenInferencer |
| Multiple choice questions (MCQ) | API models | PPLInferencer |
| Function calls | API models | BFCLV3FunctionCallInferencer |
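The guide above can also be written as a lookup, which makes the unsupported combinations explicit. This is only a sketch: the short keys (`"generation"`, `"multi_turn"`, `"mcq"`, `"function_call"`, `"api"`, `"local"`) and the `recommend_inferencer` helper are invented for illustration and are not AISBench configuration values.

```python
# (task type, model type) -> recommended inferencer class name
SELECTION_GUIDE: dict[tuple[str, str], str] = {
    ("generation", "api"): "GenInferencer",
    ("generation", "local"): "GenInferencer",
    ("multi_turn", "api"): "MultiTurnGenInferencer",
    ("multi_turn", "local"): "MultiTurnGenInferencer",
    ("mcq", "api"): "PPLInferencer",
    ("function_call", "api"): "BFCLV3FunctionCallInferencer",
}


def recommend_inferencer(task_type: str, model_type: str) -> str:
    """Look up the recommended inferencer, rejecting unlisted combinations."""
    try:
        return SELECTION_GUIDE[(task_type, model_type)]
    except KeyError:
        raise ValueError(
            f"no recommended inferencer for {task_type!r} with {model_type!r} models"
        ) from None


choice = recommend_inferencer("multi_turn", "local")
```

Combinations absent from the table, such as MCQ or function calls with local models, raise an error instead of silently falling back to another inferencer.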
Further Reading
Supporting New Inferencers: Learn how to implement custom inferencers
Prompt Template: Learn about prompt template definitions
Meta Template: Learn about model meta template definitions
Retriever: Learn how retrievers work