Quick Start
User Interaction Flow
AccuracyEvaluator
Compare output against expected results.
Builder Methods
| Method | Signature | Description |
|---|---|---|
| `new()` | `fn new() -> AccuracyEvaluatorBuilder` | Create builder |
| `input(text)` | `fn input(impl Into<String>) -> Self` | Set input |
| `expected(text)` | `fn expected(impl Into<String>) -> Self` | Set expected output |
| `threshold(n)` | `fn threshold(f64) -> Self` | Pass threshold (0.0-1.0) |
| `build()` | `fn build(self) -> AccuracyEvaluator` | Build evaluator |
Evaluation
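The builder methods listed above can be exercised with a sketch like the following. The library's own types are not shown on this page, so the block defines stand-in `AccuracyEvaluatorBuilder` and `AccuracyEvaluator` types with the same method names. Placing `new()` on the builder, the 0.7 default threshold, the `score` and `passed` methods, and the word-overlap scoring are all assumptions made for illustration, not the library's behavior.

```rust
/// Stand-in evaluator (hypothetical); the real type lives in the library.
#[derive(Debug, Clone)]
pub struct AccuracyEvaluator {
    pub input: String,
    pub expected: String,
    pub threshold: f64,
}

/// Stand-in builder exposing the method names from the table.
#[derive(Debug, Clone)]
pub struct AccuracyEvaluatorBuilder {
    input: String,
    expected: String,
    threshold: f64,
}

impl AccuracyEvaluatorBuilder {
    pub fn new() -> Self {
        Self {
            input: String::new(),
            expected: String::new(),
            threshold: 0.7, // assumed default, not confirmed by the table
        }
    }

    pub fn input(mut self, text: impl Into<String>) -> Self {
        self.input = text.into();
        self
    }

    pub fn expected(mut self, text: impl Into<String>) -> Self {
        self.expected = text.into();
        self
    }

    pub fn threshold(mut self, n: f64) -> Self {
        // Keep the threshold inside the documented 0.0-1.0 range.
        self.threshold = n.clamp(0.0, 1.0);
        self
    }

    pub fn build(self) -> AccuracyEvaluator {
        AccuracyEvaluator {
            input: self.input,
            expected: self.expected,
            threshold: self.threshold,
        }
    }
}

impl AccuracyEvaluator {
    /// Assumed scoring: share of expected words that appear in `actual`.
    pub fn score(&self, actual: &str) -> f64 {
        let wanted: Vec<String> = self.expected.split_whitespace().map(str::to_lowercase).collect();
        if wanted.is_empty() {
            return 0.0;
        }
        let got: Vec<String> = actual.split_whitespace().map(str::to_lowercase).collect();
        let hits = wanted.iter().filter(|w| got.contains(*w)).count();
        hits as f64 / wanted.len() as f64
    }

    /// The evaluation passes when the score reaches the threshold.
    pub fn passed(&self, actual: &str) -> bool {
        self.score(actual) >= self.threshold
    }
}

fn main() {
    let evaluator = AccuracyEvaluatorBuilder::new()
        .input("What is the capital of France?")
        .expected("Paris")
        .threshold(0.8)
        .build();

    let actual = "The capital is Paris";
    println!("input: {}", evaluator.input);
    println!("score: {:.2}, passed: {}", evaluator.score(actual), evaluator.passed(actual));
}
```

Word overlap is only a placeholder here; a real evaluator may use a different similarity measure, so treat the scoring function as the part to swap out.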
CriteriaEvaluator
Evaluate against custom criteria with weighted scores.
Builder Methods
| Method | Signature | Description |
|---|---|---|
| `new()` | `fn new() -> CriteriaEvaluatorBuilder` | Create builder |
| `criterion(name)` | `fn criterion(impl Into<String>) -> Self` | Add criterion |
| `threshold(n)` | `fn threshold(f64) -> Self` | Pass threshold |
| `build()` | `fn build(self) -> CriteriaEvaluator` | Build evaluator |
Example
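A minimal sketch of the criteria builder, using stand-in types rather than the library's own. The table shows no weight parameter on `criterion(name)`, so this sketch assumes equal weights; the `score` and `passed` methods, the ratings map, and the 0.7 default threshold are likewise hypothetical.

```rust
use std::collections::HashMap;

/// Stand-in evaluator (hypothetical); mirrors the builder table only.
#[derive(Debug, Clone)]
pub struct CriteriaEvaluator {
    pub criteria: Vec<String>,
    pub threshold: f64,
}

#[derive(Debug, Clone)]
pub struct CriteriaEvaluatorBuilder {
    criteria: Vec<String>,
    threshold: f64,
}

impl CriteriaEvaluatorBuilder {
    pub fn new() -> Self {
        Self { criteria: Vec::new(), threshold: 0.7 } // assumed default
    }

    pub fn criterion(mut self, name: impl Into<String>) -> Self {
        self.criteria.push(name.into());
        self
    }

    pub fn threshold(mut self, n: f64) -> Self {
        self.threshold = n.clamp(0.0, 1.0);
        self
    }

    pub fn build(self) -> CriteriaEvaluator {
        CriteriaEvaluator { criteria: self.criteria, threshold: self.threshold }
    }
}

impl CriteriaEvaluator {
    /// Assumed scoring: equal-weight mean of per-criterion ratings.
    /// A criterion with no rating counts as 0.0.
    pub fn score(&self, ratings: &HashMap<String, f64>) -> f64 {
        if self.criteria.is_empty() {
            return 0.0;
        }
        let total: f64 = self
            .criteria
            .iter()
            .map(|c| ratings.get(c).copied().unwrap_or(0.0))
            .sum();
        total / self.criteria.len() as f64
    }

    pub fn passed(&self, ratings: &HashMap<String, f64>) -> bool {
        self.score(ratings) >= self.threshold
    }
}

fn main() {
    let evaluator = CriteriaEvaluatorBuilder::new()
        .criterion("helpfulness")
        .criterion("accuracy")
        .threshold(0.7)
        .build();

    // Ratings would normally come from a judge or a human reviewer.
    let mut ratings = HashMap::new();
    ratings.insert("helpfulness".to_string(), 1.0);
    ratings.insert("accuracy".to_string(), 0.5);

    println!("score: {:.2}, passed: {}", evaluator.score(&ratings), evaluator.passed(&ratings));
}
```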
PerformanceEvaluator
Measure execution performance.
Configuration
| Option | Type | Default | Description |
|---|---|---|---|
| `max_duration` | `Duration` | 30s | Maximum allowed time |
| `max_ttft` | `Option<Duration>` | `None` | Max time-to-first-token |
| `threshold` | `f64` | 0.7 | Pass threshold |
Example
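The configuration options can be modeled as a plain struct whose `Default` carries the documented values. The sketch below is a stand-in, not the library's type: the name `PerformanceConfig`, the linear scoring rule, and the `time_it` helper are assumptions for illustration.

```rust
use std::time::{Duration, Instant};

/// Stand-in for the evaluator's configuration (hypothetical type name).
#[derive(Debug, Clone)]
pub struct PerformanceConfig {
    pub max_duration: Duration,
    pub max_ttft: Option<Duration>,
    pub threshold: f64,
}

impl Default for PerformanceConfig {
    fn default() -> Self {
        // Defaults taken from the configuration table.
        Self {
            max_duration: Duration::from_secs(30),
            max_ttft: None,
            threshold: 0.7,
        }
    }
}

impl PerformanceConfig {
    /// Assumed scoring: 1.0 for an instant run, falling linearly
    /// to 0.0 once `max_duration` is reached.
    pub fn score(&self, elapsed: Duration) -> f64 {
        let ratio = elapsed.as_secs_f64() / self.max_duration.as_secs_f64();
        (1.0 - ratio).clamp(0.0, 1.0)
    }

    /// Passes when the run is within budget, the time-to-first-token limit
    /// (if both a limit and a measurement exist) is met, and the score
    /// reaches the threshold.
    pub fn passed(&self, elapsed: Duration, ttft: Option<Duration>) -> bool {
        let ttft_ok = match (self.max_ttft, ttft) {
            (Some(limit), Some(observed)) => observed <= limit,
            _ => true,
        };
        elapsed <= self.max_duration && ttft_ok && self.score(elapsed) >= self.threshold
    }
}

/// Runs `work` once and returns its result with the elapsed wall-clock time.
pub fn time_it<T>(work: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = work();
    (out, start.elapsed())
}

fn main() {
    let config = PerformanceConfig::default();
    let (sum, elapsed) = time_it(|| (1..=1_000u64).sum::<u64>());
    println!("sum = {sum}, elapsed = {elapsed:?}");
    println!("score: {:.3}, passed: {}", config.score(elapsed), config.passed(elapsed, None));
}
```

Under this scoring rule a run can stay inside `max_duration` and still fail: with the defaults, anything slower than 9 seconds scores below 0.7.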
Judge
LLM-based evaluation for complex judgments.
Configuration
| Option | Type | Default | Description |
|---|---|---|---|
| `model` | `String` | `"gpt-4o-mini"` | Model for judging |
| `temperature` | `f64` | 0.0 | LLM temperature |
| `system_prompt` | `Option<String>` | `None` | Custom system prompt |
Example
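A sketch of the judge configuration with the documented defaults. The struct name `JudgeConfig`, the `effective_system_prompt` helper, and the fallback prompt text are assumptions; the block makes no LLM call and only shows how the options could be set and overridden.

```rust
/// Stand-in for the judge's configuration (hypothetical type name).
#[derive(Debug, Clone, PartialEq)]
pub struct JudgeConfig {
    pub model: String,
    pub temperature: f64,
    pub system_prompt: Option<String>,
}

impl Default for JudgeConfig {
    fn default() -> Self {
        // Defaults taken from the configuration table.
        Self {
            model: "gpt-4o-mini".to_string(),
            temperature: 0.0,
            system_prompt: None,
        }
    }
}

impl JudgeConfig {
    /// Returns the custom system prompt if one is set, otherwise a
    /// fallback. The fallback wording is an assumption for this sketch.
    pub fn effective_system_prompt(&self) -> &str {
        self.system_prompt
            .as_deref()
            .unwrap_or("You are an impartial judge. Score the response from 0.0 to 1.0.")
    }
}

fn main() {
    // Use the defaults as-is.
    let default_judge = JudgeConfig::default();
    println!("{} @ temperature {}", default_judge.model, default_judge.temperature);

    // Override only the system prompt and keep the other defaults.
    let strict_judge = JudgeConfig {
        system_prompt: Some("Grade strictly and penalize unsupported claims.".to_string()),
        ..JudgeConfig::default()
    };
    println!("{}", strict_judge.effective_system_prompt());
}
```

A temperature of 0.0 keeps the judge's output as repeatable as the model allows, which matters when scores are compared across runs.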
Optimization Loop Pattern
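One way to read this pattern is as a generate, evaluate, refine loop that stops once a score reaches the threshold or an iteration cap is hit. The sketch below is generic and does not use the library's API: `optimize`, `Iteration`, and both closures are hypothetical names, and the toy generator and scorer in `main` stand in for an agent call and an evaluator.

```rust
/// One pass through the loop: what was produced and how it scored.
#[derive(Debug, Clone)]
pub struct Iteration {
    pub index: usize,
    pub output: String,
    pub score: f64,
}

/// Repeats generate -> evaluate until `threshold` is reached or
/// `max_iterations` passes have run. Returns the full history so
/// scores can be tracked across iterations.
pub fn optimize(
    max_iterations: usize,
    threshold: f64,
    mut generate: impl FnMut(Option<&Iteration>) -> String,
    evaluate: impl Fn(&str) -> f64,
) -> Vec<Iteration> {
    let mut history: Vec<Iteration> = Vec::new();
    for index in 0..max_iterations {
        // The generator sees the previous attempt so it can refine it.
        let output = generate(history.last());
        let score = evaluate(&output);
        history.push(Iteration { index, output, score });
        if score >= threshold {
            break;
        }
    }
    history
}

fn main() {
    // Toy generator: each attempt extends the previous output by one word.
    let generate = |prev: Option<&Iteration>| match prev {
        None => "rust".to_string(),
        Some(it) => format!("{} rust", it.output),
    };
    // Toy scorer: full marks once the output has three words.
    let evaluate = |out: &str| (out.split_whitespace().count() as f64 / 3.0).min(1.0);

    let history = optimize(5, 0.8, generate, evaluate);
    for it in &history {
        println!("iteration {}: score {:.2} ({})", it.index, it.score, it.output);
    }
}
```

Returning the whole history rather than only the final result keeps the per-iteration scores available for logging.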
Best Practices
Define clear evaluation criteria
Use specific, measurable criteria for consistent evaluation.
Set appropriate thresholds
Start with a threshold of 0.7-0.8 and adjust it to suit your use case.
Combine evaluators for comprehensive assessment
Use AccuracyEvaluator together with PerformanceEvaluator for a complete picture.
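As a rough illustration of combining evaluators, the sketch below treats each evaluator's outcome as a score with its own threshold and requires all of them to pass. `EvalResult` and `combine` are hypothetical names, and the all-must-pass rule is one possible policy, not something the library prescribes.

```rust
/// Outcome of a single evaluator run (hypothetical shape).
#[derive(Debug, Clone)]
pub struct EvalResult {
    pub evaluator: &'static str,
    pub score: f64,
    pub threshold: f64,
}

impl EvalResult {
    pub fn passed(&self) -> bool {
        self.score >= self.threshold
    }
}

/// Overall pass requires every evaluator to pass; otherwise the
/// names of the failing evaluators are returned.
pub fn combine(results: &[EvalResult]) -> Result<(), Vec<&'static str>> {
    let failed: Vec<&'static str> = results
        .iter()
        .filter(|r| !r.passed())
        .map(|r| r.evaluator)
        .collect();
    if failed.is_empty() {
        Ok(())
    } else {
        Err(failed)
    }
}

fn main() {
    let results = [
        EvalResult { evaluator: "accuracy", score: 0.9, threshold: 0.8 },
        EvalResult { evaluator: "performance", score: 0.5, threshold: 0.7 },
    ];
    match combine(&results) {
        Ok(()) => println!("all evaluators passed"),
        Err(failed) => println!("failed: {}", failed.join(", ")),
    }
}
```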
Log iteration history
Track scores across iterations to identify improvement patterns.

