Provider config
Agent setup and model options
Security note: API keys should only be sent to a trusted local/private runner. The static website stores config in this browser only and masks keys after saving.
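As an illustration of what masking a saved key could look like, here is a minimal sketch. The function name `mask_key` and the choice to leave the last four characters visible are assumptions made for this example, not the site's actual behavior.

```python
def mask_key(key: str, visible: int = 4) -> str:
    """Return a display-safe version of an API key.

    Hypothetical helper: keeps only the last `visible` characters readable
    and replaces everything before them with asterisks.
    """
    if len(key) <= visible:
        # Very short values are hidden entirely rather than shown in full.
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Masking this way only changes what is displayed. The stored value still needs the same care as any other secret kept in the browser.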
Benchmark runner
Select tests to run
Run output
Latest results
Benchmark run in progress…
Model suites can take a minute or two. Keep this tab open while the local runner scores extraction and QA checks.
No benchmark run yet.
Run archive
Benchmark history
Shows 20 runs per page and retains up to 100 local lab runs in total.
| Runtime | Mode | Duration |
|---|---|---|
| No history loaded yet. | | |
Page 1
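The retention and paging rules for the run history (20 runs per page, at most 100 runs kept) can be sketched as follows. This is a rough model under stated assumptions: the names `add_run`, `get_page`, and `page_count` are hypothetical, and newest-first ordering is assumed rather than confirmed by the page.

```python
from typing import List, TypeVar

T = TypeVar("T")

PAGE_SIZE = 20   # runs shown per page, as stated on the page
MAX_RUNS = 100   # total runs retained locally, as stated on the page


def add_run(history: List[T], run: T, max_runs: int = MAX_RUNS) -> List[T]:
    """Return a new history with `run` first and the oldest entries dropped.

    Assumption: history is ordered newest first.
    """
    return ([run] + history)[:max_runs]


def get_page(history: List[T], page: int, page_size: int = PAGE_SIZE) -> List[T]:
    """Return the runs on a 1-indexed page; an out-of-range page is empty."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * page_size
    return history[start:start + page_size]


def page_count(history: List[T], page_size: int = PAGE_SIZE) -> int:
    """Number of pages, with a minimum of 1 so an empty archive shows 'Page 1'."""
    return max(1, -(-len(history) // page_size))  # ceiling division
```

With these defaults a full archive spans five pages, and recording one more run pushes the oldest one out.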