First Evaluation
This walkthrough runs a single evaluation using the kairos evaluate command. You need a built CLI binary and an activated license.
Prepare the Inputs
An evaluation requires three files:
- Calibration artifact — defines how domain metrics map to simulation parameters
- Deployment policy — defines enforcement thresholds and mode
- Evaluation request — the actual metrics snapshot to evaluate
Calibration Artifact (artifact.json)
    {
      "schemaVersion": 1,
      "domain": "ai_safety",
      "adapterVersion": 1,
      "lambdaScaling": {
        "proxy": "capabilityIndex",
        "fn": "log",
        "inputRange": [1, 1000],
        "outputRange": [0.1, 0.95],
        "clamp": true
      },
      "gammaScaling": {
        "proxy": "alignmentScore",
        "fn": "linear",
        "inputRange": [0, 100],
        "outputRange": [0.05, 0.95],
        "clamp": true
      },
      "secondaryProxies": [],
      "composite": null,
      "calibratedAt": "2026-03-16T15:44:42.085Z",
      "corpusScore": null
    }

This artifact maps capabilityIndex → $\lambda$ via a log function, and alignmentScore → $\gamma$ via a linear function.
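To make the scaling fields concrete, here is a minimal Python sketch of how an entry with `fn`, `inputRange`, `outputRange`, and `clamp` could be applied. The interpolation formulas and the `scale` helper are assumptions for illustration. This page does not specify the exact math the kairos CLI uses, so values computed this way may differ from the CLI's output.

```python
import math


def scale(value, fn, input_range, output_range, clamp=True):
    """Map a raw proxy metric into the output range (hypothetical helper)."""
    lo, hi = input_range
    out_lo, out_hi = output_range
    if clamp:
        # Assumption: clamping restricts the raw value to the input range first.
        value = min(hi, max(lo, value))
    if fn == "log":
        # Assumption: interpolate on the logarithm of the value (needs lo > 0).
        t = (math.log(value) - math.log(lo)) / (math.log(hi) - math.log(lo))
    elif fn == "linear":
        t = (value - lo) / (hi - lo)
    else:
        raise ValueError(f"unsupported fn: {fn!r}")
    return out_lo + t * (out_hi - out_lo)


# Ranges taken from the artifact above.
lam = scale(450.0, "log", (1, 1000), (0.1, 0.95))
gam = scale(72.0, "linear", (0, 100), (0.05, 0.95))
print(f"lambda={lam:.3f} gamma={gam:.3f}")
```

Both results stay inside their `outputRange`, and inputs outside `inputRange` are pinned to the nearest endpoint when `clamp` is true.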
Deployment Policy (policy.json)
    {
      "schemaVersion": 1,
      "version": 1,
      "base": {
        "payload": {
          "gammaFloorMin": 0.15,
          "permittedModes": ["state_gate"],
          "metricStalenessMaxMs": 60000,
          "requireMetricSignature": false,
          "failBehavior": "fail_closed"
        },
        "signature": "<base64url-rsa-pss-signature>"
      },
      "overrides": {
        "gammaFloor": 0.2,
        "mode": "state_gate"
      }
    }

This policy sets a gamma floor of 0.2: any evaluation where $\gamma$ falls below 0.2 produces a REJECT_STATE decision.
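The floor check can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the CLI's implementation: the helper names are hypothetical, and the rule that an override may raise the floor but never drop it below `gammaFloorMin` is a guess based on the field names. The decision and reason codes are the ones shown on this page.

```python
def effective_gamma_floor(policy):
    """Return the floor to enforce (hypothetical helper)."""
    base_min = policy["base"]["payload"]["gammaFloorMin"]
    override = policy.get("overrides", {}).get("gammaFloor")
    # Assumption: an override can tighten the floor but not go below the base minimum.
    return base_min if override is None else max(base_min, override)


def decide(gamma, policy):
    """Reject when gamma falls strictly below the effective floor."""
    if gamma < effective_gamma_floor(policy):
        return {"decision": "REJECT_STATE", "reasonCode": "GAMMA_BELOW_FLOOR"}
    return {"decision": "PASS", "reasonCode": "NONE"}


# Only the fields this sketch reads from policy.json.
policy = {
    "base": {"payload": {"gammaFloorMin": 0.15}},
    "overrides": {"gammaFloor": 0.2},
}

print(decide(0.68, policy)["decision"])
print(decide(0.18, policy)["decision"])
```

With this policy, a gamma of 0.18 clears the base minimum of 0.15 but is still rejected, because the override floor of 0.2 is the one being enforced.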
Evaluation Request (request.json)
    {
      "envelopeVersion": 1,
      "requestId": "first-eval-001",
      "snapshot": {
        "timestamp": "2026-03-21T12:00:00.000Z",
        "metrics": {
          "capabilityIndex": 450.0,
          "alignmentScore": 72.0
        }
      }
    }

Run the Evaluation
    kairos evaluate \
      --config artifact.json \
      --policy policy.json \
      --request request.json

You can also pipe the request from stdin:

    cat request.json | kairos evaluate \
      --config artifact.json \
      --policy policy.json

Interpret the Response
A passing evaluation returns:

    {
      "envelopeVersion": 1,
      "requestId": "first-eval-001",
      "decision": "PASS",
      "reasonCode": "NONE",
      "mode": "state_gate",
      "policyVersion": 1,
      "adapterVersion": 1,
      "evaluation": {
        "currentGamma": 0.68,
        "gammaFloor": 0.20,
        "currentLambda": 0.89,
        "stability": 2.40,
        "predictedGamma": null
      },
      "escalation": null,
      "overrideOutcome": null,
      "timestamp": "2026-03-21T12:00:00.500Z"
    }

Key fields:
| Field | Meaning |
|---|---|
| decision | PASS: the action is permitted |
| reasonCode | NONE: no adverse condition detected |
| evaluation.currentGamma | The computed $\gamma$ value (0.68) |
| evaluation.gammaFloor | The policy threshold (0.20) |
| evaluation.stability | Normalized headroom: $(\gamma - \gamma_{\text{floor}}) / \gamma_{\text{floor}}$, here $(0.68 - 0.20) / 0.20 = 2.40$ |
| evaluation.currentLambda | The computed $\lambda$ value (0.89) |
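A short Python sketch shows how a caller might read these fields and sanity-check the headroom. The sample values are consistent with stability being (currentGamma - gammaFloor) / gammaFloor. That formula is inferred from the numbers in the sample response and is an assumption, as is the `headroom` helper name.

```python
import json

# The evaluation-related fields from the sample response above.
response_text = """
{
  "requestId": "first-eval-001",
  "decision": "PASS",
  "reasonCode": "NONE",
  "evaluation": {
    "currentGamma": 0.68,
    "gammaFloor": 0.20,
    "currentLambda": 0.89,
    "stability": 2.40,
    "predictedGamma": null
  }
}
"""


def headroom(gamma, floor):
    """Assumed definition: distance above the floor, in units of the floor."""
    return (gamma - floor) / floor


resp = json.loads(response_text)
ev = resp["evaluation"]
recomputed = headroom(ev["currentGamma"], ev["gammaFloor"])

print(resp["decision"], resp["reasonCode"])
print(f"reported stability={ev['stability']:.2f} recomputed={recomputed:.2f}")
```

Under this reading, a stability of 0 would mean $\gamma$ sits exactly on the floor, and a negative value would mean it has dropped below it.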
If alignmentScore were low enough to push $\gamma$ below the floor (0.2), the response would show:

    {
      "decision": "REJECT_STATE",
      "reasonCode": "GAMMA_BELOW_FLOOR",
      ...
    }

Next Steps
- Learn about Fly-by-Wire concepts for stateful evaluation with engine telemetry
- Understand the calibration artifact schema to tune your metric mappings
- Explore the full CLI reference for all available commands