Deployment Policy

A deployment policy defines the enforcement rules for Substrate evaluation. It uses a two-layer design: an Ananke-signed base policy that sets minimum safety bounds, and an optional operator override layer that can only tighten those bounds. An optional adaptive escalation block controls actor-first retry behavior.

Schema Overview

{
  "schemaVersion": 1,
  "version": 1,
  "base": {
    "payload": {
      "gammaFloorMin": 0.15,
      "permittedModes": ["state_gate", "state_plus_action_gate"],
      "metricStalenessMaxMs": 60000,
      "requireMetricSignature": false,
      "failBehavior": "fail_closed"
    },
    "signature": "<base64url-rsa-pss-signature>"
  },
  "overrides": {
    "gammaFloor": 0.2,
    "mode": "state_gate"
  },
  "hitl": null,
  "adaptiveEscalation": null
}

Top-Level Fields

Field	Type	Description
`schemaVersion`	`u32`	Must be `1`
`version`	`u32`	Operator-incremented revision counter. Echoed in every `EvaluationResponse.policyVersion`. HITL tokens are bound to this version.
`base`	`SignedBasePolicy`	Ananke-signed minimum safety bounds
`overrides`	`OperatorOverrides?`	Operator-provided overrides that can only tighten the base
`hitl`	`HitlConfig?`	HITL override authority configuration
`adaptiveEscalation`	`AdaptiveEscalationConfig?`	Adaptive retry and escalation policy (opt-in)

Signed Base Policy

The base policy is signed by Ananke Labs using RSA-PSS. It establishes minimum safety bounds that operators cannot weaken.

{
  "payload": {
    "gammaFloorMin": 0.15,
    "permittedModes": ["state_gate", "state_plus_action_gate"],
    "metricStalenessMaxMs": 60000,
    "requireMetricSignature": false,
    "failBehavior": "fail_closed"
  },
  "signature": "<base64url-rsa-pss-sha256>"
}

Field	Type	Description
`gammaFloorMin`	`f64`	Minimum gamma floor — operators can raise this but not lower it
`permittedModes`	`EnforcementMode[]`	Modes the operator is allowed to select
`metricStalenessMaxMs`	`u64`	Maximum age of metric snapshots in milliseconds
`requireMetricSignature`	`bool`	Whether HMAC signature verification is required
`failBehavior`	`string`	`fail_closed` or `fail_open`

Fail Behavior

Value	Meaning
`fail_closed`	Missing or stale metrics cause `REJECT_STALE_METRICS`
`fail_open`	Missing or stale metrics are permitted (evaluation proceeds)
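As a rough illustration of the table above, the following Python sketch maps a metric snapshot's age to an outcome. The function name and the use of `None` for a missing snapshot are hypothetical, and treating "stale" as strictly older than `metricStalenessMaxMs` is an assumption; only the two `failBehavior` values and `REJECT_STALE_METRICS` come from this page.

```python
def check_metric_freshness(age_ms, max_age_ms, fail_behavior):
    """Return a rejection code, or None when evaluation may proceed.

    age_ms is None when the metric snapshot is missing entirely
    (a convention chosen for this sketch).
    """
    if fail_behavior not in ("fail_closed", "fail_open"):
        raise ValueError(f"unknown failBehavior: {fail_behavior!r}")
    # Assumption: a snapshot is stale once it is older than the maximum age.
    is_stale = age_ms is None or age_ms > max_age_ms
    if is_stale and fail_behavior == "fail_closed":
        return "REJECT_STALE_METRICS"
    # fail_open, or fresh metrics: evaluation proceeds.
    return None
```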

Operator Overrides

Operators can tighten the base policy without Ananke’s involvement. All overrides are optional — only specified fields take effect.

{
  "gammaFloor": 0.2,
  "mode": "state_gate",
  "metricStalenessMaxMs": 30000,
  "failBehavior": "fail_closed"
}

Field	Type	Constraint
`gammaFloor`	`f64?`	Must be ≥ `gammaFloorMin` from base policy
`mode`	`string?`	Must be in `permittedModes` from base policy
`metricStalenessMaxMs`	`u64?`	Must be ≤ base `metricStalenessMaxMs`
`failBehavior`	`string?`	Can tighten `fail_open` → `fail_closed`, but not the reverse

Enforcement Modes

Mode	Behavior
`observe`	Evaluate but never reject — all decisions are `PASS`
`state_gate`	Reject if $\gamma$ falls below the floor
`state_plus_action_gate`	State gate + action preview gating

Policy Resolution

The runtime resolves the effective policy by combining base and overrides:

1. Start with base policy values
2. Apply each override field if present and valid (tightening only)
3. Reject the policy load if any override attempts to weaken the base

The resolved values are used for all evaluations. The resolution happens once at policy load time, not per-request.
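The resolution steps can be sketched in Python as follows. This is a minimal illustration, not the runtime's implementation: `resolve_policy` and `PolicyLoadError` are hypothetical names, and leaving `mode` unset when no override is given is an assumption, since no default mode is documented here.

```python
class PolicyLoadError(ValueError):
    """Raised when an override would weaken the base policy."""


def resolve_policy(base, overrides=None):
    """Combine a base payload with optional overrides (tightening only)."""
    overrides = overrides or {}
    # Start with base policy values.
    resolved = {
        "gammaFloor": base["gammaFloorMin"],
        "mode": None,  # assumption: no default mode is documented
        "metricStalenessMaxMs": base["metricStalenessMaxMs"],
        "failBehavior": base["failBehavior"],
    }
    # Apply each override that is present; reject any that weakens the base.
    if "gammaFloor" in overrides:
        if overrides["gammaFloor"] < base["gammaFloorMin"]:
            raise PolicyLoadError("gammaFloor is below gammaFloorMin")
        resolved["gammaFloor"] = overrides["gammaFloor"]
    if "mode" in overrides:
        if overrides["mode"] not in base["permittedModes"]:
            raise PolicyLoadError("mode is not in permittedModes")
        resolved["mode"] = overrides["mode"]
    if "metricStalenessMaxMs" in overrides:
        if overrides["metricStalenessMaxMs"] > base["metricStalenessMaxMs"]:
            raise PolicyLoadError("metricStalenessMaxMs exceeds the base value")
        resolved["metricStalenessMaxMs"] = overrides["metricStalenessMaxMs"]
    if "failBehavior" in overrides:
        weakens = (base["failBehavior"] == "fail_closed"
                   and overrides["failBehavior"] == "fail_open")
        if weakens:
            raise PolicyLoadError("failBehavior cannot go from fail_closed to fail_open")
        resolved["failBehavior"] = overrides["failBehavior"]
    return resolved
```

Because resolution happens once at load time, a caller would typically run something like this when the policy file is read and keep the returned values for all later evaluations.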

HITL Configuration

The `hitl` block configures Human-in-the-Loop (HITL) override token verification. When it is absent or `null`, all HITL override tokens are rejected with `HitlNotConfigured`.

{
  "hitl": {
    "maxTokenTtlMs": 600000,
    "authorities": [
      {
        "keyId": "operator-1",
        "operatorId": "alice",
        "publicKeyPem": "-----BEGIN PUBLIC KEY-----\n...\n-----END PUBLIC KEY-----\n"
      }
    ]
  }
}

Field	Type	Description
`maxTokenTtlMs`	`u64`	Maximum token time-to-live in milliseconds. Tokens with `expiresAt - issuedAt > maxTokenTtlMs` are rejected.
`authorities`	`OverrideAuthority[]`	List of trusted key/operator bindings. Must be non-empty.

Override Authority

Field	Type	Description
`keyId`	`string`	Identifier used to look up the public key from the token envelope. Must be unique.
`operatorId`	`string`	Operator whose signature this key represents. Must match `payload.operatorId` in the token.
`publicKeyPem`	`string`	RSA public key in SubjectPublicKeyInfo (SPKI) PEM format (`-----BEGIN PUBLIC KEY-----`), used for RSA-PSS signature verification.

Validation Constraints

Enforced at policy load time:

`maxTokenTtlMs` must be greater than zero
`authorities` must be non-empty
All `keyId` values must be unique
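A minimal Python sketch of these load-time checks, plus the token TTL rule from the field table, might look like this. The function names are hypothetical, and the sketch assumes `issuedAt` and `expiresAt` are expressed in milliseconds like `maxTokenTtlMs`.

```python
def validate_hitl(hitl):
    """Return a list of constraint violations; an empty list means valid."""
    errors = []
    if hitl.get("maxTokenTtlMs", 0) <= 0:
        errors.append("maxTokenTtlMs must be greater than zero")
    authorities = hitl.get("authorities") or []
    if not authorities:
        errors.append("authorities must be non-empty")
    key_ids = [authority["keyId"] for authority in authorities]
    if len(key_ids) != len(set(key_ids)):
        errors.append("keyId values must be unique")
    return errors


def token_ttl_ok(issued_at_ms, expires_at_ms, max_token_ttl_ms):
    """Tokens with expiresAt - issuedAt > maxTokenTtlMs are rejected."""
    # Assumption: both timestamps are in milliseconds.
    return expires_at_ms - issued_at_ms <= max_token_ttl_ms
```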

Adaptive Escalation Configuration

The `adaptiveEscalation` block controls actor-first retry behavior and escalation routing. When it is absent, `null`, or has `enabled: false`, existing behavior is unchanged: evaluations still produce escalation directives, but no retry tracking, novelty scoring, or adaptive routing is applied.

When `enabled: true`, the runtime tracks retry attempts per actor/intent, scores novelty against recent history, applies budget costs, and routes escalation decisions through a monotonic merge that never downgrades from `HUMAN_ESCALATION`.

{
  "adaptiveEscalation": {
    "enabled": true,
    "rejectStateMaxReformulations": 3,
    "rejectActionMaxReformulations": 2,
    "attemptWindowSize": 5,
    "immediateHuman": {
      "gammaHeadroomLte": -0.15,
      "stepsToBreachLte": 1.0,
      "criticalityGte": 0.95
    },
    "novelty": {
      "minScore": 0.25,
      "veryLowScore": 0.10,
      "lowScoreBudgetCost": 1.5,
      "veryLowScoreBudgetCost": 2.0,
      "repeatFingerprintLimit": 2
    },
    "stall": {
      "minHeadroomImprovement": 0.03,
      "maxFlatAttempts": 2,
      "maxIntentAgeMs": 90000
    },
    "operatorLoad": {
      "dedupeByIntent": true,
      "maxPendingPerActor": 1,
      "cooldownAfterDenyMs": 300000,
      "requireMaterialChangeAfterDeny": true
    }
  }
}
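The monotonic merge mentioned before the example configuration can be pictured as taking the more severe of two directives. The Python sketch below is only an illustration of that idea: the severity ordering (no directive, then `REFORMULATE`, then `HUMAN_ESCALATION`) and the function name are assumptions, and the real runtime may track more directives than the two named on this page.

```python
# Assumed severity ordering; only the two named directives appear on this page.
SEVERITY = {None: 0, "REFORMULATE": 1, "HUMAN_ESCALATION": 2}


def merge_escalation(current, proposed):
    """Keep whichever directive is more severe, so a merge never downgrades."""
    if SEVERITY[current] >= SEVERITY[proposed]:
        return current
    return proposed
```

With this shape, once any contributing rule yields `HUMAN_ESCALATION`, later rules cannot lower the result.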

Top-Level Adaptive Fields

Field	Type	Description
`enabled`	`bool`	Whether adaptive escalation is active. When `false`, the block is ignored.
`rejectStateMaxReformulations`	`u32`	Maximum reformulation attempts before escalation for `REJECT_STATE`. Must be > 0 when `enabled`. Defaults to `0` when omitted (for disabled configs).
`rejectActionMaxReformulations`	`u32`	Maximum reformulation attempts before escalation for `REJECT_ACTION`. Must be > 0 when `enabled`. Defaults to `0` when omitted (for disabled configs).
`attemptWindowSize`	`u32`	Number of recent attempts to retain for novelty comparison. Must be >= 1 when `enabled`. Defaults to `0` when omitted (for disabled configs).
`immediateHuman`	`ImmediateHumanThresholds?`	Thresholds that trigger immediate human escalation regardless of retry budget.
`novelty`	`NoveltyPolicy?`	Novelty scoring and accelerated budget consumption settings.
`stall`	`StallPolicy?`	Stall and no-progress detection settings.
`operatorLoad`	`OperatorLoadPolicy?`	Operator-load protection settings.

Immediate Human Thresholds

These thresholds bypass the retry budget entirely: when any condition is met, the evaluation escalates to `HUMAN_ESCALATION` regardless of remaining budget.

Field	Type	Description
`gammaHeadroomLte`	`f64?`	Escalate immediately when gamma headroom is at or below this value.
`stepsToBreachLte`	`f64?`	Escalate immediately when steps-to-breach is at or below this value.
`criticalityGte`	`f64?`	Escalate immediately when criticality is at or above this value.

Tuning guidance: Start with `gammaHeadroomLte: -0.15` (the actor is already 0.15 below the floor), `stepsToBreachLte: 1.0` (one step from breach), and `criticalityGte: 0.95` (near-maximum criticality). If too many routine rejections skip the retry budget and go directly to operators, make the thresholds stricter: lower `gammaHeadroomLte` or `stepsToBreachLte`, or raise `criticalityGte`.
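The "any condition" rule can be sketched in Python as below. The function name and argument names are hypothetical; the comparison directions follow the field names (`Lte` as at or below, `Gte` as at or above), and an omitted threshold is treated as never triggering, which is an assumption.

```python
def needs_immediate_human(thresholds, gamma_headroom, steps_to_breach, criticality):
    """Return True when any configured threshold is met."""
    t = thresholds or {}
    checks = [
        (t.get("gammaHeadroomLte"), lambda limit: gamma_headroom <= limit),
        (t.get("stepsToBreachLte"), lambda limit: steps_to_breach <= limit),
        (t.get("criticalityGte"), lambda limit: criticality >= limit),
    ]
    # Assumption: a threshold that is omitted (None) never triggers.
    return any(limit is not None and met(limit) for limit, met in checks)
```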

Novelty Policy

Novelty scoring compares each retry attempt against recent history using a weighted similarity model (40% strategy, 30% action, 20% mapped effect, 10% target). Low-novelty retries consume more budget, discouraging repetitive submissions.

Field	Type	Constraint
`minScore`	`f64`	Minimum novelty score for a weak reformulation. Must be in [0.0, 1.0].
`veryLowScore`	`f64`	Threshold for near-duplicate detection. Must be in [0.0, 1.0] and ≤ `minScore`.
`lowScoreBudgetCost`	`f64`	Budget cost multiplier for low-novelty retries. Must be >= 1.0. Max 3 decimal places.
`veryLowScoreBudgetCost`	`f64`	Budget cost multiplier for very-low-novelty retries. Must be >= 1.0. Max 3 decimal places.
`repeatFingerprintLimit`	`u32`	Maximum identical failure fingerprints before escalation.

Tuning guidance: A `minScore` of 0.25 and a `veryLowScore` of 0.10 work well for most domains. Raise `lowScoreBudgetCost` (default 1.5) and `veryLowScoreBudgetCost` (default 2.0) to penalize low-effort retries more aggressively. Set `repeatFingerprintLimit` to 2 to escalate after two identical failures regardless of remaining budget.
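To make the weighting and budget tiers concrete, here is a hedged Python sketch. It assumes novelty is one minus the weighted similarity against a single prior attempt, that a normal retry costs 1.0 budget units, and that the tier boundaries are strict ("below"); the component key names and function names are hypothetical, and how scores are aggregated across the attempt window is not specified on this page.

```python
# Weights from the description: strategy, action, mapped effect, target.
WEIGHTS = {"strategy": 0.4, "action": 0.3, "mappedEffect": 0.2, "target": 0.1}


def novelty_score(similarities):
    """similarities maps each component to a similarity in [0.0, 1.0]."""
    weighted = sum(WEIGHTS[name] * similarities[name] for name in WEIGHTS)
    # Assumption: novelty is the complement of weighted similarity.
    return 1.0 - weighted


def budget_cost(score, novelty):
    """Return the budget cost multiplier for a retry with this novelty score."""
    if score < novelty["veryLowScore"]:
        return novelty["veryLowScoreBudgetCost"]
    if score < novelty["minScore"]:
        return novelty["lowScoreBudgetCost"]
    return 1.0  # assumption: a sufficiently novel retry costs one unit
```

For example, a retry that reuses the same strategy and action but changes the mapped effect and target would score about 0.3 under these assumptions, which is above `minScore: 0.25` and so costs the normal amount.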

Stall Policy

Stall detection identifies actors making no meaningful progress — either headroom is not improving between attempts, or the intent has been open too long.

Field	Type	Constraint
`minHeadroomImprovement`	`f64`	Minimum headroom improvement required between attempts.
`maxFlatAttempts`	`u32`	Maximum consecutive attempts with no meaningful improvement.
`maxIntentAgeMs`	`u64`	Maximum age of an intent before forced escalation. Must be > 0.

Tuning guidance: `minHeadroomImprovement: 0.03` with `maxFlatAttempts: 2` escalates after two consecutive retries that each fail to improve headroom by at least 0.03. Set `maxIntentAgeMs` to a reasonable session timeout (e.g., `90000` for 90 seconds) to prevent indefinite retry loops.
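The two stall conditions can be sketched in Python as follows. This is an illustration under assumptions: the function name is hypothetical, headroom history is passed as a plain list (oldest first), "flat" means the improvement over the previous attempt is below `minHeadroomImprovement`, and the intent is considered too old once its age exceeds `maxIntentAgeMs`.

```python
def is_stalled(headrooms, intent_age_ms, stall):
    """Return True when the intent is too old or recent attempts are flat.

    headrooms lists the gamma headroom of each attempt, oldest first.
    """
    if intent_age_ms > stall["maxIntentAgeMs"]:
        return True
    # Count the run of consecutive flat attempts ending at the latest one.
    flat = 0
    for previous, current in zip(headrooms, headrooms[1:]):
        if current - previous < stall["minHeadroomImprovement"]:
            flat += 1
        else:
            flat = 0  # meaningful progress resets the run
    return flat >= stall["maxFlatAttempts"]
```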

Operator Load Policy

Operator load settings are enforced durably by the HITL coordinator (not in-process). They protect operators from duplicate and repetitive escalations. These settings only take effect when `adaptiveEscalation.enabled = true`.

Field	Type	Description
`dedupeByIntent`	`bool`	Deduplicate pending escalations by `(actor, intent, fingerprint)`. The coordinator coalesces to the existing pending request and returns `409 CONFLICT`.
`maxPendingPerActor`	`u32`	Maximum concurrent pending human escalations per actor. Validated at policy load time; enforcement reserved for a future coordinator release.
`cooldownAfterDenyMs`	`u64`	Cooldown period after an operator denial. Re-submissions within this window are rejected with `409 CONFLICT`. Must be > 0.
`requireMaterialChangeAfterDeny`	`bool`	Require a different `failureFingerprint` after denial. Identical re-submissions are rejected with `409 CONFLICT`.

Tuning guidance: Enable `dedupeByIntent: true` to prevent the same failing intent from generating multiple pending requests. Set `cooldownAfterDenyMs` to `300000` (5 minutes) to give operators breathing room after a denial. Enable `requireMaterialChangeAfterDeny` to ensure actors actually change their approach before re-escalating. The coordinator uses null-safe composite matching, so legacy submissions without `intentId` or `failureFingerprint` still participate in dedupe via empty-string coalescing.
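The gating rules above can be approximated with the Python sketch below. It is not the coordinator's implementation: the function names, the `actorId` and `deniedAtMs` field names, the in-memory `pending` list, and the `"ACCEPTED"` return value are all hypothetical. Only the `409 CONFLICT` outcome, the three settings, and the empty-string coalescing come from this page, and `maxPendingPerActor` is left out because its enforcement is reserved for a future release.

```python
def composite_key(submission):
    """Build the (actor, intent, fingerprint) dedupe key.

    A missing intentId or failureFingerprint coalesces to "" so that
    legacy submissions still participate in dedupe.
    """
    return (
        submission["actorId"],  # hypothetical field name
        submission.get("intentId") or "",
        submission.get("failureFingerprint") or "",
    )


def gate_submission(submission, pending, last_denial, policy, now_ms):
    """Return "409 CONFLICT" or "ACCEPTED" for a human-escalation request."""
    if policy["dedupeByIntent"]:
        key = composite_key(submission)
        if any(composite_key(existing) == key for existing in pending):
            return "409 CONFLICT"
    if last_denial is not None:
        # Re-submissions inside the cooldown window are rejected.
        if now_ms - last_denial["deniedAtMs"] < policy["cooldownAfterDenyMs"]:
            return "409 CONFLICT"
        unchanged = (submission.get("failureFingerprint") or "") == (
            last_denial.get("failureFingerprint") or "")
        if policy["requireMaterialChangeAfterDeny"] and unchanged:
            return "409 CONFLICT"
    return "ACCEPTED"
```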

Adaptive Escalation Validation

Enforced at policy load time when `enabled` is `true`:

`rejectStateMaxReformulations` and `rejectActionMaxReformulations` must be > 0
`attemptWindowSize` must be >= 1
Novelty thresholds must be in [0.0, 1.0]
`veryLowScore` must be <= `minScore`
Budget costs must be >= 1.0 with at most 3 decimal places
`maxIntentAgeMs` must be > 0 when stall policy is configured
`cooldownAfterDenyMs` must be > 0 when operator load policy is configured
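For readers who want to pre-check a config before running the CLI, the constraints above could be approximated in Python like this. The function names are hypothetical, and the decimal-places check inspects the shortest decimal representation of the parsed number rather than the raw JSON text, which may differ from how the runtime checks it.

```python
from decimal import Decimal


def at_most_3_decimals(value):
    """True when the number's shortest decimal form has <= 3 decimal places."""
    return Decimal(str(value)).as_tuple().exponent >= -3


def validate_adaptive(cfg):
    """Return a list of violations; an empty list means the block is valid."""
    if not cfg or not cfg.get("enabled", False):
        return []  # these checks apply only when enabled is true
    errors = []
    for name in ("rejectStateMaxReformulations", "rejectActionMaxReformulations"):
        if cfg.get(name, 0) <= 0:
            errors.append(f"{name} must be > 0")
    if cfg.get("attemptWindowSize", 0) < 1:
        errors.append("attemptWindowSize must be >= 1")
    novelty = cfg.get("novelty")
    if novelty is not None:
        for name in ("minScore", "veryLowScore"):
            if not 0.0 <= novelty[name] <= 1.0:
                errors.append(f"{name} must be in [0.0, 1.0]")
        if novelty["veryLowScore"] > novelty["minScore"]:
            errors.append("veryLowScore must be <= minScore")
        for name in ("lowScoreBudgetCost", "veryLowScoreBudgetCost"):
            cost = novelty[name]
            if cost < 1.0 or not at_most_3_decimals(cost):
                errors.append(f"{name} must be >= 1.0 with at most 3 decimal places")
    stall = cfg.get("stall")
    if stall is not None and stall["maxIntentAgeMs"] <= 0:
        errors.append("maxIntentAgeMs must be > 0")
    load = cfg.get("operatorLoad")
    if load is not None and load["cooldownAfterDenyMs"] <= 0:
        errors.append("cooldownAfterDenyMs must be > 0")
    return errors
```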

Request Fields

Evaluation requests accept two additional optional fields related to adaptive escalation:

Field	Type	Description
`intentId`	`string?`	Stable grouping key for retry accounting. Required when `adaptiveEscalation.enabled = true`; requests without it receive `MISSING_INTENT_ID`.
`strategyFingerprint`	`JSON?`	Actor-supplied opaque structure describing the strategic family of this attempt. Canonicalized and bounded to max 4 KiB serialized; exceeding the limit returns `STRATEGY_FINGERPRINT_TOO_LARGE`.

These fields are additive. When adaptive escalation is disabled, both are optional and old clients can safely omit them. When `adaptiveEscalation.enabled = true`, `intentId` is required and `strategyFingerprint` is used for novelty scoring.
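A client could pre-check these two fields before sending a request, along the lines of the Python sketch below. The canonical form used here (sorted keys, compact separators, UTF-8) is an assumption, since this page does not define the canonicalization; the function names are hypothetical, and only the 4 KiB bound and the two error codes are taken from the table.

```python
import json

MAX_FINGERPRINT_BYTES = 4 * 1024  # 4 KiB serialized


def canonicalize_fingerprint(fingerprint):
    """Serialize a fingerprint deterministically (assumed canonical form)."""
    return json.dumps(fingerprint, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)


def check_request_fields(request, adaptive_enabled):
    """Return an error code, or None when the fields look acceptable."""
    if adaptive_enabled and not request.get("intentId"):
        return "MISSING_INTENT_ID"
    fingerprint = request.get("strategyFingerprint")
    if fingerprint is not None:
        size = len(canonicalize_fingerprint(fingerprint).encode("utf-8"))
        if size > MAX_FINGERPRINT_BYTES:
            return "STRATEGY_FINGERPRINT_TOO_LARGE"
    return None
```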

Policy Validation

Use the CLI to validate a policy file before deployment:

# Validate structure and constraints
kairos policy validate policy.json

# Inspect resolved configuration
kairos policy inspect policy.json

`policy validate` checks:

JSON schema conformance
Base policy signature verification
Override tightening constraints
HITL configuration validity (if present)
Adaptive escalation constraints (if present and `enabled: true`)

`policy inspect` shows the resolved effective policy after applying overrides.

Complete Example

Baseline (No Adaptive)

{
  "schemaVersion": 1,
  "version": 1,
  "base": {
    "payload": {
      "gammaFloorMin": 0.15,
      "permittedModes": ["state_gate", "state_plus_action_gate"],
      "metricStalenessMaxMs": 60000,
      "requireMetricSignature": false,
      "failBehavior": "fail_closed"
    },
    "signature": "<base64url-rsa-pss-sha256>"
  },
  "overrides": {
    "gammaFloor": 0.2,
    "mode": "state_gate"
  },
  "hitl": {
    "maxTokenTtlMs": 600000,
    "authorities": [
      {
        "keyId": "operator-1",
        "operatorId": "alice",
        "publicKeyPem": "-----BEGIN PUBLIC KEY-----\n...\n-----END PUBLIC KEY-----\n"
      }
    ]
  }
}

This policy sets a base gamma floor minimum of 0.15 (Ananke-enforced), and the operator raises the floor to 0.2. Escalation follows session headroom rules: `REFORMULATE` when headroom is in [0.1, 0.5), `HUMAN_ESCALATION` when headroom drops below 0.1. Without adaptive escalation there is no retry budget: the session produces undifferentiated escalation directives with no tracking of prior attempts, and the coordinator accepts every `HUMAN_ESCALATION` submission without dedupe or cooldown gating.

Adaptive Enabled

{
  "schemaVersion": 1,
  "version": 1,
  "base": {
    "payload": {
      "gammaFloorMin": 0.15,
      "permittedModes": ["state_gate", "state_plus_action_gate"],
      "metricStalenessMaxMs": 60000,
      "requireMetricSignature": false,
      "failBehavior": "fail_closed"
    },
    "signature": "<base64url-rsa-pss-sha256>"
  },
  "overrides": {
    "gammaFloor": 0.2,
    "mode": "state_plus_action_gate"
  },
  "hitl": {
    "maxTokenTtlMs": 600000,
    "authorities": [
      {
        "keyId": "operator-1",
        "operatorId": "alice",
        "publicKeyPem": "-----BEGIN PUBLIC KEY-----\n...\n-----END PUBLIC KEY-----\n"
      }
    ]
  },
  "adaptiveEscalation": {
    "enabled": true,
    "rejectStateMaxReformulations": 3,
    "rejectActionMaxReformulations": 2,
    "attemptWindowSize": 5,
    "immediateHuman": {
      "gammaHeadroomLte": -0.15,
      "stepsToBreachLte": 1.0,
      "criticalityGte": 0.95
    },
    "novelty": {
      "minScore": 0.25,
      "veryLowScore": 0.10,
      "lowScoreBudgetCost": 1.5,
      "veryLowScoreBudgetCost": 2.0,
      "repeatFingerprintLimit": 2
    },
    "stall": {
      "minHeadroomImprovement": 0.03,
      "maxFlatAttempts": 2,
      "maxIntentAgeMs": 90000
    },
    "operatorLoad": {
      "dedupeByIntent": true,
      "maxPendingPerActor": 1,
      "cooldownAfterDenyMs": 300000,
      "requireMaterialChangeAfterDeny": true
    }
  }
}

This policy adds adaptive escalation to the baseline and switches the mode to `state_plus_action_gate`:

Actors get 3 reformulation attempts for state rejections, 2 for action rejections
Retries with novelty below 0.25 cost 1.5x budget; below 0.10 cost 2x budget
Immediate human escalation if gamma headroom is at or below -0.15, steps-to-breach is at or below 1.0, or criticality is at or above 0.95
Stall detection escalates after 2 flat attempts or 90 seconds of intent age
The coordinator deduplicates pending requests by (actor, intent, fingerprint) and enforces a 5-minute cooldown after denials