Calibration Artifact

A calibration artifact defines how domain-native metrics map to the simulation parameters $\lambda$ (growth pressure) and $\gamma$ (structural stability). The artifact is a JSON file loaded at runtime by the Substrate evaluation pipeline.

Schema Overview

{
  "schemaVersion": 1,
  "domain": "ai_safety",
  "adapterVersion": 1,
  "lambdaScaling": { "...scaling function..." },
  "gammaScaling": { "...scaling function..." },
  "secondaryProxies": [],
  "composite": null,
  "calibratedAt": "2026-03-16T15:44:42.085Z",
  "corpusScore": null
}

Field	Type	Description
`schemaVersion`	`u32`	Must be `1`
`domain`	`string`	Domain identifier (e.g., `ai_safety`, `robotics`) — must match the license
`adapterVersion`	`u32`	Echoed in every `EvaluationResponse.adapterVersion` for audit
`lambdaScaling`	`ScalingFunction`	Primary $\lambda$ mapping
`gammaScaling`	`ScalingFunction`	Primary $\gamma$ mapping
`secondaryProxies`	`SecondaryProxy[]`	Additional metrics contributing to $\lambda$ or $\gamma$
`composite`	`CompositeConfig?`	Optional aggregation for combining multiple scaled values
`calibratedAt`	`string`	ISO 8601 timestamp of artifact creation
`corpusScore`	`CorpusScore?`	Optional calibration corpus statistics

Scaling Functions

Each scaling function maps a single domain metric to a simulation parameter.

{
  "proxy": "capabilityIndex",
  "fn": "log",
  "inputRange": [1, 1000],
  "outputRange": [0.1, 0.95],
  "clamp": true,
  "sigmoidK": null
}

Field	Type	Description
`proxy`	`string`	Name of the domain metric in `snapshot.metrics`
`fn`	`string`	Scaling function type (see below)
`inputRange`	`[f64, f64]`	Domain-native input range `[min, max]`
`outputRange`	`[f64, f64]`	Simulation output range `[min, max]`
`clamp`	`bool`	If `true`, inputs outside `inputRange` are clamped
`sigmoidK`	`f64?`	Steepness parameter for sigmoid only (higher = sharper transition)

Function Types

Type	Behavior	Good for
`linear`	Linear interpolation between ranges	Metrics with uniform sensitivity
`log`	Logarithmic mapping — compresses high-end inputs	Metrics with diminishing returns (e.g., capability scores)
`sigmoid`	S-curve with configurable steepness ($k$)	Metrics with a critical transition zone
`inverse`	Inverse relationship: high input maps to low output	Metrics where a higher value means lower stability
`step`	Binary threshold at the midpoint of `inputRange`	Hard cutoff metrics

Clamping

When `clamp` is `true`, metric values outside `inputRange` are clamped to the nearest bound before scaling:

Value below `inputRange[0]` → scaled as `inputRange[0]`
Value above `inputRange[1]` → scaled as `inputRange[1]`

When `clamp` is `false`, out-of-range values may produce outputs outside `outputRange`.
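
The function types and the clamping rule can be sketched together in a few lines. The exact formulas are not given here, so the ones below are assumptions chosen to match the table's descriptions: each function produces a normalized position `t` that is then interpolated into `outputRange`; `log` uses `ln(value / min) / ln(max / min)` and therefore needs a positive lower bound; `sigmoid` applies `sigmoidK` to the input normalized around the midpoint; `step` treats the midpoint itself as "above". The name `scale` is hypothetical.

```python
import math


def scale(value, fn, input_range, output_range, clamp=True, sigmoid_k=None):
    """Map a domain metric to a simulation parameter (illustrative formulas)."""
    lo, hi = input_range
    out_lo, out_hi = output_range

    # Clamping: out-of-range inputs are scaled as the nearest bound.
    if clamp:
        value = min(max(value, lo), hi)

    x = (value - lo) / (hi - lo)  # normalized input position
    if fn == "linear":
        t = x
    elif fn == "log":
        # Assumption: plain logarithm, so both bound and value must be > 0.
        if lo <= 0 or value <= 0:
            raise ValueError("this log form needs a positive range and value")
        t = math.log(value / lo) / math.log(hi / lo)
    elif fn == "sigmoid":
        k = 1.0 if sigmoid_k is None else sigmoid_k
        t = 1.0 / (1.0 + math.exp(-k * (x - 0.5)))
    elif fn == "inverse":
        t = 1.0 - x
    elif fn == "step":
        t = 1.0 if x >= 0.5 else 0.0
    else:
        raise ValueError(f"unknown scaling function: {fn}")

    return out_lo + t * (out_hi - out_lo)


# capabilityIndex = 100 on the log example above: t = ln(100) / ln(1000) = 2/3
print(round(scale(100, "log", (1, 1000), (0.1, 0.95)), 4))    # 0.6667
# With clamping, 5000 is scaled as 1000 and lands on the top of outputRange.
print(round(scale(5000, "log", (1, 1000), (0.1, 0.95)), 4))   # 0.95
# Without clamping, an out-of-range input leaves outputRange.
print(round(scale(150, "linear", (0, 100), (0.05, 0.95), clamp=False), 4))  # 1.4
```

Under this particular `log` assumption, an `inputRange` that starts at `0` would be rejected; a pipeline that accepts such ranges would have to use a shifted form such as `ln(1 + value)`, which is not specified here.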

Secondary Proxies

Secondary proxies allow additional metrics to contribute to $\lambda$ or $\gamma$:

{
  "proxy": "autonomyLevel",
  "mapsTo": "lambda",
  "fn": "sigmoid",
  "inputRange": [0, 10],
  "outputRange": [0.05, 0.95],
  "clamp": true,
  "sigmoidK": 5
}

Field	Type	Description
`proxy`	`string`	Metric name
`mapsTo`	`"lambda"` or `"gamma"`	Which parameter this proxy contributes to
`fn`	`string`	Scaling function type
`inputRange`	`[f64, f64]`	Input range
`outputRange`	`[f64, f64]`	Output range
`clamp`	`bool`	Clamping behavior
`sigmoidK`	`f64?`	Sigmoid steepness (sigmoid only)

When secondary proxies are present, each proxy's scaled value is combined with the primary scaled value of the parameter named in its `mapsTo` field. If `composite` is `null`, the default aggregation is used.

Composite Aggregation

For advanced configurations, the `composite` block defines explicit aggregation strategies for combining multiple scaled values:

{
  "composite": {
    "lambda": {
      "aggregation": "weighted_mean",
      "components": [
        { "metric": "capabilityIndex", "weight": 0.7 },
        { "metric": "autonomyLevel", "weight": 0.3 }
      ]
    },
    "gamma": {
      "aggregation": "weighted_mean",
      "components": [
        { "metric": "alignmentScore", "weight": 0.5 },
        { "metric": "guardrailCoverage", "weight": 0.3 },
        { "metric": "humanOversightFreq", "weight": 0.2 }
      ]
    }
  }
}

Aggregation Types

Type	Formula
`weighted_mean`	$\frac{\sum_i w_i v_i}{\sum_i w_i}$
`weighted_max`	$\max_i (w_i \cdot v_i)$
`weighted_min`	$\min_i (w_i \cdot v_i)$
`product`	$\prod_i v_i^{w_i}$
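
The four formulas translate directly into code. The sketch below is only an illustration of the table: the function name `aggregate` and the `(weight, value)` pair layout are hypothetical, and the sample values `0.6` and `0.9` are made-up scaled values, not outputs of any real calibration.

```python
import math


def aggregate(kind, components):
    """Combine scaled values v_i with weights w_i, given as (w_i, v_i) pairs."""
    if not components:
        raise ValueError("at least one component is required")

    if kind == "weighted_mean":
        total_weight = sum(w for w, _ in components)
        return sum(w * v for w, v in components) / total_weight
    if kind == "weighted_max":
        return max(w * v for w, v in components)
    if kind == "weighted_min":
        return min(w * v for w, v in components)
    if kind == "product":
        return math.prod(v ** w for w, v in components)
    raise ValueError(f"unknown aggregation type: {kind}")


# Hypothetical scaled values for the lambda components shown above:
# capabilityIndex -> 0.6 (weight 0.7), autonomyLevel -> 0.9 (weight 0.3).
parts = [(0.7, 0.6), (0.3, 0.9)]
print(round(aggregate("weighted_mean", parts), 4))  # 0.69
print(round(aggregate("weighted_max", parts), 4))   # 0.42
```

Because `weighted_mean` divides by the sum of the weights, the weights do not have to add up to 1; `weighted_max`, `weighted_min` and `product` have no such normalization, so their results depend on the absolute weight values.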

Corpus Score

The optional `corpusScore` block records calibration quality metrics from the corpus used to generate the artifact:

{
  "corpusScore": {
    "meanAgentF1": 0.92,
    "meanAbsDelta": 0.03,
    "withinTolerancePct": 94.5,
    "corpusVersion": 1
  }
}

Field	Description
`meanAgentF1`	Average F1 score across agents in the calibration corpus
`meanAbsDelta`	Mean absolute delta between predicted and observed gamma
`withinTolerancePct`	Percentage of corpus evaluations within tolerance
`corpusVersion`	Version of the calibration corpus used

Complete Example

{
  "schemaVersion": 1,
  "domain": "ai_safety",
  "adapterVersion": 1,
  "lambdaScaling": {
    "proxy": "capabilityIndex",
    "fn": "log",
    "inputRange": [1, 1000],
    "outputRange": [0.1, 0.95],
    "clamp": true
  },
  "gammaScaling": {
    "proxy": "alignmentScore",
    "fn": "linear",
    "inputRange": [0, 100],
    "outputRange": [0.05, 0.95],
    "clamp": true
  },
  "secondaryProxies": [
    {
      "proxy": "autonomyLevel",
      "mapsTo": "lambda",
      "fn": "sigmoid",
      "inputRange": [0, 10],
      "outputRange": [0.05, 0.95],
      "clamp": true,
      "sigmoidK": 5
    },
    {
      "proxy": "humanOversightFreq",
      "mapsTo": "gamma",
      "fn": "log",
      "inputRange": [0, 100],
      "outputRange": [0.05, 0.95],
      "clamp": true
    },
    {
      "proxy": "guardrailCoverage",
      "mapsTo": "gamma",
      "fn": "linear",
      "inputRange": [0, 100],
      "outputRange": [0.05, 0.95],
      "clamp": true
    }
  ],
  "composite": null,
  "calibratedAt": "2026-03-16T15:44:42.085Z",
  "corpusScore": null
}

This artifact:

Maps `capabilityIndex` → $\lambda$ via logarithmic scaling (compresses high capability values)
Maps `alignmentScore` → $\gamma$ via linear scaling
Adds `autonomyLevel` as a sigmoid-scaled secondary contributor to $\lambda$
Adds `humanOversightFreq` and `guardrailCoverage` as secondary contributors to $\gamma$
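
The summary above can be reproduced mechanically by grouping proxies by the parameter they feed. This is a small sketch with a hypothetical helper name (`proxies_by_target`); `ARTIFACT` is a trimmed copy of the complete example that keeps only the fields the helper reads.

```python
# Trimmed copy of the complete example: only proxy names and targets.
ARTIFACT = {
    "lambdaScaling": {"proxy": "capabilityIndex", "fn": "log"},
    "gammaScaling": {"proxy": "alignmentScore", "fn": "linear"},
    "secondaryProxies": [
        {"proxy": "autonomyLevel", "mapsTo": "lambda", "fn": "sigmoid"},
        {"proxy": "humanOversightFreq", "mapsTo": "gamma", "fn": "log"},
        {"proxy": "guardrailCoverage", "mapsTo": "gamma", "fn": "linear"},
    ],
}


def proxies_by_target(artifact):
    """List the metrics contributing to each parameter, primary proxy first."""
    groups = {
        "lambda": [artifact["lambdaScaling"]["proxy"]],
        "gamma": [artifact["gammaScaling"]["proxy"]],
    }
    for proxy in artifact.get("secondaryProxies", []):
        target = proxy["mapsTo"]
        if target not in groups:
            raise ValueError(f"mapsTo must be 'lambda' or 'gamma', got {target!r}")
        groups[target].append(proxy["proxy"])
    return groups


for target, names in proxies_by_target(ARTIFACT).items():
    print(target, names)
# lambda ['capabilityIndex', 'autonomyLevel']
# gamma ['alignmentScore', 'humanOversightFreq', 'guardrailCoverage']
```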