Technical overview

How a verdict becomes provable

Chance is the verification harness for the agent economy — the independent judge between an agent's intent and its execution: PROPOSE → VERIFY → SETTLE. This page is the VERIFY step, and how its verdict is made provable — wrapped in four guarantees (integrity, attribution, permanence, and independent verifiability) without changing how the agent works. The agent runs; the harness records, signs, anchors, and exposes the proof.

Run → Sign → Anchor → Verify: how a verdict is made provable, each stage independently recomputable
The problem

Capability is here. Accountability isn't.

An agent is useful because it reasons on its own. It's risky for the same reason: when it's wrong, real funds move — and nothing today checks the order against what you actually asked for. AI agents are already taking consequential, autonomous actions — approving payments, moving treasury, and, in Chance's case, trading prediction markets with real capital. What's missing isn't capability. It's the ability to prove what an agent did after the fact:

  • Can anyone prove the exact inputs an agent saw, the steps it took, and the answer it produced, without trusting the company that ran it?
  • Can you prove the decision came from the code you think it did, not a swapped model or a tampered prompt?
  • Can a third party (an LP, an auditor, a counterparty) verify all of it independently, with no access to your servers?

Today the answer is a log file the operator can edit. That's not accountability; it's a promise. The harness replaces the promise with a proof.

Stage 1 — Run

The flight recorder

As the agent works — every web search, every source, every intermediate conclusion, the final decision — each event is appended to an append-only hash chain:

h₀ = keccak256("chance-harness:transcript:v1")        // fixed genesis
hᵢ = keccak256( hᵢ₋₁ ‖ keccak256(lineᵢ) )              // each line folds in the previous

Every line's hash depends on every line before it. The final value — the transcript root, a single 32-byte number — commits to the entire decision process. Change, delete, or reorder one byte and the root changes completely. It's a black-box flight recorder for AI decisions — except it's cryptographically sealed and anyone can open it.
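The recurrence above can be sketched in a few lines of Python. This is a minimal illustration, not the harness's own code: the name transcript_root, the UTF-8 encoding of each line, and the sample lines are assumptions. Python's standard library does not include Ethereum's keccak256, so the hash is passed in as a parameter and defaults to hashlib.sha3_256 purely as a stand-in (SHA3-256 pads its input differently, so its digests will not match keccak256 roots).

```python
import hashlib
from typing import Callable, Iterable

HashFn = Callable[[bytes], bytes]
GENESIS_TAG = b"chance-harness:transcript:v1"  # genesis string from the formula above


def sha3_standin(data: bytes) -> bytes:
    # Stand-in only: NIST SHA3-256 pads differently from Ethereum's keccak256,
    # so these digests will NOT match roots produced with keccak256.
    return hashlib.sha3_256(data).digest()


def transcript_root(lines: Iterable[str], h: HashFn = sha3_standin) -> bytes:
    """Fold each transcript line into a running 32-byte commitment."""
    acc = h(GENESIS_TAG)                        # h_0: fixed genesis
    for line in lines:
        acc = h(acc + h(line.encode("utf-8")))  # h_i = H(h_{i-1} || H(line_i))
    return acc


# Illustrative transcript lines (made up for this sketch).
lines = ["search: latest polls", "source: example.org/poll", "decision: BLOCK"]
root = transcript_root(lines)
print(root.hex())
```

Because each step hashes the previous accumulator together with the current line, reordering or dropping a line produces a different root even when every individual line is unchanged.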

Stage 2 — Sign

The attested judge key

The harness computes a digest that binds five things together:

digest = keccak256( abi.encode(
  chainId, registryAddress, runId, transcriptRoot, outputHash
))

A dedicated judge key signs this digest. One signature attests to the decision, the process that produced it, and the deployment it belongs to — and because chainId and registryAddress are inside the digest, it can never be replayed against another chain or contract.
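As a rough sketch of why the digest is chain- and contract-specific, the snippet below encodes the five fields the way abi.encode lays out static values (one 32-byte word each) and hashes the result. The field types (uint256, address, bytes32 ×3) are assumptions, since the text does not state them, the registry address is a placeholder, and hashlib.sha3_256 is again only a stand-in for keccak256.

```python
import hashlib


def sha3_standin(data: bytes) -> bytes:
    # Stand-in only: swap in a real keccak256 to reproduce onchain digests.
    return hashlib.sha3_256(data).digest()


def word_uint256(n: int) -> bytes:
    return n.to_bytes(32, "big")               # big-endian, zero-padded on the left


def word_address(addr: str) -> bytes:
    raw = bytes.fromhex(addr[2:] if addr.startswith("0x") else addr)
    if len(raw) != 20:
        raise ValueError("address must be 20 bytes")
    return raw.rjust(32, b"\x00")              # addresses are left-padded to 32 bytes


def word_bytes32(b: bytes) -> bytes:
    if len(b) != 32:
        raise ValueError("expected exactly 32 bytes")
    return b


def encode_record(chain_id, registry, run_id, transcript_root, output_hash) -> bytes:
    # abi.encode of five static values is five 32-byte words, in order.
    return b"".join([
        word_uint256(chain_id),
        word_address(registry),
        word_bytes32(run_id),
        word_bytes32(transcript_root),
        word_bytes32(output_hash),
    ])


def run_digest(chain_id, registry, run_id, transcript_root, output_hash,
               h=sha3_standin) -> bytes:
    return h(encode_record(chain_id, registry, run_id, transcript_root, output_hash))


REGISTRY = "0x" + "11" * 20                    # hypothetical registry address
run_id = bytes(31) + b"\x01"
root = sha3_standin(b"transcript")
out = sha3_standin(b"output")

on_base = run_digest(8453, REGISTRY, run_id, root, out)   # 8453 is Base mainnet
on_other = run_digest(1, REGISTRY, run_id, root, out)     # same record, chain id 1
print(on_base != on_other)
```

A signature over on_base recovers the judge only for that exact digest, so presenting it alongside a record for another chain id or registry address yields a different digest and a different recovered signer.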

Where the key lives — the TEE

The judge key is generated inside an Azure confidential-compute enclave (AMD SEV-SNP: hardware-isolated, memory-encrypted) and never leaves it. It exists in no database, no disk, no key vault; it lives only in the enclave's encrypted memory:

  1. The harness runs in a Confidential Container whose code is measured into a hardware-rooted policy hash.
  2. At boot it generates the secp256k1 judge key in-enclave, then gets an attestation token from Microsoft Azure Attestation (MAA), rooted in the AMD security processor, whose runtime data carries the judge address.
  3. The token therefore certifies “this exact code, in genuine SEV-SNP hardware, holds this key.” Modify the code and the policy hash changes, so the attestation no longer matches.

So the signature doesn't just say “someone at Chance signed this.” It proves this exact, unmodified harness code, on genuine secure hardware, produced this decision. The key cannot exist outside the attested code — not for an operator, not for an attacker who roots the host, not for Chance itself.
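The two conditions a verifier relies on here can be written down as a small check. This is only a sketch under assumptions: it presumes the attestation token's own signature has already been validated, and the field names policy_hash and judge_address are hypothetical placeholders rather than actual MAA claim names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttestationClaims:
    # Hypothetical, simplified view of an already-validated attestation token.
    policy_hash: str       # measurement of the code that booted in the enclave
    judge_address: str     # address carried in the token's runtime data


def attestation_binds_signer(claims: AttestationClaims,
                             expected_policy_hash: str,
                             recovered_signer: str) -> bool:
    """True only if the attested code AND the attested key both match."""
    same_code = claims.policy_hash.lower() == expected_policy_hash.lower()
    same_key = claims.judge_address.lower() == recovered_signer.lower()
    return same_code and same_key


# Placeholder values for illustration only.
claims = AttestationClaims(policy_hash="ab" * 32, judge_address="0x" + "aa" * 20)
print(attestation_binds_signer(claims, "ab" * 32, "0x" + "AA" * 20))
```

Either mismatch is enough to reject: a different policy hash means different code ran, and a different address means the signature came from a key the attested code does not hold.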

Why a separate key: the gas payer (a hot wallet that pays fees) and the judge (the signer that authorizes a record) are deliberately different roles. Anyone can relay a record onchain; only the attested judge can make it valid. That separation is what lets the signing key stay sealed in the TEE while ordinary infrastructure handles the rest.

Stage 3 — Anchor

The public record

A transaction submits the signed record to RunRegistry, a smart contract on Base (an Ethereum L2). The contract recovers the signer with ecrecover, requires it to be a registered judge, stores the commitments permanently, and emits an event. A verifyRun() view lets anyone ask the chain, for free, whether a given set of commitments is anchored.
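The contract's behavior can be modeled in memory to show the control flow. The sketch below is a Python model, not the Solidity source: the class name, the RunAnchored event name, and the rule that a run can be anchored only once are assumptions, and toy_recover is a deliberately fake stand-in for "rebuild the digest, then ecrecover".

```python
from typing import Callable, Dict, List, Set, Tuple

Record = Tuple[bytes, bytes, bytes]          # (run_id, transcript_root, output_hash)
Recover = Callable[[Record, bytes], str]     # stands in for digest + ecrecover


class RunRegistryModel:
    """In-memory model of the anchor/verify flow; not the real contract."""

    def __init__(self, judges: Set[str], recover: Recover) -> None:
        self.judges = {j.lower() for j in judges}
        self.recover = recover
        self.runs: Dict[bytes, Tuple[bytes, bytes]] = {}
        self.events: List[Tuple[str, bytes]] = []

    def anchor(self, record: Record, signature: bytes) -> None:
        signer = self.recover(record, signature).lower()
        if signer not in self.judges:
            raise PermissionError("signer is not a registered judge")
        run_id, root, output_hash = record
        if run_id in self.runs:               # assumption: a run is anchored once
            raise ValueError("run already anchored")
        self.runs[run_id] = (root, output_hash)
        self.events.append(("RunAnchored", run_id))   # hypothetical event name

    def verify_run(self, run_id: bytes, root: bytes, output_hash: bytes) -> bool:
        return self.runs.get(run_id) == (root, output_hash)


def toy_recover(record: Record, signature: bytes) -> str:
    # Toy stand-in, NOT cryptography: the "signature" is just the signer's name.
    return signature.decode("ascii")


JUDGE = "0xjudge"                              # placeholder identity
record = (b"\x01" * 32, b"\x02" * 32, b"\x03" * 32)
registry = RunRegistryModel({JUDGE}, toy_recover)
registry.anchor(record, JUDGE.encode("ascii"))  # any relayer may submit this call
print(registry.verify_run(*record))
```

Note that anchor never checks who submitted the call, only who signed the record, which is the relayer/judge separation described earlier.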

  • Permanent. The record outlives Chance's servers: even if the company vanished, the proof stands.
  • Neutral. Counterparties trust the chain, not each other's databases, or ours.
  • Negligible cost. ~180k gas, a fraction of a cent on Base. Cheap enough to anchor every decision.
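To see how ~180k gas turns into a fraction of a cent, here is the arithmetic as a small sketch. Only the gas figure comes from the text; the gas price and the ETH price below are illustrative assumptions, not measurements, and the L1 data fee that an L2 adds on top of execution gas is not modeled.

```python
GAS_USED = 180_000          # from the text above


def anchor_cost_usd(gas_used: int, gas_price_gwei: float, eth_usd: float) -> float:
    """Execution cost in USD: gas * price per gas (in ETH) * ETH price."""
    cost_eth = gas_used * gas_price_gwei * 1e-9    # 1 gwei = 1e-9 ETH
    return cost_eth * eth_usd


# Assumed inputs for illustration only.
cost = anchor_cost_usd(GAS_USED, gas_price_gwei=0.01, eth_usd=3_000.0)
print(f"${cost:.4f} per anchored decision")
```

With these assumed inputs the cost is 180,000 × 0.01 gwei = 1,800 gwei = 0.0000018 ETH, or roughly half a cent; the result scales linearly with both the gas price and the ETH price.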
Stage 4 — Verify

Don't trust — recompute

This is what makes the harness more than internal logging. The verification panel runs entirely in the verifier's browser, against the published transcript, and re-derives every claim from scratch:

Check                | Recomputes                                | Proves
---------------------+-------------------------------------------+------------------------------------------------------
Transcript integrity | the hash chain over every line            | the process shown is byte-for-byte what was committed
Output commitment    | keccak256(output) vs the committed hash   | the answer you read is the answer that was signed
Judge signature      | ecrecover on the digest                   | the attested, TEE-held judge key authorized it
Attestation          | the run's MAA token against the policy    | the decision came from the exact attested code
Onchain anchor       | verifyRun() on Base                       | the public record agrees with all of the above

The verifier is ~50 lines of open code calling standard primitives — any third party can reimplement it. To make it visceral, the console has a tamper toggle: flip it and one transcript byte changes. Instantly the hash chain breaks, the signature recovers to a stranger, and the registry rejects the modified copy. The system doesn't ask you to believe it's tamper-evident; it shows you the tamper being caught.
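The first two checks, and the effect of the tamper toggle on them, can be reproduced offline. The sketch below is not the published verifier: function names and sample data are made up, hashlib.sha3_256 stands in for keccak256, and the signature, attestation, and onchain checks are left out because they need secp256k1 recovery and network access.

```python
import hashlib
from typing import Dict, List


def h(data: bytes) -> bytes:
    # Stand-in for keccak256 (SHA3-256 pads differently; digests will not match).
    return hashlib.sha3_256(data).digest()


def chain_root(lines: List[bytes]) -> bytes:
    acc = h(b"chance-harness:transcript:v1")
    for line in lines:
        acc = h(acc + h(line))
    return acc


def check_commitments(lines: List[bytes], output: bytes,
                      committed_root: bytes, committed_output_hash: bytes) -> Dict[str, bool]:
    """Recompute both commitments from the published data and compare."""
    return {
        "transcript_integrity": chain_root(lines) == committed_root,
        "output_commitment": h(output) == committed_output_hash,
    }


# Illustrative published transcript and the commitments made over it.
published = [b"search: latest polls", b"source: example.org/poll", b"decision: ALLOW"]
output = b"ALLOW"
committed_root = chain_root(published)
committed_output_hash = h(output)

# "Tamper mode": flip a single bit in one byte of one line.
tampered = list(published)
tampered[1] = bytes([tampered[1][0] ^ 0x01]) + tampered[1][1:]

honest_result = check_commitments(published, output, committed_root, committed_output_hash)
tampered_result = check_commitments(tampered, output, committed_root, committed_output_hash)
print(honest_result)
print(tampered_result)
```

The honest copy passes both checks, while the tampered copy fails transcript integrity even though its output is untouched. In the full pipeline the changed root also changes the signed digest, which is why signature recovery and the registry lookup fail as well.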

Applied to Chance

The judge between intent and execution

Chance runs LLM agents that trade prediction markets under a mandate: a plain-English thesis plus hard limits (e.g. only enter ≥95¢ favorites, max $10 per market, US-politics only). A deterministic guard blocks limit violations — but it can't read intent, and when it blocks it leaves only an editable log line.
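A deterministic guard of the kind described here can be sketched as a pure function over the hard limits. The field names, the category label, and the idea of passing total per-market exposure as one number are assumptions made for illustration; only the three example limits come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Mandate:
    min_price_cents: int = 95           # only enter favorites at 95 cents or more
    max_usd_per_market: float = 10.0    # cap on exposure in any one market
    category: str = "us-politics"       # hypothetical category label


@dataclass(frozen=True)
class Proposal:
    market: str
    price_cents: int
    exposure_usd: float                 # assumed: total exposure after this order
    category: str


def guard_violations(p: Proposal, m: Mandate) -> List[str]:
    """Return every hard limit the proposal breaks (empty list means it passes)."""
    violations = []
    if p.price_cents < m.min_price_cents:
        violations.append("price below minimum favorite price")
    if p.exposure_usd > m.max_usd_per_market:
        violations.append("exposure above per-market cap")
    if p.category != m.category:
        violations.append("category outside mandate")
    return violations


mandate = Mandate()
within_limits = Proposal("hypothetical-market", 96, 10.0, "us-politics")
out_of_bounds = Proposal("hypothetical-market", 60, 25.0, "sports")
print(guard_violations(within_limits, mandate))
print(guard_violations(out_of_bounds, mandate))
```

The guard only compares numbers and labels, so a proposal like within_limits passes regardless of whether the trade actually matches the plain-English thesis.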

The harness is the VERIFY step between the agent's proposal and the live order: the agent proposes a trade, a separate model runs an intent-match check against both the numbers and the plain-English intent, and returns a verdict (ALLOW / BLOCK / ESCALATE) with its reasoning attached. It's fail-closed: only an ALLOW lets the order through. It catches what a numeric guard misses: a trade that passes every limit but breaks the strategy's theme, or a side-inversion where the agent is about to buy the opposite token to its thesis (a real incident: expecting a 96¢ fill against a 4¢ live ask). And the verdict runs the full pipeline, producing what a guard never can: a permanent, third-party-verifiable onchain receipt that the agent was supervised, trade by trade.
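Fail-closed behavior is easy to get wrong, so here is a minimal sketch of the gate. It assumes the verdict arrives as a plain string; the function names are made up. The side-inversion heuristic is likewise an illustrative assumption (in a binary market the two tokens price near 100¢ combined) and not the harness's actual intent-match check, which the text says is performed by a separate model.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    ALLOW = "ALLOW"
    BLOCK = "BLOCK"
    ESCALATE = "ESCALATE"


def parse_verdict(raw: object) -> Optional[Verdict]:
    """Exact match only; anything else (None, typos, empty reply) is no verdict."""
    if not isinstance(raw, str):
        return None
    try:
        return Verdict(raw)
    except ValueError:
        return None


def order_may_proceed(raw: object) -> bool:
    # Fail-closed: the order goes through only on an explicit ALLOW.
    return parse_verdict(raw) is Verdict.ALLOW


def looks_side_inverted(expected_cents: int, live_ask_cents: int,
                        tol_cents: int = 3) -> bool:
    """Heuristic: an ask near (100 - expected) and far from expected suggests
    the order targets the opposite token."""
    near_complement = abs(live_ask_cents - (100 - expected_cents)) <= tol_cents
    far_from_expected = abs(live_ask_cents - expected_cents) > tol_cents
    return near_complement and far_from_expected


print(order_may_proceed("ALLOW"), order_may_proceed("ESCALATE"), order_may_proceed(None))
print(looks_side_inverted(96, 4))
```

Treating a missing or malformed verdict the same as a refusal means a timeout or a garbled model reply can never be mistaken for permission.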

Security model

What is and isn't guaranteed

Guaranteed
  • Tamper-evidence. Any change to a published transcript is detectable by anyone.
  • Attribution to attested code. A valid record could only be signed by the enclave-held key; a swapped model or edited prompt changes the policy hash, so the attestation no longer matches and the modified build never holds the registered judge key.
  • Non-repudiation & permanence. Once anchored, a decision can't be silently un-made, back-dated, or denied.
  • Independent verifiability. No trust in Chance is required to check any of it.
Not claimed
  • It proves what the agent did and that it ran in attested code, not that the decision was wise. Garbage in, provable garbage out. (That's what the agents' own mandates and risk limits are for; the harness proves they were followed.)
  • It anchors commitments (hashes) onchain, not the raw transcript — the transcript is published off-chain at a committed URI that anyone can fetch and check against the anchored hash.
  • Availability of that off-chain transcript is an operational concern (durable storage), distinct from the integrity guarantee, which rests on the collision resistance of keccak256 rather than on any storage provider.
Check it yourself

Run a live decision, watch the proof recompute in your browser, click through to the real transaction on Basescan, and flip tamper mode to watch it caught.


Stack: Next.js app (console + harness API); LLM via Claude (Opus-class) with server-side web search; transcript hash chain mirrored for browser verification; RunRegistry.sol on Base; judge signer behind a RunSigner interface with an Azure confidential-compute backend in production.