Praxis · by Integrum · working draft 2026 · production leads, governance by-the-way

A workforce of AI agents on your real production work — where capability compounds as your organization reuses it.

One environment. Workers turn their own expertise into governed mini-programs — from a knowledge advisor with no interface of its own to a full built tool — every one produced by governed free-form codegen, every one riding a substrate that is already running.

Replaces the SaaS suite, doesn't join it | Not vibe-coding · Not low-code | Moat: capability compounds + transfers across your org | Foundations: ratified specs + shipped engines

§ 00 The whole flow, at a glance

You direct. The agent builds. Everything ships governed.

One arc, whichever flavor you need: the worker judges, the agent does the work, and every mini-program leaves a signed trail.

1 · You direct

Talk to your agent

A csq session: the AI executes every step; you supply the judgment.

→

2 · It builds

Six phases · you judge three gates

analyze → plan → execute → vet → codify → deliver

The AI runs every phase; you sign off at three gates: after plan, after codify, and at deliver.

→

3 · It ships

A governed mini-program

Harness-only (existing UI) or built (its own frontend), with signed lineage attached.

→

4 · It lasts

Serve · compound · maintain

Reused across the org, kept current 24/7, never inherited unproven.

Integrum — Build

Priya K.

Indirect Tax

Work

Communicate

Build

Market

Turn my VAT judgment into an advisor the 12 offices can use themselves.

Your agent · csq

I mined your rulings and notes for the decision procedure — not a search index. Here's the plan for your sign-off.

✓ analyze → plan · your call → execute → vet → codify → deliver

Gate 1 · Plan approval

Distill 6 rule-families · set the eval bar

I'll distill reverse-charge, place-of-supply and four exemption families, and hold the advisor to 0 findings on an adversarial eval before it can serve. Approve to proceed.

Your agent · csq

One judgment call to confirm: on the Polish drop-shipment you overruled the junior — should "never zero-rate without a valid intermediary VAT ID" be a hard refusal?

Reply to your agent, or approve the plan above… | Send

Mock — illustrative of the net-new create-experience (§ 10). The agent executes; Priya judges at the gate. Today a certified operator facilitates the distillation; the fully self-driven surface is the ambition. Harness-only means the delivered advisor renders into the surface the 12 offices already use — the panel above is the build experience, not the advisor.

§ 01 The product

The worker instantiates the means of production.

Instead of renting a bloated suite they use ~5% of, a worker puts a workforce of agents on their real work and bottles their judgment into named, reusable units the whole organization runs and reuses. Three ways to see it:

The mini-program · it produces better · it compounds (illustrative figures, not live telemetry)

Integrum — Work

The mini-program

Named, reusable units of work

VAT Treatment Advisor

✓ running | 1,240 runs · 12 offices

KYC Refresh Memo

✓ running | 88 runs

Vendor DDQ Responder

draft | in eval

Transfer-Pricing Note

✓ running | 41 runs

Each is authored from a worker's own expertise, governed, versioned, and reused across the org — not a document, not a one-off.

Integrum — Outcomes · this quarter

It produces better

Throughput, quality, scale — measured

1,240

VAT queries self-served (was: Priya, one by one)

310 h

senior-review hours reclaimed

98%

house-standard pass rate on eval

12

offices served · 0 escalations back to Priya

The production revolution is the lead — governance is what makes this safe to run at that scale, not the pitch.

Integrum — Capability Ledger

It compounds

Cost per outcome bends down as the org reuses it

Chart: cost per accepted outcome, month 1 → month 6

cost / accepted outcome ↓ 71% | reuse across 12 offices | new capability 62% composed from canon

The moat, made a P&L object — and it is portable across models on the reasoning path. This is what a stack of one-off apps can never show.

§ 02 The paradigm

Governed free-form codegen — wrapped, not caged.

The agent generates freely; a Cognitive-Orchestration harness wraps that generation — guides it to expert quality, guards it fail-closed, and makes it pass an eval — so you get full expressive power and safety. Picture it as layers around the generation, not a box it's confined to:

Guard · fail-closed (rules + hooks)

Guide · expert context (skills + agents)

Free-form generation — full power

✓ through the eval gate → verified, signed output

Vibe-coding: free-form, but no wrap; ships unchecked.

2000s low-code: wrapped, but caged to blocks; can't do real work.

Governed free-form: free-form core, fully wrapped; both.

There is no Lego catalog and no constrained action space. The agent generates free-form, drawing on broad, general building blocks — typed capability nodes and broad horizontals — as raw material where useful; generation is primary, the blocks are never a catalog it assembles from. Domain-specificity enters through the worker's distilled expertise, never pre-canned from a shelf — a niche "cross-border-VAT vertical" is rejected as a productized unit; the VAT judgment is distilled from the expert.

§ 03 The harness

Guide and guard — safety without a cage.

Five component types do two jobs. One adds expert competence; the other makes the agent fail closed. Neither narrows what it can express.

Guide

Adds competence

Skills and agents make a general model perform as the domain expert. Guidance only ever adds range — it never subtracts expressive power. This is why guidance is not the low-code cage: the cage subtracts; guidance adds.

Guard

Fails closed

Rules are soft; hooks are hard, running outside the model's context. A PDP/PEP fail-closed layer guarantees no code path reaches a governed operation without passing an enforcement point, at build time and at serve time. Safety is structural, by construction.

Commands · L4 workflow + gates | Skills · L2 distilled context | Agents · L1 specialists | Hooks · L3 deterministic enforcement | Rules · L3 soft constraints
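For the architecture-minded, the fail-closed guard can be pictured with a minimal Python sketch. This is not Kailash or aegis code: `PolicyDecisionPoint`, `governed`, `Denied` and the rule format are hypothetical names chosen for illustration. What it shows is the shape of the guarantee: the governed operation is reachable only through the enforcement wrapper, and a missing rule or a crashing rule resolves to deny.

```python
from functools import wraps


class Denied(Exception):
    """Raised when the enforcement point refuses an operation."""


class PolicyDecisionPoint:
    """Hypothetical PDP: maps an operation name to an allow/deny rule."""

    def __init__(self, rules):
        self._rules = rules  # {operation: callable(context) -> bool}

    def decide(self, operation, context):
        rule = self._rules.get(operation)
        if rule is None:
            return False  # no rule registered: deny by default
        try:
            return rule(context) is True
        except Exception:
            return False  # a crashing rule must never open the gate


def governed(pdp, operation):
    """Hypothetical PEP: the only path to the wrapped function."""

    def decorator(fn):
        @wraps(fn)
        def wrapper(context, *args, **kwargs):
            if not pdp.decide(operation, context):
                raise Denied(operation)
            return fn(context, *args, **kwargs)

        return wrapper

    return decorator


pdp = PolicyDecisionPoint({
    # Illustrative hard refusal, borrowed from the VAT example in this document.
    "zero_rate": lambda ctx: bool(ctx["intermediary_vat_id_valid"]),
})


@governed(pdp, "zero_rate")
def zero_rate(context):
    return "zero-rated"


@governed(pdp, "reopen_record")  # no rule registered, so always denied
def reopen_record(context):
    return "reopened"
```

Calling `zero_rate` with a valid intermediary VAT ID succeeds; calling it with an invalid or missing flag, or calling `reopen_record` at all, raises `Denied`. The design choice worth noticing is that "allow" is the only outcome that has to be earned.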

§ 04 The interaction law

The AI executes every step. The human supplies strategic judgment.

Human-on-the-loop, not in-the-loop. Strong guidance means the draft arriving at each judgment point is already expert-grade — so the gates are low-friction, not no-attention.

The six-phase spine carries the work; the human owns exactly three hard gates.

Analyze

Plan

Gate 1

Execute

Vet

Codify

Gate 2

Deliver

Gate 3
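The spine above can be sketched as a small loop. This is an illustration under stated assumptions, not the platform's workflow engine: `run_spine`, `run_phase` and `approve` are hypothetical names, and the sketch assumes each gate fires right after the phase it is listed under (plan, codify, deliver).

```python
PHASES = ["analyze", "plan", "execute", "vet", "codify", "deliver"]
GATED = {"plan": "Gate 1", "codify": "Gate 2", "deliver": "Gate 3"}


def run_spine(run_phase, approve):
    """Run the phases in order; stop at the first gate the human declines.

    run_phase(phase) -> draft      the agent's work (hypothetical callback)
    approve(gate, draft) -> bool   the human's judgment (hypothetical callback)
    """
    trail = []
    for phase in PHASES:
        draft = run_phase(phase)
        trail.append((phase, "done"))
        gate = GATED.get(phase)
        if gate is not None:
            if not approve(gate, draft):
                trail.append((gate, "declined"))
                return trail  # nothing downstream runs without sign-off
            trail.append((gate, "approved"))
    return trail
```

The agent callback runs all six phases; the human callback is consulted exactly three times, and a declined gate halts everything after it.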

§ 05 Compulsory eval

You cannot skip it. And "green" has to be earned.

Vetting is a phase, not a courtesy — multi-round adversarial red-team to a numeric convergence threshold set before round one and approved at the plan gate. Eval means actively trying to break the artifact, and it is a runnable test workflow, never a promise.

open findings per round →

15 · round 1

8 · round 2

3 · round 3

0 · converged

0 findings → promotable. The mutation-test negative control proves the eval can go red.

The mutation-test negative control is a non-waivable promotion blocker: the engine seeds deliberately broken variants and proves the eval catches N of N planted errors. That converts "green" from belief into evidence, which is what separates this from a copilot's confident dashboard. Every miss diagnoses to a named artifact; debugging is deterministic, not "re-chunk and pray."
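A toy Python sketch of the negative control, assuming an eval is simply a list of input/expected cases. The function names (`open_findings`, `mutation_control`, `promotable`) and the toy `advisor` are hypothetical; the real engine's interfaces are not described in this document. The sketch shows why a zero-finding result is not enough on its own: the eval must also catch every planted mutant.

```python
def open_findings(artifact, cases):
    """Return the cases the artifact gets wrong (hypothetical eval)."""
    return [c for c in cases if artifact(c["input"]) != c["expected"]]


def mutation_control(mutants, cases):
    """Negative control: every deliberately broken variant must go red."""
    caught = sum(1 for m in mutants if open_findings(m, cases))
    return caught, len(mutants)


def promotable(artifact, mutants, cases, threshold=0):
    caught, planted = mutation_control(mutants, cases)
    if planted == 0 or caught != planted:
        return False  # eval cannot discriminate: blocked regardless of score
    return len(open_findings(artifact, cases)) <= threshold


def advisor(inp):
    # Toy stand-in for one distilled rule from the VAT example.
    return "zero-rate" if inp["vat_id_valid"] else "refuse"


# Two planted errors: one always zero-rates, one always refuses.
mutants = [lambda inp: "zero-rate", lambda inp: "refuse"]

strong = [
    {"input": {"vat_id_valid": True}, "expected": "zero-rate"},
    {"input": {"vat_id_valid": False}, "expected": "refuse"},
]
weak = strong[:1]  # never probes the refusal path
```

Against `strong`, the advisor has zero findings and both mutants are caught, so it is promotable. Against `weak`, the advisor still shows zero findings, but the always-zero-rate mutant slips through, so promotion is blocked.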

Governance falls out here — the verification gradient, in code

auto-approved | flagged | held for a human | blocked

§ 06 Accountability lineage

Never inherit unproven.

Every capability carries a signed, append-only record that travels with it. Reuse is not a copy — it is a choice between standing on proof and re-earning it.

Signed capability record · travels with the mini-program

named human: who is accountable

passed eval: convergence + mutation control

bounded mandate: the operating envelope

approval: the deliver-gate signer

freshness pin: DE-VAT baseline 2026-03 (Integrum extension, not a Terrene spec field)

The first four fields are the CARE / EATP / PACT / CO lineage; the freshness pin is Integrum's design extension on top of it.

Path A — stand on it

Inherit the proven record as-is. The freshness pin travels, so you can see whether the proof is current or superseded. The original named human stays accountable.

Path B — adapt it

The moment you change it, you re-vet locally and your name goes on the new record. You become the accountable human.

There is no blind inherit and no unvetted sideways propagation — even when a capability graduates to shared reuse, its proof is not auto-transferred to the consumer. The customer keeps the accountability; we give them the proof. "The AI did it" is never a defense, and Integrum is never in the customer's liability chain.
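One way to picture the record and the two paths is the Python sketch below. It rests on assumptions: the five field names mirror the list above, but `CapabilityRecord`, `sign`, `verify` and `adapt` are hypothetical, the mandate text and adopter name in the usage are made up, and HMAC stands in for whatever signature scheme the platform actually uses.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class CapabilityRecord:
    named_human: str
    passed_eval: bool
    bounded_mandate: str
    approval: str
    freshness_pin: str
    parent_digest: Optional[str] = None  # link to the record this was adapted from

    def digest(self) -> str:
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()


def sign(record: CapabilityRecord, key: bytes) -> str:
    # HMAC is a stand-in signature for this sketch only.
    return hmac.new(key, record.digest().encode(), hashlib.sha256).hexdigest()


def verify(record: CapabilityRecord, signature: str, key: bytes) -> bool:
    return hmac.compare_digest(sign(record, key), signature)


def adapt(original, new_human, approver, re_vet_passed):
    """Path B: a change yields a new record and needs a fresh local vet."""
    if not re_vet_passed:
        raise ValueError("adapted capability must be re-vetted before it is recorded")
    return CapabilityRecord(
        named_human=new_human,
        passed_eval=True,
        bounded_mandate=original.bounded_mandate,
        approval=approver,
        freshness_pin=original.freshness_pin,
        parent_digest=original.digest(),
    )


key = b"demo-key"  # illustrative only
original = CapabilityRecord(
    "P. Kaur", True, "advisory only", "P. Kaur", "DE-VAT baseline 2026-03"
)
signature = sign(original, key)
adapted = adapt(original, "A. Adopter", "A. Adopter", re_vet_passed=True)
```

Path A is `verify(original, signature, key)`: the record is used unchanged and the original named human stays on it. Path B is `adapt(...)`: the new record names the adapter, points back to its parent by digest, and the old signature no longer verifies against it.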

§ 07 One spectrum

The only thing that varies is the frontend.

Two flavors, one system. A worker moves from one to the other by adding a frontend — not by switching products. Same engine, same substrate, same harness, same eval, same lineage.

Two flavors · one system — the only delta is the frontend

Priya · harness-only

renders in the existing UI

A cross-border-VAT advisor. Most enterprise intellectual work. Runs as a governed backend service that renders into the surface workers already use. Has a backend — just no bespoke interface.

+ a frontend

Tomás · built

its own interface

A compliance-bound CRM the firm doesn't otherwise have. Everything about Priya, plus a frontend the agent writes free-form — where the UI plumbing falls out of one registration.

VAT Treatment Advisor | Harness-only · existing UI

Poland office — drop-shipment from the DE warehouse to a PL customer via a FR intermediary. Treatment?

Intra-community supply, zero-rated at the DE leg — provided the FR intermediary's VAT ID is valid and quoted; reverse-charge at the PL customer. Three exceptions apply.

✓ signed | eval: 0 findings | DE-VAT baseline 2026-03 | accountable: P. Kaur

Partner Pipeline | Built · its own UI

Prospecting · 2

Meridian Capital

Intro sent

Aster Health

Qualifying

follow-up due Tue

In review · 1

Novo Partners

Legal · consent logged

in review since Mon

Signed · 1

Lyon Industriel

Onboarded

✓ every change on the tamper-evident audit chain · updated 2m ago

§ 08·A Walkthrough: the harness-only flow

Flow A

Priya bottles her VAT judgment for 12 offices.

Twelve country offices ping her ~30×/month. She puts her judgment on the platform so her expertise produces at scale without her in the loop.

Honest status: today a certified operator facilitates the elicitation and distillation; the fully self-driven conversational surface is the ambition (net-new — see § 10), not a reason to shrink the design.

Directs a csq agent

She talks, in plain language, to her main interaction agent — a csq session. The AI executes; she judges.

Distill judgment, not documents

The agent mines her decision procedure from her notes and rulings — not a retrieval index. This is where it beats a copilot: it elicits the tacit layer ("why did you overrule the junior here?").

Plan approval | Hard gate

She approves what will be distilled and the numeric eval threshold — before any distillation runs.

Guide + guard, in her words

Each correction is classified for her as method (guide) or hard refusal (guard) and reflected back with its consequence. A decision she confirms — not a concept she has to learn.

Built onto the substrate — as a backend service

Her judgment becomes validated code-nodes; one registration renders it into the existing UI. It runs on csq, so it stays model-portable on the reasoning path (a fine-tuned adapter, if used, may need re-tuning on a model change).

Compulsory /vet attacks her reasoning

The platform generates the adversarial test set from a held-out slice, on a different model; she only confirms ground truth. The mutation control proves the eval can fail.

Codify | Hard gate

She confirms each artifact against a real past case she owns — a factual match, never fluent prose. It converges to her bar.

Deliver: freeze, host, stamp the lineage | Hard gate

Frozen, versioned, hosted; the signed lineage (with the freshness pin) is written and travels.

Serve · compound · maintain

The 12 offices self-serve with no install. A field-miss loop compounds quality; the 24/7 maintenance service keeps her advisor current. Reuse never inherits unproven.

§ 08·B Walkthrough: the built flow

Flow B

Tomás's agent writes the CRM his firm needs.

Honest ICP: Tomás is only a customer if a cheap point-tool cannot serve him — compliance-bound, audit-required, org-compounding. If he just needs a plain CRM, he should buy Pipedrive. We don't position against a strawman suite.

Directs a csq agent · coding archetype

Same engine Priya talks to. The AI writes all the code; Tomás supplies judgment.

Analyze the work and the obligations

Entities, pipeline, reminders — and the compliance rules (consent, retention, who may reopen a closed record). Data-gravity named honestly.

Plan approval | Hard gate

He approves the data model, the hard-refusal rules, and the eval threshold — before any code is written.

Implement — the agent writes it free-form

Schema, backend services, and bespoke logic as validated code-nodes. The UI plumbing falls out of one registration; the agent writes the frontend views against that stream. No separately hand-built server.

May embed an autonomous component

An auto-follow-up agent shipped inside the CRM — carrying its own signed lineage and bounded mandate, acting only through fail-closed enforcement.

Compulsory /redteam

The same non-skippable eval as /vet. "What promises a discount? What reopens a closed record? What leaks one client's data into another's view?" Mutation control blocks a non-discriminating set.

Codify | Hard gate

Tomás confirms each artifact — schema, each service, each hard-refusal hook, the embedded agent's mandate — against concrete cases he owns. Gate 2 always fires, even when nothing graduates.

Release: freeze, deploy, stamp the lineage | Hard gate

Where segregation-of-duties matters, a distinct authority signs the deliver gate; the record names both. The pinned model tuple makes a silent model-rev an update event.

Runs in the governed runtime

Every action — including the embedded agent's — passes the fail-closed enforcement point and lands on the tamper-evident audit chain. That is the compliance value a point-tool cannot give.
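A hash chain is one common way to make an audit log tamper-evident. The Python sketch below illustrates that general technique; this document does not say which mechanism aegis uses, and `AuditChain` with its fields is a hypothetical name. Each entry commits to the hash of the one before it, so editing any past entry breaks verification.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


class AuditChain:
    def __init__(self):
        self._entries = []

    @staticmethod
    def _hash(entry):
        core = {k: entry[k] for k in ("actor", "action", "prev")}
        return hashlib.sha256(json.dumps(core, sort_keys=True).encode()).hexdigest()

    def append(self, actor, action):
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        entry = {"actor": actor, "action": action, "prev": prev}
        entry["hash"] = self._hash(entry)
        self._entries.append(entry)

    def verify(self):
        prev = GENESIS
        for entry in self._entries:
            # Each entry must link to its predecessor and match its own hash.
            if entry["prev"] != prev or entry["hash"] != self._hash(entry):
                return False
            prev = entry["hash"]
        return True


chain = AuditChain()
chain.append("tomas", "approve plan")
chain.append("auto-follow-up agent", "send reminder")
```

`chain.verify()` holds as long as no stored entry has been altered; changing the recorded action of the first entry after the fact makes it fail.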

§ 09 The recurring value

Continuous maintenance, as a service.

24/7 maintenance agents on Integrum's own backend — the company dogfooding its own operating model — watch external regulatory and best-practice sources, draft candidate updates, run the evals, and route to a human control point before anything publishes.

The client chooses — never silent

Each instance is notified and chooses whether to pull. Decline is a first-class, audited choice — and the freshness pin surfaces the staleness at point of use, so nothing serves superseded law behind a green stamp.

Every pull re-runs the compulsory eval

Base + overlay, never fork. A compliance update ships a paired gold-answer diff; the human oracle re-labels the touched cases against the new correct answer before it serves.

Targeted, tenant-isolated

A German rate change targets the DE cell only. Maintenance agents train on published sources — never on client usage, overlays, or data. Your customization never flows into another tenant's capability.

A model-rev is an update event

The service detects a provider model change against the pinned tuple and re-fires the compulsory eval before the new model serves — gating an adapter re-tune where one is needed.
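The two pins in this section, the model tuple and the freshness pin, can be sketched in a few lines of Python. All names here (`ModelTuple`, `may_serve`, `freshness_status`) are hypothetical, the tuple's fields are an assumption about what "pinned tuple" contains, and the baseline is assumed to be a `YYYY-MM` string as in "DE-VAT baseline 2026-03".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTuple:
    provider: str
    model: str
    revision: str


def may_serve(pinned, observed, run_eval):
    """Serve on the pinned tuple; a changed tuple must pass the eval first.

    run_eval(observed) -> bool is a hypothetical callback for the compulsory eval.
    Returns (allowed, tuple_to_pin).
    """
    if observed == pinned:
        return True, pinned
    if run_eval(observed):
        return True, observed  # re-pin only after the eval passes
    return False, pinned


def freshness_status(pinned_baseline, latest_baseline):
    """Baselines are zero-padded 'YYYY-MM' strings, so string order is date order."""
    return "current" if pinned_baseline >= latest_baseline else "superseded"
```

A silent provider revision shows up as `observed != pinned` and cannot serve until the eval callback returns true. An instance that declined an update keeps serving, but `freshness_status` reports it as superseded at the point of use.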

§ 10 Under the hood · the architecture is already running

Four parts, one role each — on a stack with commit SHAs.

You don't operate these; the platform does. But for the architecture-minded: the agent you talk to and the agent that ships inside what you build are different engines, and keeping them straight is load-bearing.

csq

Role — the engine you talk to

The core autonomous engine powering the main agent every worker directs. Multi-CLI, multi-provider, so cross-model portability lives here — on the reasoning path (a per-tenant fine-tuned adapter may need re-tuning on a model change; see § 08·A). Shipped single-operator today; workforce-scale is net-new (see the stack below).

What it is not

Not a router bolted on the side, and not a separate serve-time product — the same engine spans build and serve.

Kailash

Role — the build substrate

What the agent produces onto: typed audited capability nodes, free-form bespoke code admitted as validated code-nodes, a hard build() validation gate, a one-registration surface (API + CLI + MCP + live UI stream), and the trust plane.

What it is not

Not a block catalog, and not the agent runtime — free-form code is first-class here, gated by validation.

loom

Role — the foundry

Distributes the harness via USE templates; enforces two human gates (classify / distribute); never originates content, so provenance stays intact; runs capability graduation as reuse grows.

What it is not

Not an author. It distributes and orchestrates; it does not write the artifacts.

kaizen-agents

Role — an embeddable component

An autonomous agent shipped inside a delivered mini-program (e.g. an auto-follow-up agent), or a build-time tool.

What it is not

Never the engine powering the agent the user talks to — that is always csq.

The layered stack: honest shipped / net-new labels

The new platform is a consolidation of a real, operational architecture, not slideware. Each layer, and what is actually running:

Specs

CARE · EATP · PACT · CO — the trust plane, constrained-trust envelopes, verification gradient, and the CO method. Integrum implements them; the Terrene Foundation publishes them. Integrum never owns them.

Ratified / Published · CC BY 4.0

Foundry — loom

The loom→build→use cascade, two human gates, the variant compiler, provenance (never originates). Capability graduation is the reuse mechanism.

Sync verbs shipped | Running

Build engine — Kailash

Typed capability nodes, the build() validation gate, code-node validation, the EATP/PACT trust plane in code, the register-once surface (API + CLI + MCP + SSE).

Shipped | rs↔py parity | LoRA serving · behind flag, mock default

Session engine — csq

The shipped single-operator substrate that becomes the agent every worker directs. Per-slot provenance, transparency log, cross-model portability (reasoning-path; adapter paths re-eval / re-tune).

Running · single-operator | Net-new · workforce-scale

Runtime — aegis

Governed multi-tenant runtime: fail-closed isolation, 5-dimension envelopes, tamper-evident audit, org export/import. Plus kaizen-agents (component) and envoy (provenance).

Shipped · one live client | envoy alpha

The new product

The all-in-one surface, the worker-facing mini-program host, the non-coder create-experience, the marketplace. Net-new — but orchestration + a new surface over a running substrate, not greenfield.

Net-new · weeks-not-years

Source anchors (opened, not asserted): kailash-rs @ f43eba0c (proprietary) · kailash-py @ babc7071e (Apache-2.0) · loom @ 6ee01b6f · CO / atelier @ 32bf7ab. The hard foundational layers are running today; the new product assembles them into one line.

§ 11 Why this wins

Separate the moat from the reason to buy now.

The moat — why we win, durable

Compounding + portability

Capability that gets cheaper and better as one organization reuses it, and stays portable across models on the reasoning path (a per-tenant fine-tuned adapter may need re-tuning on a model change — portability is degraded there). This must be demonstrated on the ledger — the compulsory eval turns the claim into evidence.

The purchase trigger — why buy now

The regulatory window

The dated EU regulatory window is the budget line that gets the cheque signed this year. It is the trigger, not the value — never conflated with the moat.

Perfect the user flow, and it becomes the spine that consolidates a real running architecture into one product line.