Engineering

How We Built an Append-Only Provenance Ledger

Ledgible Engineering · February 28, 2026 · 10 min read

TL;DR

  • "Append-only" enforced only at the application layer is fragile — a future developer or migration can break it without knowing
  • Real immutability requires defense in depth: schema design, application-level guards, database-level constraints, and PostgreSQL Row Level Security
  • Verification queries run against a separate read-optimized index — never against the primary ledger — to prevent read/write contention and ensure sub-100ms response times
  • The separation between write store and read store means the index can be rebuilt from the primary ledger at any time, making the architecture auditable and recoverable

Why "Just Don't Write UPDATE" Is Not Enough

When we talk about an append-only ledger, the naive interpretation is: write an insert policy and trust developers to follow it. Don't use UPDATE. Don't use DELETE. Problem solved.

This approach fails in practice for three reasons.

First, future developers joining the codebase may not know the constraint exists. There is no enforcement mechanism — only convention. A well-intentioned bug fix or migration written by someone unfamiliar with the requirement can introduce a mutation.

Second, database migrations are a common source of accidental mutation. Adding a column with a DEFAULT value, running a backfill, or applying an ORM-generated migration can trigger implicit updates on existing rows. None of these operations look dangerous until you realize they are modifying records that are supposed to be permanent.

Third, ORM abstractions make it easy to call update-equivalent methods without realizing it. Prisma's upsert, for example, will update an existing record if the unique constraint matches. A developer using upsert for convenience on the provenance table would silently violate immutability.
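To make the hazard concrete, here is a minimal in-memory sketch (plain TypeScript, not Prisma — the store class, field names, and error text are illustrative) contrasting create-only semantics with upsert semantics:

```typescript
// Minimal in-memory stand-in for a table with a UNIQUE constraint on
// canonicalHash. Illustrative only -- real enforcement happens in
// PostgreSQL, not in application code.
type ProvenanceRecord = { canonicalHash: string; assetId: string; signedAt: Date };

class ProvenanceStore {
  private rows = new Map<string, ProvenanceRecord>();

  // create: the only write the ingest path should ever use.
  // A second write for the same hash fails loudly, like a UNIQUE violation.
  create(rec: ProvenanceRecord): void {
    if (this.rows.has(rec.canonicalHash)) {
      throw new Error(`unique constraint violation: ${rec.canonicalHash}`);
    }
    this.rows.set(rec.canonicalHash, rec);
  }

  // upsert: convenient elsewhere, dangerous here -- it silently
  // replaces the existing record instead of rejecting the write.
  upsert(rec: ProvenanceRecord): void {
    this.rows.set(rec.canonicalHash, rec);
  }

  get(hash: string): ProvenanceRecord | undefined {
    return this.rows.get(hash);
  }
}
```

A developer reaching for `upsert` out of habit would overwrite `signedAt` without any error; `create` surfaces the mistake immediately.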

Ledgible's append-only guarantee is enforced at four independent layers. If any one layer is bypassed, the others still hold.

Layer 1: Schema Design That Removes Mutation Vectors

The most effective way to prevent accidental mutation is to remove the fields that enable it.

The provenance_records table has no updated_at column. There is no field to update, which eliminates the most common accidental mutation — a developer who reaches for updated_at to track when a record was last touched and inadvertently treats the table as mutable.

The signed_at column uses a server-side default (DEFAULT now()) and is explicitly excluded from the API request schema. It is not user-settable. The timestamp on a provenance record reflects when the Ledgible server wrote the record — not when the client claims it was created. This distinction matters under regulatory scrutiny: a client-supplied timestamp can be set to any value; a server-side default cannot be backdated.

The canonical_hash and asset_id fields carry UNIQUE constraints. A second ingest call for the same asset hash will fail with a constraint violation before any data is written. This is intentional. The ledger is not a versioning system. If an asset changes, it is a new asset with a new hash and a new record. The old record is permanent.
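The API-layer counterpart to the server-side default can be sketched as a strict request validator. This is illustrative — `validateIngestRequest` and the exact field list are hypothetical, mirroring the schema described above — but the principle is the one the section states: `signed_at` is simply not an accepted input.

```typescript
// Hypothetical request validator for the ingest endpoint.
// signed_at is deliberately absent from the allowed fields: the server
// (DEFAULT now() in PostgreSQL) is the only source of that timestamp.
const ALLOWED_FIELDS = new Set([
  "asset_id", "asset_type", "canonical_hash",
  "creator_id", "tool_id", "signature", "metadata",
]);

function validateIngestRequest(body: Record<string, unknown>): void {
  for (const key of Object.keys(body)) {
    if (!ALLOWED_FIELDS.has(key)) {
      // Rejecting unknown keys (rather than stripping them) turns a
      // client-supplied signed_at into an explicit error, not a silent no-op.
      throw new Error(`unexpected field in ingest request: ${key}`);
    }
  }
}
```

Rejecting unknown keys outright, rather than silently dropping them, means a client that tries to backdate a record gets a visible failure it has to deal with.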

Layer 2: Application-Level Guards

At the application layer, the ingest route only ever calls prisma.provenanceRecord.create — never update, upsert, or any other write variant.

This is enforced in two ways. First, code review policy: any PR touching the provenance ingest path that introduces a non-create write operation requires explicit sign-off from a second reviewer who understands the immutability requirement. Second, an automated CI check greps for ORM write operations on the provenance_records model other than create and fails the build if any are found.

# Fail the build if any non-create write operation appears on provenance_records.
# -E enables extended regex so the (update|upsert|delete) alternation works;
# plain grep would treat the parentheses and pipes literally and never match.
grep -rE "provenanceRecord\.(update|upsert|delete)" src/ && exit 1 || exit 0

This is deliberately low-tech. The goal is not sophisticated static analysis — it is a fast, readable check that catches the most common accidental violation.

Layer 3: PostgreSQL Row Level Security

Application-level guards can be bypassed by direct database access — a migration tool, a database client, an admin running a one-off query. PostgreSQL Row Level Security (RLS) enforces access policy at the database level, independent of the application.

-- Enable RLS on the provenance_records table, and force it so that
-- even the table owner is subject to the policies.
ALTER TABLE provenance_records ENABLE ROW LEVEL SECURITY;
ALTER TABLE provenance_records FORCE ROW LEVEL SECURITY;

-- Allow inserts for authenticated roles
CREATE POLICY insert_only ON provenance_records
  FOR INSERT WITH CHECK (true);

-- No UPDATE policy is defined.
-- No DELETE policy is defined.
-- Any attempt to update or delete is denied
-- regardless of the caller — application, migration tool, or admin.

With this policy in place (and RLS forced on the table, so even the table owner is subject to it), a role with full write privileges still cannot UPDATE or DELETE a provenance record. Only a superuser can bypass or disable RLS, and that is a deliberate administrative action that is logged and auditable, not a casual query. This is the difference between a convention and a constraint.

Layer 4: The Unique Constraint as Last Resort

The canonical_hash unique constraint is the last line of defense. If all three layers above fail and someone manages to attempt a second write for an asset that already has a record, the constraint violation prevents it at the database level before any data is written.

CREATE UNIQUE INDEX idx_prov_hash ON provenance_records (canonical_hash);

This constraint does double duty: it enforces immutability as a fallback, and it makes the hash lookup in the verification index fast. A unique index on the primary lookup key is standard practice — the immutability benefit is a side effect of correct index design.

The Two-Store Architecture: Write Ledger and Read Index

Separating reads from writes is the architectural decision that makes the system work at scale.

The provenance_records table is the source of truth. It is append-only, optimized for writes, and never queried by the public verification endpoint. It is the legal record.

The verification_index table is a denormalized, hash-keyed read store. It contains only the fields needed to answer a verification query — canonical_hash, asset_id, signer, tool_id, signed_at, and hitl_status. Its sole design constraint is that hash lookups must complete in under 100 milliseconds.

Both tables are written in the same database transaction on every ingest:

BEGIN;

INSERT INTO provenance_records (
  id, org_id, asset_id, asset_type, canonical_hash,
  creator_id, tool_id, signed_at, signature, metadata
) VALUES ($1, $2, $3, $4, $5, $6, $7, now(), $8, $9);

INSERT INTO verification_index (
  canonical_hash, asset_id, org_id, signer, tool_id,
  signed_at, hitl_status
) VALUES ($5, $3, $2, $6, $7, now(), 'verified_automated')
ON CONFLICT (canonical_hash) DO NOTHING;

COMMIT;

If either write fails, both roll back. There is no eventual consistency gap between the write ledger and the read index. The moment an asset is ingested, it is verifiable.
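The rollback behavior can be sketched without a database. The following in-memory simulation (illustrative, not the production code — the `failIndexWrite` hook exists only to demonstrate the failure path) mirrors the two inserts sharing one transaction:

```typescript
// Simulates BEGIN ... COMMIT over two in-memory "tables": if either
// insert fails, neither table keeps the row. Illustrative only.
type Row = { assetId: string };

function ingest(
  ledger: Map<string, Row>,
  index: Map<string, Row>,
  hash: string,
  assetId: string,
  failIndexWrite = false, // test hook simulating a failed index insert
): void {
  // Snapshot both stores so a failure can roll the whole write back.
  const ledgerBackup = new Map(ledger);
  const indexBackup = new Map(index);
  try {
    if (ledger.has(hash)) throw new Error("unique violation on provenance_records");
    ledger.set(hash, { assetId });
    if (failIndexWrite) throw new Error("simulated index write failure");
    index.set(hash, { assetId }); // in SQL: same transaction, ON CONFLICT DO NOTHING
  } catch (err) {
    // "ROLLBACK": restore both stores to their pre-transaction state.
    ledger.clear();
    for (const [k, v] of ledgerBackup) ledger.set(k, v);
    index.clear();
    for (const [k, v] of indexBackup) index.set(k, v);
    throw err;
  }
}
```

In PostgreSQL the snapshot-and-restore dance is unnecessary — the transaction provides it — but the invariant is the same: a failed ingest leaves no trace in either store.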

Why This Architecture Beats Blockchain for Enterprise Use Cases

Blockchain-based provenance registries offer a genuine advantage: decentralization means the record survives even if the platform shuts down. For enterprise content pipelines, though, blockchain introduces costs that outweigh that benefit at the current stage:

  • Latency: Blockchain writes require confirmation time — seconds to minutes depending on the chain and gas pricing. Ledgible ingest completes in under 200ms.
  • Cost at volume: Gas fees on Ethereum or transaction fees on other chains accumulate quickly at the volumes enterprise pipelines produce.
  • Integration complexity: Wallet management, key custody, and on-chain error handling add engineering overhead.
  • Auditability: A PostgreSQL ledger with RLS, unique constraints, and CI-enforced write policies is easier to audit than a smart contract — and the audit trail is more legible to legal and compliance teams.

Ledgible treats blockchain anchoring as an optional Phase 2 feature for enterprise customers who need sovereign verifiability. The anchor receipt is stored alongside the provenance record as a supplemental proof — not as the foundation of the trust model.

Rebuilding the Index from the Ledger

One property of the two-store architecture worth highlighting: the verification index is fully reconstructible from the primary ledger.

If the index ever becomes inconsistent — due to a deployment error, a failed migration, or any other event — it can be rebuilt by reading the primary ledger and re-inserting into the index. The ledger is the ground truth; the index is a derived view.

-- Rebuild verification_index from provenance_records.
-- Assumes hitl_status lives on the ledger (populated at ingest time);
-- updated_at exists only on the mutable index, never on the ledger.
INSERT INTO verification_index (
  canonical_hash, asset_id, org_id, signer, tool_id,
  signed_at, hitl_status
)
SELECT
  canonical_hash,
  asset_id,
  org_id,
  creator_id as signer,
  tool_id,
  signed_at,
  hitl_status
FROM provenance_records
ON CONFLICT (canonical_hash) DO UPDATE SET
  hitl_status = EXCLUDED.hitl_status,
  updated_at = now();

This property is important for enterprise customers who need to demonstrate that their audit trail is complete and consistent. The ability to regenerate the verification index from the append-only ledger is the proof that no records have been silently dropped.
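The same derivation can be sketched in application code. This is an illustrative TypeScript version (field names are assumptions mirroring the SQL above), showing that the index is a pure function of the ledger rows:

```typescript
// Derive the verification index from the append-only ledger. The index
// can be thrown away and regenerated at any time, because every field
// in it comes from a ledger row. Field names are illustrative.
interface LedgerRow {
  canonicalHash: string;
  assetId: string;
  orgId: string;
  creatorId: string;
  toolId: string;
  signedAt: Date;
  hitlStatus: string;
}

interface IndexRow {
  canonicalHash: string;
  assetId: string;
  orgId: string;
  signer: string; // creator_id, renamed as in the SQL rebuild
  toolId: string;
  signedAt: Date;
  hitlStatus: string;
}

function rebuildIndex(ledger: LedgerRow[]): Map<string, IndexRow> {
  const index = new Map<string, IndexRow>();
  for (const r of ledger) {
    // Last write wins on a hash collision, mirroring ON CONFLICT ... DO UPDATE;
    // in practice the ledger's UNIQUE constraint means collisions never occur.
    index.set(r.canonicalHash, {
      canonicalHash: r.canonicalHash,
      assetId: r.assetId,
      orgId: r.orgId,
      signer: r.creatorId,
      toolId: r.toolId,
      signedAt: r.signedAt,
      hitlStatus: r.hitlStatus,
    });
  }
  return index;
}
```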
