Hash-chained audit — Modir Engineering

Audit logs are easy to write and hard to trust. Here is the full design we ended up with after three iterations: per-tenant SHA-256 hash chains, append-only database triggers, hourly Temporal verification, and content-addressed evidence with object-lock.

Every regulated platform has an audit log. Most of them are append-only by convention rather than construction. The team writes inserts into a table, the team agrees not to delete from it, the audit team trusts the team. The first regulator who asks "how do you know nothing has been changed?" usually does not get a satisfying answer.

The fix that has worked for us is not novel — Bitcoin made hash chains mainstream — but the integration into a regulated SaaS environment has enough subtle decisions that it is worth writing up. This post is the full design of Modir's audit chain: what it does, what it doesn't do, and why each of the choices is the way it is.

What we mean by tamper-evident

"Tamper-proof" is a marketing word; "tamper-evident" is the engineering goal. The chain cannot prevent a sufficiently privileged attacker from modifying the database. What it can do is guarantee that any modification is detectable, both immediately (by the next hourly verification run) and after the fact (by anyone with a copy of the chain head and a verifier script). The combination of "append-only at the database" plus "linked-hash verifiability" plus "off-site chain head" plus "object-lock evidence" gets us to evidentiary quality that satisfies regulators and survives forensic audits.

Construction

Every regulated action emits exactly one AuditEvent with the following fields: id (ULID), tenant_id, occurred_at (timestamptz UTC), actor (typed enum: user / system / workflow / AI), action (dotted name), resource_type + resource_id, metadata (JSONB), prev_hash, content_hash, optional vault_signature.

contentHash  = SHA256(canonicalJson({occurredAt, actor, action, resource, metadata}))
chainInput   = prevHash + ":" + contentHash
recordHash   = SHA256(chainInput)
genesisHash  = SHA256("wealthos-genesis:" + tenantId)

Canonicalization follows a simplified RFC 8785: keys sorted alphabetically, no insignificant whitespace, integers as integers, fractional zeros stripped, UTF-8 encoding. We initially considered CBOR for canonicalization speed, but the operational cost of training every reviewer to read a binary format outweighed the marginal performance benefit. JSON is the universally legible format. The verifier script is shorter and the regulators don't ask "what's a CBOR?"
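The hashing rules above can be sketched in a few lines of Python. This is a minimal illustration, not Modir's implementation: the function names are hypothetical, hex-encoded digests are an assumption (the post does not say how hashes are encoded), and key ordering uses Python's default string sort rather than the exact ordering RFC 8785 specifies.

```python
import hashlib
import json


def _normalize(value):
    """Recursively turn integral floats (2.0) into ints (2) to strip fractional zeros."""
    if isinstance(value, float) and value.is_integer():
        return int(value)
    if isinstance(value, dict):
        return {k: _normalize(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [_normalize(v) for v in value]
    return value


def canonical_json(value) -> bytes:
    """Sorted keys, no insignificant whitespace, UTF-8 output."""
    text = json.dumps(
        _normalize(value),
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
        allow_nan=False,  # NaN and Infinity have no canonical JSON form
    )
    return text.encode("utf-8")


def sha256_hex(data: bytes) -> str:
    # Assumption: digests are stored as lowercase hex strings.
    return hashlib.sha256(data).hexdigest()


def genesis_hash(tenant_id: str) -> str:
    return sha256_hex(("wealthos-genesis:" + tenant_id).encode("utf-8"))


def content_hash(event: dict) -> str:
    """Hash only the fields named in the construction; other fields are ignored."""
    fields = ("occurredAt", "actor", "action", "resource", "metadata")
    return sha256_hex(canonical_json({k: event[k] for k in fields}))


def record_hash(prev_hash: str, content: str) -> str:
    return sha256_hex((prev_hash + ":" + content).encode("utf-8"))
```

Because the input is canonicalized first, two events that differ only in key order or in a trailing `.0` produce the same contentHash.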

Append-only at the database

The chain construction is necessary but not sufficient. If a privileged user can UPDATE a row, the chain becomes invalid, but nobody notices unless something is checking. We push the constraint into the database with BEFORE UPDATE and BEFORE DELETE triggers that raise the exception 'audit_events is append-only':

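-- Shared trigger function: reject any attempt to modify or remove an audit row.
CREATE OR REPLACE FUNCTION raise_audit_immutable() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
  RAISE EXCEPTION 'audit_events is append-only';
END;
$$;

-- Row-level guards for UPDATE and DELETE.
-- Note: these do not fire on TRUNCATE, which needs a separate statement-level trigger or a revoked privilege.
CREATE TRIGGER audit_no_update BEFORE UPDATE ON audit_events
  FOR EACH ROW EXECUTE FUNCTION raise_audit_immutable();

CREATE TRIGGER audit_no_delete BEFORE DELETE ON audit_events
  FOR EACH ROW EXECUTE FUNCTION raise_audit_immutable();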

A superuser can drop the trigger, of course. But dropping a trigger is a catalog change that is itself recorded in the WAL, and comparing a Postgres schema dump against the expected schema detects the drift. The defense is not perfect; it is layered.

Partitioning

Audit events accumulate quickly. We partition audit_events by month and pre-create partitions for the current month plus the next twenty-four. The bootstrap script creates them; a monthly maintenance job rolls forward. Partitions allow detaching old data for cold storage without invalidating the chain — the prev_hash links remain valid; only the storage tier changes. We have not yet had to detach in production, but the operational drill is in our runbooks.
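As a rough sketch of the pre-creation step, the following generates the DDL for the current month plus the next twenty-four. The partition naming scheme (`audit_events_YYYY_MM`) and the choice of a date-range partition key are assumptions for illustration; the post does not describe the actual bootstrap script.

```python
from datetime import date


def month_starts(first: date, count: int) -> list:
    """Return `count` consecutive first-of-month dates, starting at first's month."""
    out = []
    year, month = first.year, first.month
    for _ in range(count):
        out.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return out


def partition_ddl(today: date, months_ahead: int = 24) -> list:
    """Build one CREATE TABLE ... PARTITION OF statement per month."""
    # One extra boundary is needed to close the final month's range.
    starts = month_starts(today, months_ahead + 2)
    statements = []
    for lower, upper in zip(starts, starts[1:]):
        name = f"audit_events_{lower:%Y_%m}"  # hypothetical naming scheme
        statements.append(
            f"CREATE TABLE IF NOT EXISTS {name} PARTITION OF audit_events "
            f"FOR VALUES FROM ('{lower.isoformat()}') TO ('{upper.isoformat()}');"
        )
    return statements
```

Because the statements use IF NOT EXISTS, a monthly maintenance job could rerun the same function to roll the window forward without failing on partitions that already exist.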

Hourly Temporal verification

The chain on disk is not enough. We need to know it is still consistent. auditVerificationWorkflow runs on Temporal cron at the top of every hour. It iterates every active tenant; reads their chain in batches of 10,000; recomputes contentHash and recordHash for each row; compares against the stored values. On mismatch, it opens a ComplianceCase with severity breach — the same severity tier as discovered fraud — and pages the on-call engineer.

The CLI does the same thing on demand: pnpm audit:verify [--tenant=<uuid>]. Exit code 1 on mismatch; useful for running outside Temporal during incident response.
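The verification pass described above might look roughly like this in Python. It is a sketch under stated assumptions, not the workflow's actual code: rows are plain dicts, each row's prev_hash is assumed to hold the recordHash of the row before it (with the genesis hash for the first row), canonicalization is reduced to sorted keys and compact separators, and `verify_chain` and `append_event` are hypothetical names.

```python
import hashlib
import json
from typing import Iterable, List, Optional

CONTENT_FIELDS = ("occurredAt", "actor", "action", "resource", "metadata")


def _sha(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def content_hash(row: dict) -> str:
    body = {k: row[k] for k in CONTENT_FIELDS}
    return _sha(json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=False))


def verify_chain(tenant_id: str, batches: Iterable[List[dict]]) -> Optional[int]:
    """Return the index of the first inconsistent row, or None if the chain is intact."""
    expected_prev = _sha("wealthos-genesis:" + tenant_id)
    index = 0
    for batch in batches:  # e.g. pages of 10,000 rows read in chain order
        for row in batch:
            recomputed = content_hash(row)
            if row["prev_hash"] != expected_prev or row["content_hash"] != recomputed:
                return index
            # recordHash of this row becomes the expected prev_hash of the next.
            expected_prev = _sha(row["prev_hash"] + ":" + recomputed)
            index += 1
    return None


def append_event(chain: List[dict], tenant_id: str, event: dict) -> None:
    """Demo helper: link a new event onto the end of an in-memory chain."""
    if chain:
        last = chain[-1]
        prev = _sha(last["prev_hash"] + ":" + last["content_hash"])
    else:
        prev = _sha("wealthos-genesis:" + tenant_id)
    row = dict(event, prev_hash=prev)
    row["content_hash"] = content_hash(row)
    chain.append(row)
```

A command-line wrapper in the spirit of the CLI above would exit with status 1 whenever `verify_chain` returns an index instead of None. Note that an edit which also rewrites the row's own content_hash is still caught, one row later, when the next prev_hash no longer matches.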

Off-site chain head

Even if the chain on disk is consistent, an attacker who controls the entire database can roll back the chain to a previous state and recompute everything from there. The defense is to publish the chain head off-site, beyond the attacker's reach. Modir signs the chain head every hour with Vault transit (so the signing key lives outside the database, behind its own access controls) and writes the head + signature to an S3 bucket in a separate region with object-lock.

If an attacker rolls back the chain, the off-site head no longer matches. The match check is part of the verifier. We are betting that compromising the database, the off-site bucket, and the Vault namespace together costs more than an attacker is willing to spend.
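To make the rollback check concrete, here is a small sketch of comparing a published head against the local chain. Everything in it is illustrative: an HMAC from the standard library stands in for the Vault transit signature, the published record layout (tenant_id, height, head_hash, signature) is an assumption, and the function names are hypothetical.

```python
import hashlib
import hmac


def sign_head(key: bytes, tenant_id: str, height: int, head_hash: str) -> str:
    """Stand-in for the Vault transit signature: HMAC-SHA256 over the head fields."""
    message = f"{tenant_id}:{height}:{head_hash}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def check_against_published(key: bytes, published: dict, local_record_hashes: list) -> str:
    """Compare the off-site head with the local chain.

    `local_record_hashes[i]` is the recordHash of row i; `height` counts rows (>= 1).
    """
    height = published["height"]
    if height < 1:
        raise ValueError("published height must be at least 1")
    expected = sign_head(key, published["tenant_id"], height, published["head_hash"])
    if not hmac.compare_digest(expected, published["signature"]):
        return "bad-signature"
    if len(local_record_hashes) < height:
        return "rolled-back"  # local chain is shorter than what was published
    if local_record_hashes[height - 1] != published["head_hash"]:
        return "diverged"  # history was rewritten at or before the published head
    return "ok"
```

The local chain is allowed to be longer than the published head, since events keep arriving between hourly publications; it is never allowed to be shorter or to disagree at the published height.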

Content-addressed evidence

Audit events frequently reference documents — generated PDFs, e-signature receipts, KYC reports. We do not store the document bytes inside the audit metadata; we store the SHA-256 of the document, and the document itself lives in a content-addressed evidence bucket: evidence/{tenantId}/{sha256}. The bucket has object-lock in compliance mode — once a file is written, it cannot be deleted or overwritten for the retention period.
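The key layout above is simple enough to show directly. In this sketch the function names are hypothetical and the digest is assumed to be lowercase hex; only the `evidence/{tenantId}/{sha256}` shape comes from the design itself.

```python
import hashlib


def evidence_key(tenant_id: str, document: bytes) -> str:
    """Derive the object key from the document bytes themselves."""
    digest = hashlib.sha256(document).hexdigest()
    return f"evidence/{tenant_id}/{digest}"


def matches_reference(document: bytes, referenced_sha256: str) -> bool:
    """Check a retrieved document against the hash stored in the audit metadata."""
    return hashlib.sha256(document).hexdigest() == referenced_sha256
```

Since the key is a function of the bytes, writing the same document twice targets the same key, and any change to the bytes produces a different key rather than an overwrite.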

This separation has a useful property. The audit chain can be exported and verified by a regulator without exporting customer documents. The chain proves that a document with hash X existed and was acted on; the document itself is a separate access decision.

The GDPR question

The hardest design question we faced is how to handle the right to erasure under GDPR (and similar provisions in KSA PDPL). The chain is append-only by construction; deleting a row breaks it.

The pattern we landed on is "redaction with content tombstones." When a subject's data must be erased, we replace the PII fields in the resource record (the actual entity, not the audit event) with content-addressed tombstone references. The audit events that wrote the original data still exist — but they reference the resource, not its bytes. The bytes themselves are removed from the resource record. The chain stays valid; the regulator's request is honored; the audit trail still proves that the action happened, even though the personal data is gone.
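A toy version of the redaction step is sketched below. The tombstone format is an assumption made purely for illustration: here a tombstone is a small record (field name, erasure request id, timestamp) referenced by the SHA-256 of its own canonical form, so no digest of the erased value is retained. The names `erase_fields`, `requestId`, and `erasedAt` are hypothetical.

```python
import hashlib
import json


def _sha(data) -> str:
    text = json.dumps(data, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def erase_fields(resource: dict, pii_fields: list, request_id: str, erased_at: str) -> dict:
    """Return a copy of the resource with PII values replaced by tombstone references."""
    redacted = dict(resource)
    for field in pii_fields:
        # The tombstone describes the erasure, not the erased value.
        tombstone = {"field": field, "requestId": request_id, "erasedAt": erased_at}
        redacted[field] = {"tombstone": "sha256:" + _sha(tombstone)}
    return redacted


# The audit event points at the resource by type and id, not at its bytes,
# so redacting the resource record leaves the event (and its hash) untouched.
customer = {"id": "cus_1", "name": "Jane Doe", "email": "jane@example.com", "tier": "gold"}
event = {
    "action": "customer.update",
    "resource": {"type": "customer", "id": "cus_1"},
    "metadata": {"changed": ["email"]},
}
event_hash_before = _sha(event)
redacted_customer = erase_fields(customer, ["name", "email"], "req_42", "2024-05-01T00:00:00Z")
event_hash_after = _sha(event)
```

The sketch only holds if audit metadata never carries the personal data itself, which is the property the paragraph above relies on.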

What we don't do

We don't anchor the chain to a public blockchain. Public blockchains add operational complexity without buying us anything our regulators care about. The chain head off-site is sufficient for tamper evidence; further "trustless" properties are interesting in some adversarial settings, but ours is not one of them.

We don't sign every individual event with Vault. The cost is too high for the benefit; we sign the chain head hourly. Individual events are protected by being part of a chain whose head is signed.

We don't use Merkle trees instead of a linear chain. Merkle trees are useful if you need to prove inclusion of a single event without revealing the whole chain. Regulators want the whole chain. Linear is simpler and the verifier script is shorter.

Operational notes

Chain integrity is now the platform's first invariant. If auditVerificationWorkflow opens a breach case, every other change freezes until the chain is reconciled. We have run the drill twice in the last year, both times with planned chaos engineering: shut down the worker, manually corrupt a row in the staging chain, watch the alert fire, run the runbook. Both drills resolved cleanly; both were excellent training.

The verifier is also exposed as a Python script with zero dependencies, included in regulator-ready signed exports. Regulators can verify the chain themselves, on their own infrastructure, without trusting our tooling. So far, only one regulator has actually run it. That regulator was visibly delighted.

Why this matters

For most of the platforms we have replaced, the question "how do you know nothing has been changed?" was met with a tour of access controls and a quote from a SOC 2 report. That is not the same answer. Access controls describe who is allowed to make changes; the audit chain describes what was actually changed. The regulator's job is to evaluate the second question, not the first.

The day this design pays off is the day a regulator asks for a quarter's worth of audit evidence and gets it as a signed export verifiable on their own laptop. We have had that day. It is the day the platform team's mood changes.

Engineering posts represent the views of the authors.

How we built a tamper-evident audit log for regulated wealth management.
