5 min read · The PayGraph Team

The three-layer architecture of policy-controlled agent spending

A technical breakdown of the three layers every agent spending system needs: policy evaluation, approval routing, and immutable audit logs. With Python interfaces for each.

If you're designing a spending control system for AI agents from scratch, you'll likely converge on the same three layers most teams do. Here's the architecture, why each layer is separate, and a concrete Python interface for each.

What is policy-controlled spending architecture?

Policy-controlled spending architecture is the pattern of separating an agent's payment execution into three distinct layers: a policy layer that evaluates rules, an approval layer that routes risky actions to humans, and an audit layer that records every attempt immutably.

PayGraph is an open-source SDK for policy-controlled spending, approvals, and audit logs for AI agents. It implements this three-layer pattern so you don't rebuild it yourself. The layers matter even if you roll your own: in our experience, conflating them is behind most agent payment incidents.

Why split it into three layers?

Each layer answers a different question and fails in a different way.

  • Policy layer answers: Is this transaction allowed by the rules? It's deterministic, synchronous, and runs on every call.
  • Approval layer answers: Does a human need to confirm this? It's asynchronous, state-bearing, and must handle timeouts.
  • Audit layer answers: What happened, and can I prove it later? It's append-only, immutable, and read-heavy at compliance time.

When teams merge these concerns, three things break. Policy logic leaks into the audit schema, so changing a rule rewrites history. Approval state gets lost when the agent process restarts. Audit logs become mutable because they're filtered by the same code that enforces policy. A SOC 2 auditor will find all three in an afternoon.

Here's the flow:

agent.tool_call(make_payment, $250, vendor_x)
        │
        ▼
  ┌─────────────┐
  │ 1. Policy   │  → reject → audit(DENIED)
  └─────────────┘
        │ allow / needs_approval
        ▼
  ┌─────────────┐
  │ 2. Approval │  → timeout → audit(TIMEOUT)
  └─────────────┘
        │ approved
        ▼
  ┌─────────────┐
  │ 3. Audit    │  ← writes every state transition
  └─────────────┘
        │
        ▼
   payment rail (Stripe Issuing, x402, ...)
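The flow above can be sketched as a small orchestrator. This is a minimal illustration, not PayGraph's implementation: the `Pipeline` class and its `evaluate`, `ask_human`, and `charge` callables are hypothetical stand-ins for the three layers and the payment rail, and the approval step is synchronous here only to keep the example short.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Optional


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REQUIRES_APPROVAL = "requires_approval"


@dataclass(frozen=True)
class Transaction:
    amount_usd: float
    vendor: str
    agent_id: str


@dataclass
class Pipeline:
    # Hypothetical stand-ins: each callable represents one layer or the rail.
    evaluate: Callable[[Transaction], Decision]          # layer 1: pure policy check
    ask_human: Callable[[Transaction], Optional[bool]]   # layer 2: None means timeout
    charge: Callable[[Transaction], None]                # payment rail
    audit: list = field(default_factory=list)            # layer 3: append-only here

    def run(self, tx: Transaction) -> bool:
        """Return True only if the payment reached the rail."""
        self.audit.append(("PROPOSED", tx))
        decision = self.evaluate(tx)
        if decision is Decision.DENY:
            self.audit.append(("DENIED", tx))
            return False
        if decision is Decision.REQUIRES_APPROVAL:
            answer = self.ask_human(tx)
            if answer is None:
                self.audit.append(("TIMEOUT", tx))
                return False
            if not answer:
                self.audit.append(("REJECTED", tx))
                return False
            self.audit.append(("APPROVED", tx))
        self.charge(tx)
        self.audit.append(("EXECUTED", tx))
        return True


pipeline = Pipeline(
    evaluate=lambda tx: Decision.REQUIRES_APPROVAL if tx.amount_usd > 100 else Decision.ALLOW,
    ask_human=lambda tx: None,  # simulate an approver who never answers
    charge=lambda tx: None,
)
ok = pipeline.run(Transaction(250.0, "vendor_x", "ops-agent"))
events = [name for name, _ in pipeline.audit]
```

Every branch writes to the audit list before returning, so a denied or timed-out attempt leaves the same kind of trace as a successful one.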

Layer 1: the policy engine

The policy layer is a pure function. Input: a proposed transaction plus context. Output: one of ALLOW, DENY, or REQUIRES_APPROVAL. No side effects. No network calls. No LLM.

That purity is the point. A pure evaluator is testable, replayable, and auditable. You can run the full policy suite against yesterday's transaction log to see what a rule change would have blocked.

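from dataclasses import dataclass
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REQUIRES_APPROVAL = "requires_approval"

@dataclass
class Transaction:
    amount_usd: float
    vendor: str
    category: str
    agent_id: str

class PolicyEngine:
    def __init__(self, hard_cap: float, approval_threshold: float,
                 vendor_allowlist: set[str]) -> None:
        # Limits are injected at construction so evaluate() stays a pure function.
        self.hard_cap = hard_cap
        self.approval_threshold = approval_threshold
        self.vendor_allowlist = vendor_allowlist

    def evaluate(self, tx: Transaction) -> tuple[Decision, str]:
        # Hard denials come first; the approval threshold only applies
        # to transactions that pass them.
        if tx.amount_usd > self.hard_cap:
            return Decision.DENY, "exceeds hard cap"
        if tx.vendor not in self.vendor_allowlist:
            return Decision.DENY, "vendor not allowlisted"
        if tx.amount_usd > self.approval_threshold:
            return Decision.REQUIRES_APPROVAL, "above approval threshold"
        return Decision.ALLOW, "within policy"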
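Because the evaluator is pure, the replay idea mentioned earlier takes only a few lines. The sketch below is illustrative: `Policy`, `diff_policies`, and the sample log and thresholds are assumptions made up for this example, not part of any published API.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REQUIRES_APPROVAL = "requires_approval"


@dataclass(frozen=True)
class Transaction:
    amount_usd: float
    vendor: str


@dataclass(frozen=True)
class Policy:
    hard_cap: float
    approval_threshold: float
    vendor_allowlist: frozenset


def evaluate(policy: Policy, tx: Transaction) -> Decision:
    """Pure function: the same policy and transaction always give the same decision."""
    if tx.amount_usd > policy.hard_cap or tx.vendor not in policy.vendor_allowlist:
        return Decision.DENY
    if tx.amount_usd > policy.approval_threshold:
        return Decision.REQUIRES_APPROVAL
    return Decision.ALLOW


def diff_policies(old: Policy, new: Policy, log: list) -> list:
    """Return (tx, old_decision, new_decision) for each transaction whose outcome changes."""
    changed = []
    for tx in log:
        before, after = evaluate(old, tx), evaluate(new, tx)
        if before is not after:
            changed.append((tx, before, after))
    return changed


# Hypothetical sample log and thresholds, for illustration only.
log = [Transaction(40.0, "aws"), Transaction(180.0, "aws"), Transaction(900.0, "aws")]
old = Policy(hard_cap=1000.0, approval_threshold=200.0, vendor_allowlist=frozenset({"aws"}))
new = Policy(hard_cap=500.0, approval_threshold=100.0, vendor_allowlist=frozenset({"aws"}))
changes = diff_policies(old, new, log)
```

Here the tighter policy would have sent the $180 transaction to approval and denied the $900 one, while the $40 transaction is unaffected. None of this would be safe to run against real history if `evaluate` made network calls or wrote state.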

A good policy engine supports per-agent policies (the research agent has a $50 cap, the ops agent has $5,000), category rules (ads allowed, crypto denied), velocity checks (daily and weekly caps), and time windows (no transactions between 2 AM and 6 AM without approval). For background on why per-agent scoping matters, see why AI agents need policy-controlled spending.
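One way to support those rules without giving up purity is to pass the stateful parts in as context. The sketch below assumes the caller pre-computes today's spend and the current time; `AgentPolicy`, `Context`, the agent names, and the daily caps are hypothetical, while the $50 and $5,000 caps and the 2 AM to 6 AM window come from the examples above.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REQUIRES_APPROVAL = "requires_approval"


@dataclass(frozen=True)
class AgentPolicy:
    per_tx_cap: float
    daily_cap: float               # assumed value, not from the article
    denied_categories: frozenset
    quiet_hours: tuple             # (start_hour, end_hour); approval required inside


@dataclass(frozen=True)
class Context:
    now: datetime
    spent_today_usd: float         # computed by the caller so evaluate() stays pure


def evaluate(policy: AgentPolicy, amount_usd: float, category: str, ctx: Context) -> Decision:
    if category in policy.denied_categories:
        return Decision.DENY
    if amount_usd > policy.per_tx_cap:
        return Decision.DENY
    # Velocity check: would this transaction push the agent over its daily cap?
    if ctx.spent_today_usd + amount_usd > policy.daily_cap:
        return Decision.DENY
    start, end = policy.quiet_hours
    if start <= ctx.now.hour < end:
        return Decision.REQUIRES_APPROVAL
    return Decision.ALLOW


# Per-agent scoping: look the policy up by agent id before evaluating.
POLICIES = {
    "research-agent": AgentPolicy(50.0, 200.0, frozenset({"crypto"}), (2, 6)),
    "ops-agent": AgentPolicy(5000.0, 20000.0, frozenset({"crypto"}), (2, 6)),
}

noon = Context(now=datetime(2025, 1, 15, 12, 0), spent_today_usd=0.0)
night = Context(now=datetime(2025, 1, 15, 3, 0), spent_today_usd=0.0)
research_result = evaluate(POLICIES["research-agent"], 75.0, "ads", noon)
ops_result = evaluate(POLICIES["ops-agent"], 75.0, "ads", noon)
night_result = evaluate(POLICIES["ops-agent"], 75.0, "ads", night)
```

The same $75 ad purchase is denied for the research agent, allowed for the ops agent at noon, and routed to approval at 3 AM.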

Layer 2: the approval router

When the policy engine returns REQUIRES_APPROVAL, the transaction enters the approval layer. This layer is where most custom implementations fall over, because it's the stateful one.

The approval router has four jobs:

  1. Persist the pending transaction durably — the agent process must be able to crash and restart without losing the approval state.
  2. Notify the approver through the right channel (Slack, webhook, email, custom UI).
  3. Accept the approver's response and resolve it to an APPROVED or REJECTED decision.
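  4. Handle timeouts. An unanswered approval is a rejection, not a hang.

class ApprovalRouter:
    def __init__(self, store, notifier) -> None:
        # store: durable pending-approval storage; notifier: Slack, webhook, email, ...
        self.store = store
        self.notifier = notifier

    async def request(self, tx: Transaction, reason: str) -> Decision:
        # Persist first, so a crash after notifying cannot lose the approval.
        approval_id = await self.store.create_pending(tx, reason)
        await self.notifier.send(
            channel="slack",
            approval_id=approval_id,
            tx=tx,
            reason=reason,
        )
        try:
            # The store is expected to resolve an approval to Decision.ALLOW
            # and a rejection to Decision.DENY.
            return await self.store.wait(
                approval_id, timeout_seconds=300,
            )
        except TimeoutError:
            # An unanswered approval is recorded as a timeout and treated as a denial.
            await self.store.mark_timeout(approval_id)
            return Decision.DENY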

Note the await. Approvals are asynchronous by nature — a human is in the loop. If your agent framework is synchronous, you need either a pause-and-resume pattern (LangGraph supports this with interrupt) or a background worker architecture. Blocking a thread for five minutes waiting on Slack is not the move.
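To make the timeout behavior concrete, here is a minimal in-memory sketch built on `asyncio.wait_for`. `InMemoryApprovalStore` is a hypothetical stand-in: a real store would persist pending approvals to a database so they survive a restart, which this example deliberately does not do.

```python
import asyncio


class InMemoryApprovalStore:
    """Hypothetical stand-in for a durable approval store."""

    def __init__(self) -> None:
        self._pending = {}   # approval_id -> asyncio.Future
        self.status = {}     # approval_id -> "PENDING" | "APPROVED" | "REJECTED" | "TIMEOUT"

    def create_pending(self, approval_id: str) -> None:
        self._pending[approval_id] = asyncio.get_running_loop().create_future()
        self.status[approval_id] = "PENDING"

    def resolve(self, approval_id: str, approved: bool) -> None:
        """Called by the Slack or webhook handler when the human answers."""
        future = self._pending[approval_id]
        if not future.done():  # ignore answers that arrive after a timeout
            future.set_result(approved)

    async def wait(self, approval_id: str, timeout_seconds: float) -> bool:
        """Return the approver's answer; treat silence as a rejection."""
        try:
            approved = await asyncio.wait_for(self._pending[approval_id], timeout_seconds)
        except asyncio.TimeoutError:
            self.status[approval_id] = "TIMEOUT"
            return False
        self.status[approval_id] = "APPROVED" if approved else "REJECTED"
        return approved


async def demo():
    store = InMemoryApprovalStore()
    store.create_pending("a1")
    store.create_pending("a2")
    # Simulate a human approving "a1" after 10 ms; nobody answers "a2".
    asyncio.get_running_loop().call_later(0.01, store.resolve, "a1", True)
    answered = await store.wait("a1", timeout_seconds=1.0)
    ignored = await store.wait("a2", timeout_seconds=0.05)
    return answered, ignored, dict(store.status)


answered, ignored, statuses = asyncio.run(demo())
```

The waiting coroutine yields to the event loop instead of blocking a thread, and the unanswered request resolves to a rejection once its timeout elapses.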

Layer 3: the immutable audit log

Every transition across layers 1 and 2, plus the outcome reported by the payment rail, writes to the audit log. Every one. Proposed, evaluated, approved, rejected, timed out, executed, failed: all of it.

The audit log has one hard rule: append-only. No updates, no deletes, ever. If you get this wrong, compliance breaks and incident forensics becomes guesswork.

Property                      PayGraph audit   Application logs   Payment rail history
Pre-flight decisions          Recorded         Usually not        No
Approval state transitions    Recorded         Scattered          No
Immutability guarantee        Append-only      Mutable            Rail-dependent
Includes denied attempts      Yes              Sometimes          No
Schema stability              Versioned        Drifts             External

A minimal interface:

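from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class AuditEvent:  # illustrative fields; the real schema is not shown here
    agent_id: str
    event_type: str
    occurred_at: datetime

class AuditLog:
    def record(self, event: AuditEvent) -> str:
        # append to durable store, never mutate; returns content-addressed event id
        ...

    def query(self, agent_id: str, since: datetime) -> list[AuditEvent]:
        ...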

The query method is what turns the audit log from a compliance checkbox into an engineering asset. When an agent behaves unexpectedly, you replay its transaction history against the current policy and see exactly where divergence began.
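A minimal in-memory sketch of that interface follows. It is an assumption-laden illustration rather than PayGraph's storage layer: the `AuditEvent` fields are invented for the example, and chaining each id to the previous one is an extra tamper-evidence technique added here, not something the interface above requires.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    # Illustrative fields only.
    agent_id: str
    event_type: str        # e.g. "PROPOSED", "DENIED", "APPROVED", "EXECUTED"
    amount_usd: float
    occurred_at: datetime


def _event_id(event: AuditEvent, prev_id: str) -> str:
    """Hash the event content together with the previous id (a simple hash chain)."""
    body = asdict(event)
    body["occurred_at"] = event.occurred_at.isoformat()
    body["prev"] = prev_id
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class InMemoryAuditLog:
    def __init__(self) -> None:
        self._entries = []  # list of (event_id, event); only ever appended to

    def record(self, event: AuditEvent) -> str:
        prev_id = self._entries[-1][0] if self._entries else ""
        event_id = _event_id(event, prev_id)
        self._entries.append((event_id, event))
        return event_id

    def query(self, agent_id: str, since: datetime) -> list:
        return [e for _, e in self._entries
                if e.agent_id == agent_id and e.occurred_at >= since]

    def verify(self) -> bool:
        """Recompute every id in order; any edited or removed entry breaks the chain."""
        prev_id = ""
        for event_id, event in self._entries:
            if _event_id(event, prev_id) != event_id:
                return False
            prev_id = event_id
        return True


log = InMemoryAuditLog()
t0 = datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc)
first = log.record(AuditEvent("ops-agent", "PROPOSED", 250.0, t0))
second = log.record(AuditEvent("ops-agent", "DENIED", 250.0, t0))
history = log.query("ops-agent", since=t0)
```

Because each id depends on the event's content and on the id before it, `verify()` gives a cheap way to check that nothing was rewritten between recording and audit time.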

Where to start

  • GitHub: github.com/paygraph-ai/paygraph — MIT licensed reference implementation of all three layers, with LangGraph bindings.
  • Docs: docs.paygraph.dev — policy reference, approval webhook contracts, and the audit log schema with versioning rules.
  • Discord: discord.gg/PPVZWSMdEm — architecture review channel for teams designing their own spending control layer.

Whether you adopt PayGraph or build your own, keep the three layers separate. Every production incident we've seen traces back to a team that merged two of them.