Cognitive Graphs: A General Architecture for Replayable Reasoning


A Cognitive Graph is an enhanced chat log: one that doesn’t just record what was said, but structurally preserves how thoughts, decisions, alternatives, and artifacts evolve over time, making the whole history replayable and auditable.

TL;DR: A Cognitive Graph turns AI-assisted work from lossy chat logs into event-sourced, replayable reasoning state. Instead of preserving only the final output, it preserves the decisions, alternatives, rationales, mutations, hashes, and snapshots that explain how the output came to exist.

Most AI tools remember outputs.

They do not remember how those outputs came to exist.

A user asks a question, receives an answer, edits it, compares it with alternatives, rejects parts, accepts others, and moves on. The final artifact survives, but the reasoning path usually disappears.

That loss matters.

In serious work, the path is often as valuable as the artifact. A code patch matters, but so do the tests, rejected approaches, tradeoffs, review comments, and final decision. A written paragraph matters, but so do the variants, constraints, editorial judgments, and reasons it was chosen. A research claim matters, but so do the sources, objections, confidence levels, and competing interpretations behind it.

A Cognitive Graph is a way to preserve that process.

It is a graph-based, event-sourced structure for storing how thoughts evolve into artifacts.

The key idea is simple:

The final artifact matters.
The reasoning path that produced it often matters more.

This post describes the architecture as a general pattern. I will use Writer, my AI-assisted writing system, as the implementation example later, but the idea is not limited to writing.

A Cognitive Graph can be used for:

  • code review
  • research synthesis
  • legal reasoning
  • policy analysis
  • agent workflows
  • human-AI collaboration
  • long-running design decisions

The common structure is similar across domains:

    graph LR
    intent[💡 Intent]
    alternatives[🔀 Alternatives]
    critique[🔍 Critique]
    revision[✏️ Revision]
    decision[✅ Decision]
    rationale[📖 Rationale]
    artifact[🏆 Artifact]

    intent --> alternatives
    alternatives --> critique
    critique --> revision
    revision --> decision
    decision --> rationale
    rationale --> artifact

    rationale -.->|🔄 revisit| intent

    classDef start fill:#e1f5fe,stroke:#01579b,stroke-width:2px,color:#000
    classDef process fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#000
    classDef decided fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px,color:#000
    classDef final fill:#f3e5f5,stroke:#4a148c,stroke-width:2px,color:#000

    class intent start
    class alternatives,critique,revision,rationale process
    class decision decided
    class artifact final
  

The goal is to make that structure durable, inspectable, and replayable.


The Problem: AI Workflows Lose Their Most Valuable Data

Most AI-assisted workflows are lossy.

A user asks for help. The model produces something. The user keeps part of it, rejects part of it, asks for a revision, compares it with another answer, accepts a final version, and moves on.

The final artifact survives.

The reasoning path usually disappears.

That means the system loses:

  • why one variant was preferred over another
  • which alternatives were explicitly rejected
  • which instruction led to the useful output
  • which branch failed
  • which decision was human-made
  • how the artifact evolved across multiple attempts
  • whether the same reasoning could be replayed later

This problem is not limited to writing.

In code, the final patch survives, but the failed approaches, benchmark results, review comments, and tradeoffs often disappear.

In research, the final summary survives, but the rejected interpretations, source conflicts, and confidence judgments often disappear.

In design, the final decision survives, but the alternatives and constraints that shaped it are scattered across chat logs, tickets, documents, and memory.

The issue is not that AI systems cannot generate useful outputs. They can.

The issue is that most AI systems do not preserve the evolution of the work.

A typical AI workflow looks like this:

answer = model.generate(prompt)

# User edits the answer.
# User asks another model.
# User rejects part of it.
# User accepts a final version.

save(final_answer)

That gives you the artifact.

It does not give you the reasoning history.

A Cognitive Graph changes the unit of storage. Instead of only saving the final output, it records the transitions that produced it:

graph.append_node(type="intent", content="Improve this paragraph")

draft = graph.append_node(type="draft", content=first_answer)

revision = graph.append_node(type="revision", content=second_answer)

graph.link(draft, revision, type="replaces")

decision = graph.append_node(
    type="decision",
    content="Accepted the revision because it was clearer and shorter."
)

graph.link(decision, revision, type="accepts")

This is still simplified, but the shift is important.

The system no longer only knows:

Here is the final answer.

It also knows:

Here was the intent.
Here was the first attempt.
Here was the revision.
Here is what replaced what.
Here is the decision.
Here is why the decision was made.

That is the difference between storing output and storing cognition.

The core question becomes:

Can the system preserve not just the artifact, but the path that produced it?

A Cognitive Graph is one answer to that question.

It turns the hidden evolution of AI-assisted work into structured, replayable state.


The Core Idea: Reasoning as a Graph

A Cognitive Graph treats reasoning as a graph, not a transcript.

A transcript is chronological.

A graph is causal.

A transcript can tell you what was said. A graph can tell you what changed, what caused the change, which alternatives existed, and which path became canonical.

Instead of storing this:

prompt
→ answer
→ discarded context

we store this:

intent
→ draft
→ variant
→ critique
→ revision
→ decision
→ rationale
→ artifact

Every step can become graph state.

At the simplest level, a Cognitive Graph needs only a few primitives:

from dataclasses import dataclass


@dataclass
class Node:
    id: str
    type: str
    content: str
    metadata: dict


@dataclass
class Edge:
    id: str
    source_id: str
    target_id: str
    type: str
    metadata: dict

A node is a unit of thought.

An edge is the relationship between thoughts.

For example:

intent = graph.add_node(
    type="intent",
    content="Make this explanation clearer for a technical reader.",
)

draft = graph.add_node(
    type="draft",
    content="The system stores conversation history.",
)

revision = graph.add_node(
    type="revision",
    content="The system preserves the evolution of decisions, alternatives, and artifacts.",
)

graph.add_edge(
    source_id=draft.id,
    target_id=revision.id,
    type="replaces",
)

graph.add_edge(
    source_id=intent.id,
    target_id=revision.id,
    type="guides",
)

Now the system does not merely know the final sentence.

It knows:

what the user wanted
what the first attempt said
what replaced it
which intent guided the replacement

That is already more useful than a transcript.

But a Cognitive Graph becomes much more powerful when we add branches, events, snapshots, and decisions.


What This Is Not

| Chat Log / Transcript | Cognitive Graph |
| --- | --- |
| Chronological | Causal |
| Stores outputs | Stores transitions |
| Hard to replay | Event-sourced and deterministic |
| Passive memory | Executable memory |
| Context evaporates | Lineage persists |

A Cognitive Graph is not just a vector database. A vector database can retrieve similar content, but it does not usually preserve the causal evolution of decisions.

A Cognitive Graph is not just a knowledge graph. A knowledge graph stores entities and relationships. A Cognitive Graph adds mutation events, lifecycle state, replay, snapshots, hashes, and decision provenance.

A Cognitive Graph is not a chain-of-thought transcript. It is an external, structured, replayable system for preserving reasoning state.


The Basic Model

A practical Cognitive Graph has seven core parts:

CognitiveGraph
  ├── Nodes
  ├── Edges
  ├── Branches
  ├── Events
  ├── Snapshots
  ├── Decisions
  └── Artifacts

Each one has a different job.

| Part | Purpose |
| --- | --- |
| Node | Stores a unit of thought, text, code, evidence, decision, or rationale |
| Edge | Stores a typed relationship between nodes |
| Branch | Stores an alternate reasoning path |
| Event | Stores an append-only mutation to the graph |
| Snapshot | Freezes a validated graph state |
| Decision | Records a selection among alternatives |
| Artifact | Stores an output that survived the process |

Example node types:

intent
claim
draft
variant
critique
revision
test_result
review_note
decision
constitutional_reasoning
artifact

Example edge types:

refines
replaces
supports
contradicts
accepts
rejects
justifies
depends_on
canonicalizes

This gives us a compact language for modeling reasoning.

A rewritten sentence is not just new text. It is a node that may replace another node.

A review comment is not just a note. It is a node that may critique a variant.

A decision is not just a label. It is a node that may accept one path and reject another.

A final artifact is not just a file. It is the canonical descendant of a reasoning process.


The Build Arc: From Storage to Runtime

A Cognitive Graph does not need to emerge fully formed.

It can be built in layers. Each layer turns passive records into something more executable:

| Layer | Focus | What Changes |
| --- | --- | --- |
| Graph substrate | Nodes, edges, branches | Reasoning becomes structured instead of linear |
| Validation | Types, references, lifecycle | The graph can reject invalid cognitive state |
| Hashing | Structure/content/state identity | Reasoning states become comparable |
| Snapshots | Frozen graph checkpoints | Cognitive state can be preserved and restored |
| Mutations | Explicit units of change | The system can describe what changed |
| Event sourcing | Append-only event stream | State can be reconstructed from history |
| Replay verification | Replay vs snapshot | The history becomes executable and auditable |
| Event-first execution | Events become authoritative | State is derived from recorded mutations |
| Decisions | Human/editor choices | Judgment becomes graph state |
| Why Graphs | Structured rationales | Reasons become replayable state |

Why a Graph Instead of a Chat Log?

Because serious work does not move in a straight line.

A conversation can fork.
One branch may explore the idea as prose. Another as code. Another may test assumptions. Another may produce a simpler explanation for readers. All of those branches may start from the same intent.

    graph LR
    intent[💡 Intent]
    prose_path[📝 Prose Path]
    code_path[💻 Code Path]
    test_path[🧪 Test Path]
    revision[✏️ Revision]
    patch[🔧 Patch]
    benchmark[📊 Benchmark]
    decision[✅ Decision]
    artifact[🏆 Artifact]

    intent --> prose_path
    intent --> code_path
    intent --> test_path

    prose_path --> revision
    code_path --> patch
    test_path --> benchmark

    revision --> decision
    patch --> decision
    benchmark --> decision

    decision --> artifact

    classDef start fill:#e1f5fe,stroke:#01579b,stroke-width:2px,color:#000
    classDef branch fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#000
    classDef action fill:#fff8e1,stroke:#f57f17,stroke-width:2px,color:#000
    classDef converge fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px,color:#000
    classDef result fill:#f3e5f5,stroke:#4a148c,stroke-width:2px,color:#000

    class intent start
    class prose_path,code_path,test_path branch
    class revision,patch,benchmark action
    class decision converge
    class artifact result
  

A chat log flattens this structure.
A Cognitive Graph preserves it.

This matters because rejected branches are not always useless. A failed implementation may explain why a safer design was chosen. A rejected paragraph may contain the sentence that later becomes the opening. A failed hypothesis may become evidence for a better one. A graph lets the system keep those paths without pretending they all belong to one linear conversation.

But the difference runs deeper than forking. It changes the very nature of the record.

| | Chat Log | Cognitive Graph |
| --- | --- | --- |
| Organisation | Chronological | Causal |
| Content | Outputs (what was said) | Transitions (what changed, why) |
| Reproducibility | Hard to replay | Event-sourced and replayable |
| Evolution captured | A flat sequence | Branches, decisions, alternatives, lineage |
| Memory type | Passive memory | Executable memory |

That table is not just a comparison; it is the definitional boundary. A chat log can tell you what happened. A Cognitive Graph tells you what changed, who decided, which path survived, and why. It turns collaboration history into something you can audit, replay, and build on.


Validation: Turning a Graph Into a Runtime

A graph by itself is only structure.

It can store nodes. It can store edges. It can store branches.

But storage is not enough.

If a Cognitive Graph is going to preserve reasoning, it needs rules. Otherwise it becomes another pile of loosely connected records.

For example:

Can a rejected branch still receive new nodes?
Can an edge point to a missing node?
Can a graph with invalid references be snapshotted?
Can two nodes have the same identity?
Can a replay continue if the graph is structurally broken?

If the answer to those questions is vague, the graph cannot be trusted.

So the next step is validation.

A Cognitive Graph needs to know when it is valid, when it is unsafe to mutate, and when it is safe to freeze, replay, or compare.

At minimum, validation should check:

valid node types
valid edge types
valid branch states
missing node references
duplicate IDs
invalid artifact references
cycle rules
lifecycle constraints

A very small version might look like this:

ALLOWED_NODE_TYPES = {
    "intent",
    "draft",
    "variant",
    "revision",
    "decision",
    "reasoning",
    "artifact",
}

ALLOWED_EDGE_TYPES = {
    "refines",
    "replaces",
    "supports",
    "rejects",
    "justifies",
    "accepted_into",
}


def validate_graph(nodes, edges):
    node_ids = {node.id for node in nodes}
    errors = []

    for node in nodes:
        if node.type not in ALLOWED_NODE_TYPES:
            errors.append(f"Invalid node type: {node.type}")

    for edge in edges:
        if edge.type not in ALLOWED_EDGE_TYPES:
            errors.append(f"Invalid edge type: {edge.type}")

        if edge.source_id not in node_ids:
            errors.append(f"Missing source node: {edge.source_id}")

        if edge.target_id not in node_ids:
            errors.append(f"Missing target node: {edge.target_id}")

    return errors

That is not complicated.

But it changes the nature of the system.

The graph is no longer just a place where reasoning is stored. It becomes a state object with integrity rules.

Now we can say:

This graph is valid.
This branch can be mutated.
This snapshot is safe to create.
This replay is structurally trustworthy.

That is the first step from memory toward runtime.


Branch Lifecycle

Validation also needs to apply to branches.

A branch is not just a folder for nodes. It is a reasoning path with lifecycle state.

A simple branch lifecycle might be:

active
merged
abandoned
canonical

Each state should mean something.

For example:

def can_append_to_branch(branch):
    return branch.status == "active"

That sounds almost too simple, but it prevents a subtle class of errors.

If a branch was abandoned, and the system later appends new reasoning to it accidentally, the history becomes ambiguous. Did the branch really fail? Was it reopened? Did a later process corrupt old state?

Lifecycle rules remove that ambiguity.

active     → can receive new nodes
merged     → preserved but no longer independently mutating
abandoned  → preserved as history, not extended
canonical  → accepted as part of the final path

This matters because rejected or abandoned work is still valuable.

A failed branch may explain why a decision was made. A rejected variant may later become evidence. An abandoned implementation may document a constraint.

The system should preserve those branches, but it should not accidentally keep mutating them as if they were still live.
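Those lifecycle rules can be enforced with a small transition guard. This is a minimal sketch, assuming a hypothetical Branch dataclass and a hand-written transition table; a real system would likely attach these rules to the validation layer:

```python
from dataclasses import dataclass

# Hypothetical transition table: which lifecycle moves are legal.
ALLOWED_TRANSITIONS = {
    "active": {"merged", "abandoned", "canonical"},
    "merged": set(),      # preserved, but frozen history
    "abandoned": set(),   # preserved, but frozen history
    "canonical": set(),   # the accepted final path
}


@dataclass
class Branch:
    id: str
    status: str = "active"


def transition(branch: Branch, new_status: str) -> None:
    # Refuse moves the lifecycle does not allow, so an abandoned
    # branch can never silently come back to life.
    if new_status not in ALLOWED_TRANSITIONS[branch.status]:
        raise ValueError(f"Illegal transition: {branch.status} -> {new_status}")
    branch.status = new_status
```

With this in place, appending to an abandoned branch is an error the runtime can surface, rather than a silent corruption of history.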


Traversal: Asking Questions of the Graph

Once a graph is valid, we can traverse it.

Traversal is what lets the system ask useful questions:

What led to this decision?
Which drafts did this artifact descend from?
What did this variant replace?
Which reasoning node justified this choice?
Which branches contributed to the final artifact?

A minimal traversal function might look like this:

def ancestors(node_id, edges):
    parents = {}
    for edge in edges:
        parents.setdefault(edge.target_id, []).append(edge.source_id)

    result = []
    stack = [node_id]

    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in result:
                result.append(parent)
                stack.append(parent)

    return result

With that, a final artifact is no longer just a blob of text or code.

It becomes the endpoint of a path.

intent
→ draft
→ critique
→ revision
→ decision
→ canonical artifact

That path is what makes the artifact explainable.
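To make the lineage query concrete, here is the ancestors helper applied to a toy path. The helper is restated so the example is self-contained, and the node IDs and edge types are illustrative:

```python
from collections import namedtuple

Edge = namedtuple("Edge", ["source_id", "target_id", "type"])


def ancestors(node_id, edges):
    # Same walk as the helper above: map each node to its parents,
    # then follow parent links transitively.
    parents = {}
    for edge in edges:
        parents.setdefault(edge.target_id, []).append(edge.source_id)

    result = []
    stack = [node_id]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in result:
                result.append(parent)
                stack.append(parent)
    return result


edges = [
    Edge("intent", "draft", "guides"),
    Edge("draft", "revision", "replaces"),
    Edge("revision", "decision", "accepts"),
    Edge("decision", "artifact", "canonicalizes"),
]

print(ancestors("artifact", edges))
# Lineage of the artifact, nearest ancestor first:
# ['decision', 'revision', 'draft', 'intent']
```

Asking for the ancestors of the artifact recovers the entire reasoning path; asking for the ancestors of the intent returns nothing, because the intent is where the path began.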

Without validation, snapshots can preserve broken state.

Without lifecycle rules, branches lose meaning.

Without traversal, lineage is just data you cannot query.

This is the point where the Cognitive Graph starts to become a runtime.

Not because it can generate anything.

Because it can enforce and inspect the structure of thought.


Deterministic Identity: Hashing Cognitive State

Once the graph can be validated, it needs identity.

Not just a database ID.

A database ID tells you where something is stored. It does not tell you what the thing is.

For a Cognitive Graph, that distinction matters.

Two graphs might live in different databases but represent the same reasoning. Two branches might have the same structure but different content. Two snapshots might contain the same artifact but different lifecycle state.

So the graph needs a way to say:

This reasoning state is identical to that reasoning state.

or:

This reasoning state has changed.

That is where deterministic hashing becomes useful.

A Cognitive Graph benefits from more than one identity:

| Hash | Meaning | Example Question |
| --- | --- | --- |
| structure | The topology of the graph | Did the reasoning path change shape? |
| content | The semantic payload | Did the text, code, claim, or artifact change? |
| state | The exact runtime state | Is this graph exactly the same checkpoint? |

This distinction is important.

A graph can have the same structure but different content:

intent → draft → revision → decision

That path may be identical across two runs, even if the actual draft text differs.

A graph can have the same content but different runtime state. For example, the same artifact may exist in two graphs with different graph IDs, branch statuses, or evaluation scores.

So instead of treating identity as one thing, we separate it:

structure hash = how it is connected
content hash   = what it says
state hash     = exact runtime identity

This leads to one of the most important distinctions in the system:

graph_id = storage identity
hash     = cognitive identity

A graph_id is useful for lookup.

A hash is useful for comparison.

If two systems produce the same content hash, they may have arrived at the same reasoning payload even if they created it in different places.

If a snapshot’s state hash changes, the graph has drifted.

If a replay produces the same content hash but a different state hash, the reasoning content may match while runtime identity differs.

That is exactly the kind of distinction a serious reasoning system needs.


A Minimal Hashing Example

A tiny version might look like this:

import hashlib
import json


def stable_json(value: dict) -> str:
    return json.dumps(
        value,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )


def sha256(value: dict) -> str:
    return hashlib.sha256(stable_json(value).encode("utf-8")).hexdigest()


def structure_projection(graph):
    return {
        "nodes": sorted(
            [{"id": n.id, "type": n.type} for n in graph.nodes],
            key=lambda n: n["id"],
        ),
        "edges": sorted(
            [
                {
                    "source": e.source_id,
                    "target": e.target_id,
                    "type": e.type,
                }
                for e in graph.edges
            ],
            key=lambda e: (e["source"], e["target"], e["type"]),
        ),
    }


structure_hash = sha256(structure_projection(graph))

The important part is not the specific code.

The important part is the rule:

Same graph state → same hash.
Different graph state → different hash.

To make that true, the hash input must be canonical:

sort keys
sort lists
remove timestamps
remove volatile metadata
preserve meaningful content

Without canonicalization, hashes become noisy. The system starts detecting accidental formatting differences instead of real cognitive differences.
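As a sketch, a content projection might keep only node types and payloads, dropping IDs and other volatile fields entirely. The exact field choices here are illustrative, not a fixed schema:

```python
import hashlib
import json


def stable_json(value) -> str:
    # Canonical serialization: sorted keys, no whitespace.
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def sha256(value) -> str:
    return hashlib.sha256(stable_json(value).encode("utf-8")).hexdigest()


def content_projection(nodes):
    # Keep only the semantic payload; drop IDs and timestamps so that
    # storage identity cannot leak into cognitive identity.
    return {
        "nodes": sorted(
            ({"type": n["type"], "content": n["content"]} for n in nodes),
            key=lambda n: (n["type"], n["content"]),
        )
    }


a = [{"id": "n1", "type": "draft", "content": "Hello"}]
b = [{"id": "n9", "type": "draft", "content": "Hello"}]  # same payload, new ID

assert sha256(content_projection(a)) == sha256(content_projection(b))
```

Two graphs with different storage IDs but identical payloads hash to the same content identity, which is exactly the deduplication and comparison behavior described above.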

Hashing turns the Cognitive Graph from a structure into something that can be verified.

Now the system can ask:

Did this branch actually change?
Does this replay match the snapshot?
Did this artifact drift?
Are these two reasoning paths equivalent?
Can I deduplicate this result?

This is the foundation for snapshots, replay verification, deduplication, branch comparison, cache keys, and audit trails.

Once a graph has deterministic identity, it becomes possible to freeze it and prove whether it changed later.


Snapshots: Freezing Cognitive State

Once a graph has deterministic identity, you can freeze it.

That is what a snapshot does.

A snapshot is not just a backup. A backup says:

Here is a copy of some data.

A Cognitive Graph snapshot says:

Here is a validated reasoning state,
with structure, content, and state hashes,
captured at this point in time.

That distinction matters.

A useful snapshot stores more than the graph payload. It should also store the hashes that prove what was captured:

@dataclass
class GraphSnapshot:
    snapshot_id: str
    graph_id: str

    structure_hash: str
    content_hash: str
    state_hash: str

    node_count: int
    edge_count: int
    branch_count: int
    artifact_count: int

    graph_state_json: dict

The graph_state_json field contains the frozen state needed for restoration or inspection. The hashes tell us whether that state still matches what was originally captured.

In other words:

payload = what we froze
hashes  = proof of what we froze

A snapshot should also have provenance.

The system should know why it was created:

before canonicalization
before merging branches
after human approval
before applying a patch
after completing a review session

Without provenance, snapshots become anonymous checkpoints. They may still be technically valid, but they are harder to interpret.

A simple provenance event might look like this:

@dataclass
class ProvenanceEvent:
    event_id: str
    graph_id: str
    event_type: str
    actor: str
    payload: dict


ProvenanceEvent(
    event_id="event_001",
    graph_id="graph_123",
    event_type="snapshot.created",
    actor="system",
    payload={
        "snapshot_id": "snap_001",
        "reason": "before_branch_merge",
        "state_hash": "sha256:..."
    },
)

Now the system can answer not only:

What was the graph state?

but also:

Why was this state preserved?

That is important for auditability.


Restoration: Proving the Snapshot Is Executable

A snapshot becomes much more powerful when it can be restored.

The basic loop is:

snapshot payload
→ restore graph
→ recompute hashes
→ compare with stored hashes

If the hashes match, the snapshot is not merely descriptive. It is executable.

A minimal restoration check might look like this:

def verify_restored_snapshot(snapshot):
    restored_graph = restore_graph(snapshot.graph_state_json)

    return {
        "structure_match": hash_structure(restored_graph) == snapshot.structure_hash,
        "content_match": hash_content(restored_graph) == snapshot.content_hash,
        "state_match": hash_state(restored_graph) == snapshot.state_hash,
    }

This lets the system detect corruption, drift, or incomplete restoration.

The important invariant is:

snapshot payload
→ restored graph
→ same hashes

If that invariant holds, the snapshot can be trusted.

There is one subtle point.

If the state hash includes the graph’s storage identity, then restoring into the original graph and restoring into a new graph are not the same operation.

Restoring into the same graph can produce full equality:

structure match = true
content match   = true
state match     = true

Restoring into a new graph may still preserve structure and content, but the exact runtime identity changes:

structure match = true
content match   = true
state match     = false

That is not a failure.

It is an honest distinction.

The new graph may contain the same reasoning, but it is not the same runtime state. That is why separating structure, content, and state hashes matters.

Snapshots turn reasoning into something you can checkpoint.

A human or agent can experiment freely, knowing that the graph can return to a known valid state.

The next step is to record not just the frozen states, but the mutations that move the graph from one state to another.


Mutation: The Unit of Cognitive Change

Before we can talk about event sourcing, we need to define what actually changes.

In a Cognitive Graph, the basic unit of change is a mutation.

A mutation is any operation that changes the graph’s cognitive state.

For example:

add a node
link two nodes
fork a branch
record a decision
canonicalize an artifact
attach a rationale
restore a snapshot

These are not just database writes.

They are cognitive transitions.

A node being added might mean a new idea entered the system. An edge being linked might mean two thoughts were related. A branch being forked might mean a new line of reasoning began. A decision being recorded might mean one option survived and another was rejected.

So the mutation is the point where reasoning changes shape.

A simple mutation function might look like this:

def append_node(graph, node_type, content):
    node = Node(
        id=new_id(),
        type=node_type,
        content=content,
        metadata={},
    )

    graph.nodes.append(node)

    return node

That works for a local graph.

But it does not answer the important questions:

Who made this change?
What state existed before it?
What state existed after it?
Can this change be replayed?
Can this change be audited?
Did this change happen directly, or through a recorded event?

Those questions matter because a Cognitive Graph is not just a container.

It is trying to preserve the evolution of reasoning.

So a mutation needs more structure.


From Mutation to Transition

A better model treats a mutation as a transition between two cognitive states:

previous graph state
→ mutation
→ next graph state

That transition should be inspectable.

At minimum, it should carry:

operation type
actor
payload
previous state identity
resulting state identity

In code, that starts to look like this:

@dataclass(frozen=True)
class GraphMutation:
    mutation_id: str
    graph_id: str
    operation: str
    actor: str
    payload: dict
    parent_state_hash: str | None
    resulting_state_hash: str | None

Now a mutation does not just say:

a node was added

It says:

this actor added this node
to this graph
from this previous state
producing this resulting state

That is the bridge from graph editing to cognitive auditability.
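One way to sketch that bridge is to wrap the raw mutation in a helper that hashes the state before and after the change. The function and field names here are illustrative:

```python
import hashlib
import json
import uuid


def state_hash(state: dict) -> str:
    blob = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(blob.encode("utf-8")).hexdigest()


def record_append_node(state: dict, actor: str, node: dict) -> dict:
    # Capture the state hash before and after the change, so the
    # mutation record pins down the exact transition it caused.
    parent = state_hash(state)
    state["nodes"].append(node)
    return {
        "mutation_id": f"mut_{uuid.uuid4().hex[:8]}",
        "operation": "node.appended",
        "actor": actor,
        "payload": node,
        "parent_state_hash": parent,
        "resulting_state_hash": state_hash(state),
    }
```

The mutation record now carries its own before-and-after identity, which is what makes it auditable and replay-checkable later.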

If graph state is changed directly, the system can see the final result but not necessarily the path.

For example:

graph.nodes.append(node)

After this, the graph has a new node.

But the system may not know:

why the node was added
which workflow added it
what state existed before
whether this addition can be replayed
whether it bypassed validation

Direct mutation is easy.

But direct mutation is also where reasoning history disappears.

That is why the mutation itself has to become explicit.

A Cognitive Graph should not only store the result of a change. It should store the change as a first-class object.

This is where mutation events enter.


Mutation Events: Recording Cognitive Transitions

A mutation event is the durable record of a cognitive transition.

mutation intent
→ mutation event
→ event applier
→ new graph state

Once mutation events exist, the graph can stop asking only:

What is true now?

and start asking:

What changed?
Who changed it?
Why did it change?
Can we replay the change?
Did the resulting state match the expected hash?

Different domains will have different mutations, but the core set is small.

For a general Cognitive Graph, the first useful mutation events are:

graph.created
node.appended
edge.linked
branch.forked
artifact.canonicalized
decision.recorded
reasoning.recorded
snapshot.restored

Each mutation has a payload.

For example, a node mutation might contain:

{
    "node_id": "node_123",
    "node_type": "revision",
    "content": "The revised paragraph goes here.",
    "branch_id": "branch_004",
}

An edge mutation might contain:

{
    "edge_id": "edge_456",
    "source_node_id": "node_123",
    "target_node_id": "node_789",
    "edge_type": "replaces",
}

A decision mutation might contain:

{
    "decision_id": "decision_001",
    "decision_type": "accepted",
    "target_node_id": "node_789",
    "rejected_node_ids": ["node_123", "node_456"],
    "rationale": "Clearer, shorter, and more aligned with the project voice.",
}

Each payload describes not just data, but intent.

It says what kind of cognitive transition occurred.

The rule is simple:

If a change matters to the reasoning process, it should be a mutation.

A variant being rejected matters. A branch being abandoned matters. A test result being attached matters. A rationale being recorded matters. A constraint being applied matters.

If these changes are not captured, the final artifact becomes harder to explain.

Snapshots preserve states.

Mutations explain how one state became another.

Events make those mutations durable and replayable.


Event-Sourced Cognition

Snapshots gave us checkpoints.

But checkpoints only tell you where the graph was.

They do not fully explain how the graph got there.

A snapshot can say:

Here is the reasoning state at this moment.

But it cannot fully answer:

What changed before this snapshot?
Which branch introduced the new idea?
Which decision selected this variant?
Which event caused the replay mismatch?
Which mutation moved this graph from one cognitive state to another?

To answer those questions, we need the thing that lives between snapshots:

events

A Cognitive Graph becomes much more powerful when every change to the graph is represented as an append-only event.

At first, the graph is state-first:

mutate graph state
→ optionally record what happened

That is useful, but it is not enough for replayable cognition.

The stronger model is event-first:

append event
→ apply event
→ derive graph state

In this model, the event is the authoritative fact. The graph state is the result of applying the event.

A simple event might look like this:

@dataclass(frozen=True)
class MutationEvent:
    event_id: str
    graph_id: str
    sequence_number: int
    event_type: str
    actor: str
    payload: dict
    parent_state_hash: str | None
    resulting_state_hash: str | None

For example:

MutationEvent(
    event_id="evt_042",
    graph_id="graph_123",
    sequence_number=42,
    event_type="node.appended",
    actor="editor",
    payload={
        "node_id": "node_revision_17",
        "node_type": "revision_text",
        "content": "The revised sentence goes here.",
        "branch_id": "branch_chapter_04",
    },
    parent_state_hash="sha256:before...",
    resulting_state_hash="sha256:after...",
)

That one event contains a cognitive transition:

before state
→ node.appended
→ after state

Now the graph can answer not just:

What is the current state?

but:

What sequence of changes produced this state?

That is the shift from memory to runtime.


Event Types as Cognitive Operations

A small event vocabulary can express a surprising amount of reasoning:

| Event | Meaning |
| --- | --- |
| node.appended | A new unit of thought entered the graph |
| edge.linked | Two thoughts were related |
| branch.forked | A new reasoning path began |
| artifact.canonicalized | Some output survived the process |
| decision.recorded | An actor selected, rejected, deferred, or promoted something |
| reasoning.recorded | The system preserved why a decision was made |

That gives the system a language of cognitive mutation.

Not just:

save this text

but:

append this thought
connect these ideas
fork this line of reasoning
promote this artifact
record this decision
explain this decision
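That vocabulary can also be made explicit in code. A hypothetical sketch, using a closed string-backed enum so that unknown event types are rejected before they ever reach an applier (the class and function names are illustrative, not part of the system described above):

```python
from enum import Enum

# A closed vocabulary of cognitive mutations (names mirror the table above).
class EventType(str, Enum):
    NODE_APPENDED = "node.appended"
    EDGE_LINKED = "edge.linked"
    BRANCH_FORKED = "branch.forked"
    ARTIFACT_CANONICALIZED = "artifact.canonicalized"
    DECISION_RECORDED = "decision.recorded"
    REASONING_RECORDED = "reasoning.recorded"

def validate(event_type: str) -> EventType:
    # Raises ValueError for anything outside the vocabulary,
    # which keeps replay predictable.
    return EventType(event_type)

print(validate("node.appended").value)  # node.appended
```

Keeping the vocabulary closed is a design choice: every replay implementation then knows, in advance, the full set of transitions it must support.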

Applying Events

An event by itself is only a record.

To make it executable, we need an applier.

The applier takes an event and mutates graph state deterministically.

class EventApplier:
    def apply(self, graph, event: MutationEvent):
        if event.event_type == "node.appended":
            graph.add_node(**event.payload)

        elif event.event_type == "edge.linked":
            graph.add_edge(**event.payload)

        elif event.event_type == "branch.forked":
            graph.add_branch(**event.payload)

        elif event.event_type == "artifact.canonicalized":
            graph.add_artifact(**event.payload)

        else:
            raise ValueError(f"Unsupported event: {event.event_type}")

The implementation can be more sophisticated, but the invariant is simple:

same initial state
+ same ordered event sequence
= same graph state

That invariant is what makes replay possible.

Replay means rebuilding the graph from the event log:

def replay(events):
    graph = CognitiveGraph()
    applier = EventApplier()

    # Apply events in deterministic sequence order.
    for event in sorted(events, key=lambda e: e.sequence_number):
        applier.apply(graph, event)

    return graph

If replay works, then the event log is not just history. It is executable memory.

event stream
→ replay
→ reconstructed graph

A reasoning process can now be inspected, tested, compared, and restored from first principles.


Replay vs Snapshot: The Trust Boundary

This is where snapshots and events meet.

A snapshot says:

Here is the graph state we froze.

The event stream says:

Here is how the graph reached that state.

If both are correct, they should agree.

    flowchart LR
    %% Event replay path
    A[📜 Event Stream] --> B[🔄 Replay Events]
    B --> C[🧠 Reconstructed Graph]

    %% Snapshot restore path
    D[📸 Snapshot Payload] --> E[📂 Restore Snapshot]
    E --> F[🧠 Restored Graph]

    %% Hash computation
    C --> G[🔢 Compute Hashes]
    F --> H[🔢 Compute Hashes]

    %% Comparison
    G --> I{⚖️ Hashes Match?}
    H --> I

    %% Outcomes
    I -->|✅ Yes| J[🎉 Replay Verified]
    I -->|❌ No| K[⚠️ Drift Detected]

    %% Styles
    classDef eventPath fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#000
    classDef snapshotPath fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:#000
    classDef compute fill:#fff3e0,stroke:#ef6c00,stroke-width:2px,color:#000
    classDef decision fill:#f3e5f5,stroke:#6a1b9a,stroke-width:2px,color:#000
    classDef positive fill:#c8e6c9,stroke:#1b5e20,stroke-width:2px,color:#000
    classDef negative fill:#ffcdd2,stroke:#b71c1c,stroke-width:2px,color:#000

    class A,B,C eventPath
    class D,E,F snapshotPath
    class G,H compute
    class I decision
    class J positive
    class K negative
  

The important proof is:

replay(events) == restore(snapshot)

Not by manually inspecting the graph.

By comparing deterministic hashes:

structure hash
content hash
state hash

If the hashes match, the reasoning history is executable and consistent.

If they do not, something drifted:

an event was lost
an event was applied differently
the snapshot payload changed
the hash projection changed
state was mutated outside the event stream

That last case matters most.

Replay-vs-snapshot equivalence can detect hidden mutation.
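The whole trust boundary fits in a short, self-contained sketch. Everything here is illustrative: the dict-based graph state, the `seq` ordering field, and the canonical-JSON hash projection are stand-ins for whatever state, ordering, and hash projections a real system uses:

```python
import hashlib
import json

def apply_event(state, event):
    # Minimal applier: only node.appended is supported in this sketch.
    if event["type"] == "node.appended":
        state["nodes"][event["payload"]["node_id"]] = event["payload"]
    else:
        raise ValueError(f"Unsupported event: {event['type']}")

def state_hash(state):
    # Deterministic projection: canonical JSON, then SHA-256.
    blob = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(blob.encode()).hexdigest()

def replay(events):
    state = {"nodes": {}}
    for event in sorted(events, key=lambda e: e["seq"]):
        apply_event(state, event)
    return state

events = [
    {"seq": 1, "type": "node.appended",
     "payload": {"node_id": "intent_1", "content": "Make this clearer."}},
    {"seq": 2, "type": "node.appended",
     "payload": {"node_id": "draft_a", "content": "First attempt."}},
]

# "Snapshot": a frozen copy of the derived state plus its hash.
live = replay(events)
snapshot = {"payload": json.loads(json.dumps(live)),
            "state_hash": state_hash(live)}

# Trust boundary: replaying the log must reproduce the snapshot hash.
verified = state_hash(replay(events)) == snapshot["state_hash"]
print(verified)  # True
```

Note what this catches: mutating `live` directly, without going through `apply_event`, changes the state hash without touching the log, and the next verification flags the drift.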


Event-First Execution

Once replay worked, the next step was to make events authoritative.

So we inverted the mutation model.

Before:

mutate state
→ emit event

After:

emit event
→ apply event
→ derive state

A simplified flow looks like this:

def append_node(graph_id, node_type, content):
    parent_hash = compute_state_hash(graph_id)

    event = append_event(
        graph_id=graph_id,
        event_type="node.appended",
        payload={
            "node_id": new_id(),
            "node_type": node_type,
            "content": content,
        },
        parent_state_hash=parent_hash,
        resulting_state_hash=None,
    )

    apply_event(event)

    resulting_hash = compute_state_hash(graph_id)

    update_event_resulting_hash(event.event_id, resulting_hash)

    return get_node(event.payload["node_id"])

This does two things.

First, every mutation becomes observable.

Second, every mutation becomes replayable.

The graph is no longer changed by invisible service calls. It is changed by events that can be inspected, ordered, hashed, and replayed.

The central invariant becomes:

events become authoritative
state becomes derived

That is the point where a Cognitive Graph stops being a memory system and starts acting like infrastructure.
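The event-first flow can be made runnable with a toy state. The names mirror the simplified `append_node` above, but the dict-based graph, the canonical-JSON hashing, and the in-memory log are assumptions made for this sketch:

```python
import hashlib
import json

def state_hash(state):
    # Deterministic projection: canonical JSON, then SHA-256.
    blob = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(blob.encode()).hexdigest()

def append_node(log, state, node_id, content):
    # Event-first: capture the parent hash, build the event,
    # apply it, then seal in the resulting hash.
    event = {
        "sequence_number": len(log) + 1,
        "event_type": "node.appended",
        "payload": {"node_id": node_id, "content": content},
        "parent_state_hash": state_hash(state),
        "resulting_state_hash": None,
    }
    state["nodes"][node_id] = event["payload"]          # apply
    event["resulting_state_hash"] = state_hash(state)   # seal
    log.append(event)
    return event

log, state = [], {"nodes": {}}
append_node(log, state, "intent_1", "Make this clearer.")
append_node(log, state, "draft_a", "First attempt.")

# Each event's parent hash chains to the previous event's resulting hash.
print(log[1]["parent_state_hash"] == log[0]["resulting_state_hash"])  # True
```

The chaining in the last line is the payoff: any mutation that bypasses the log breaks the parent-to-resulting hash chain, so the log is not merely a record of changes but a tamper-evident one.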


A Concrete Runtime Trace

Here is the architecture in one small trace.

Suppose a user asks the system to improve a paragraph. The system generates a revision, the user accepts it, and the graph records the reasoning path.

1. append_event(type="node.appended", payload={...})
2. EventApplier applies the event
3. node is created in graph state
4. compute_state_hash() returns sha256:8f3a...
5. update_event_resulting_hash(event_id, sha256:8f3a...)
6. create_snapshot() stores the graph payload and hashes
7. replay_events() rebuilds the graph
8. replay-vs-snapshot compares hashes
9. verified=True



A Small Example: Choosing a Better Paragraph

Here is the whole idea in a small example.

Suppose the task is:

Make this paragraph clearer.

The system might produce two variants. The user accepts the second one because it is shorter and easier to read.

A normal tool stores only the final paragraph.

A Cognitive Graph stores the process:

events = [
    Event(
        type="node.appended",
        payload={
            "node_id": "intent_1",
            "node_type": "intent",
            "content": "Make this paragraph clearer.",
        },
    ),
    Event(
        type="node.appended",
        payload={
            "node_id": "draft_a",
            "node_type": "draft",
            "content": "The system keeps a record of previous messages.",
        },
    ),
    Event(
        type="node.appended",
        payload={
            "node_id": "draft_b",
            "node_type": "revision",
            "content": "The system preserves how decisions evolve over time.",
        },
    ),
    Event(
        type="edge.linked",
        payload={
            "source_node_id": "draft_a",
            "target_node_id": "draft_b",
            "edge_type": "replaces",
        },
    ),
    Event(
        type="decision.recorded",
        payload={
            "decision_type": "accepted",
            "target_node_id": "draft_b",
            "rejected_node_ids": ["draft_a"],
            "rationale": "Clearer, shorter, and closer to the intended meaning.",
        },
    ),
]

That event stream tells us:

what the user wanted
what the first attempt was
what replaced it
which option was accepted
which option was rejected
why the choice was made

Later, the system can replay the event stream and reconstruct the same graph.

That is the difference between storing a paragraph and preserving the reasoning that produced it.


Decisions Are Cognitive Events

Capturing suggestions is not enough.

The human decision process also needs to become graph state.

A decision can record:

accepted
rejected
deferred
preferred
merged
promoted
archived

A variant comparison can record:

variant A
variant B
comparison dimension
preferred option
rationale

This is where the graph becomes collaborative.

It no longer captures only what the AI generated.

It captures what the human chose.

That matters because the user’s decision is not metadata. It is part of the cognitive process.

A final artifact is not just the last output in a sequence.

It is the survivor of decisions.


Why Graphs: Capturing the Reason Behind the Decision

A Cognitive Graph should not only store what happened.

It should store why a decision was made.

For that, we add a structured reasoning node.

A “why” object might contain:

target decision
preferred option
rejected options
principle basis
tradeoffs
uncertainty assessment
policy constraints
identity alignment notes
rationale
confidence
provenance hash

In code, the shape might be:

from dataclasses import dataclass

@dataclass(frozen=True)
class ConstitutionalReasoning:
    reasoning_id: str
    target_node_id: str | None
    target_decision_id: str | None
    preferred_option: str
    rejected_options: list[str]
    principles: list[str]
    tradeoffs: list[str]
    constraints: list[str]
    rationale: str
    confidence: float
    provenance_hash: str

This turns the graph from operational memory into epistemic memory.

Now it can answer:

Why did we accept this rewrite?
What principle won: clarity or voice?
Which alternatives were rejected?
What constraints mattered?
Do two decisions conflict?
Can we replay the justification?

That is a major shift.

The graph does not only preserve actions.

It preserves reasons.


Making the Graph Visible

A cognitive runtime is not useful if it remains invisible.

The system needs a way to inspect:

graphs
nodes
branches
review sessions
editorial decisions
variant comparisons
canonical artifacts
snapshots
replay status
lineage
reasoning nodes

The first interface should be read-only.

That is important.

Before a system edits cognitive state, it should make cognitive state understandable.

A useful dashboard lets the user answer:

What changed?
Which branch did this come from?
Which variants were rejected?
Which decision accepted this?
Why was it accepted?
Does replay still match the snapshot?

This is how hidden AI collaboration becomes inspectable.
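For many of these questions, a read-only query over the event stream is enough. A hypothetical lookup, assuming the `decision.recorded` payload shape used in the earlier example stream:

```python
def why_accepted(events, node_id):
    # Scan decision events for one that accepted this node
    # and return its recorded rationale (illustrative schema).
    for event in events:
        if event["type"] == "decision.recorded":
            payload = event["payload"]
            if (payload.get("decision_type") == "accepted"
                    and payload.get("target_node_id") == node_id):
                return payload.get("rationale")
    return None

events = [
    {"type": "node.appended", "payload": {"node_id": "draft_b"}},
    {"type": "decision.recorded",
     "payload": {"decision_type": "accepted",
                 "target_node_id": "draft_b",
                 "rationale": "Clearer and shorter."}},
]

print(why_accepted(events, "draft_b"))  # Clearer and shorter.
```

Because the query never writes, it is safe to expose in a dashboard long before any edit capability exists.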


What We Actually Built

At this point, the Cognitive Graph has become something larger than a writing feature.

It is an event-sourced runtime for structured reasoning.

The architecture now has direct analogues to distributed systems:

| Distributed Systems Concept | Cognitive Graph Equivalent |
| --- | --- |
| append-only log | mutation events |
| state reconstruction | replay |
| snapshots | graph snapshots |
| deterministic rebuild | replay-vs-snapshot verification |
| provenance | event, decision, and reasoning lineage |
| branches | cognitive branches |
| consistency checks | structure/content/state hashes |
| event appliers | cognitive state transitions |

This is why “Cognitive Graph” is not just a name.

It is a real substrate.


From Generation to Continuity: The Real Shift

Most AI tools are optimized for generation.
Cognitive Graph is optimized for continuity.

That’s a deeper difference than it first appears.

Generation asks a single question:

Can I produce a useful answer?

Continuity asks a set of them:

Can I preserve how this answer came to exist?
Can I compare it to alternatives?
Can I replay the reasoning later?
Can I audit the decision that favored it?
Can I use it as a foundation tomorrow?

The diagram below makes the architectural split concrete:

    graph LR
    subgraph Generation Paradigm
        prompt[💬 Prompt] --> model[🤖 Model] --> answer[💡 Answer]
        style prompt fill:#f8d7da,stroke:#721c24
        style model fill:#f8d7da,stroke:#721c24
        style answer fill:#f8d7da,stroke:#721c24
    end

    subgraph Continuity Paradigm
        intent[🎯 Intent] --> cg[🧠 Cognitive Graph]
        cg --> branches[🌿 Branches & Variants]
        cg --> events[⚡ Mutation Events]
        cg --> snapshots[📸 Snapshots]
        cg --> decisions[✅ Decisions & Rationale]
        cg --> artifacts[🏆 Artifacts]
        snapshots --> replay[🔄 Replay & Verify]
        artifacts -->|feeds| intent
    end

    style intent fill:#d4edda,stroke:#155724
    style cg fill:#d4edda,stroke:#155724
    style branches fill:#d4edda,stroke:#155724
    style events fill:#d4edda,stroke:#155724
    style snapshots fill:#d4edda,stroke:#155724
    style decisions fill:#d4edda,stroke:#155724
    style artifacts fill:#d4edda,stroke:#155724
    style replay fill:#d4edda,stroke:#155724
  

On the left: a transient pipeline; once the answer is emitted, the process evaporates.
On the right: a persistent, replayable reasoning substrate where every decision, alternative, and rationale is preserved.

This is what separates a chatbot from a cognitive runtime.


Beyond Writing: A General Architecture for Structured Decisions

Because the Cognitive Graph captures the evolution of thinking, not just text, its shape generalizes far beyond prose.

Any domain where decisions unfold over time benefits from the same memory, lineage, and replay:

  • Code review – patches, critiques, rejections, final merge.
  • Research synthesis – hypotheses, evidence, competing interpretations, consensus notes.
  • Legal reasoning – interpretations, precedents, accepted arguments, dissents.
  • Policy analysis – proposals, tradeoffs, principles, final positions.
  • Design review – mockups, feedback rounds, selection rationale.
  • Agent workflows – sub-task branches, outcome comparisons, final plans.

In each case, the core pattern is identical:

Structured evolving decisions, with memory.

A Cognitive Graph gives those decisions not just a record, but a provenance chain that can be replayed, verified, and learned from.

That’s the foundation for a new kind of AI-assisted work: one where the reasoning doesn’t vanish when the answer arrives.


Conclusion: Auditable AI, Branchable Thought

The real promise of a Cognitive Graph is not that it creates a new kind of AI.

It is that it gives you a better way to work with the AI you already have.

Most AI conversations move forward and then disappear behind you. You can scroll back, but you cannot easily branch from a decision, replay a path, compare two alternatives, or ask why one answer survived and another did not.

A Cognitive Graph changes that.

It lets you audit the path:

What did we ask?
What did the system try?
What did we reject?
What did we accept?
What changed after that?

It also lets you branch the path:

What if we went back to this earlier decision?
What if we explored the code version instead of the prose version?
What if another model reviewed the rejected branch?
What if we resumed from the moment before the final artifact was chosen?

That is the practical value.

The graph turns an AI session from a single forward-moving transcript into a structure you can revisit, inspect, fork, and extend.

You can go deeper at any stage of the conversation. You can preserve failed branches instead of losing them. You can ask the AI to research an alternate path without destroying the current one. You can compare the branch you took with the branch you abandoned. You can return later and understand why the work ended where it did.

That is what makes the Cognitive Graph useful.

It gives AI collaboration memory with handles.

Not just a record of what was said, but a replayable map of where thought could have gone, where it did go, and why one path became real.

What comes next: Graphs add structured reasons to those branch points. They make it possible to audit not only what changed, but why the system or the human chose one direction over another. That is where branching, replay, and justification start to work together.


Core Glossary

Cognitive Graph

A graph-based structure for preserving how thoughts, decisions, alternatives, and artifacts evolve over time.

Node

A unit of thought or work inside the graph. Examples include an intent, draft, variant, critique, decision, artifact, or reasoning object.

Edge

A typed relationship between two nodes. Examples include refines, replaces, supports, rejects, justifies, and accepted_into.

Branch

An alternate reasoning path within the graph.

Mutation

A meaningful change to the graph’s cognitive state.

Mutation Event

An append-only record of a mutation. It stores what changed, who changed it, and the state before and after.

Snapshot

A frozen, validated graph state captured at a point in time.

Replay

The process of rebuilding graph state from the event stream.

State Hash

A deterministic hash of the exact runtime state.

Why Graph

The part of the Cognitive Graph that stores structured reasons behind decisions.


References and Further Reading

This post sits at the intersection of several existing ideas: event sourcing, provenance, graph-based memory, constitutional reasoning, and human-AI collaboration. The Cognitive Graph is not identical to any one of these, but it borrows useful patterns from each.

Event Sourcing and Replayable State

Martin Fowler’s writing on Event Sourcing is the clearest starting point for the software architecture behind the Cognitive Graph. The key idea is that state changes are stored as a sequence of events, and system state can be reconstructed by replaying those events. That maps directly onto the Cognitive Graph idea of mutation events, replay, and snapshot verification. (martinfowler.com)

Fowler’s broader article on event-driven systems is also useful because it explains the important distinction between logging things that happened and treating events as the source of truth. The Cognitive Graph follows the latter idea: graph state becomes derived from events rather than merely accompanied by events. (martinfowler.com)

CQRS is also relevant, especially the separation between command-side mutation and query-side read models. The Cognitive Graph does something similar: event-first mutation produces graph state, while traversal, lineage, replay, and dashboard views operate as query-side interpretations. (martinfowler.com)

Provenance and Auditability

The W3C PROV-O ontology is a useful conceptual reference for provenance. PROV-O provides a formal way to describe entities, activities, agents, and the relationships between them. Cognitive Graph is not a PROV-O implementation, but its ideas around provenance events, actor attribution, and derivation are closely related. (W3C)

The most important shared principle is that outputs are not enough. A trustworthy system should also preserve information about how an output was produced, what activity generated it, and which actor or system was responsible.

Constitutional Reasoning and “Why” Objects

Anthropic’s Constitutional AI paper is relevant to the “Why Graphs” direction. Constitutional AI uses explicit principles to guide model behavior through critique and revision. Cognitive Graph takes a runtime-oriented version of that idea: instead of only shaping model behavior during training, it stores principles, tradeoffs, rejected options, and rationales as explicit graph state. (arXiv)

More recent work on Collective Constitutional AI extends the idea by sourcing principles from broader groups of people. That connects naturally to the Cognitive Graph idea of multiple actors, competing rationales, variant comparison, and human-reviewable decision lineage. (arXiv)

How These Ideas Connect

The Cognitive Graph borrows from event sourcing the idea that history should be executable. It borrows from provenance systems the idea that artifacts should carry derivation and attribution. It borrows from knowledge graphs the idea that relationships matter. It borrows from constitutional reasoning the idea that decisions should be linked to principles.

The new piece is the combination:

a replayable graph of reasoning state, where decisions, alternatives, artifacts, and justifications are preserved as event-sourced cognition.