Process graph vs document graph

Every AI search and knowledge tool on the market today indexes documents. Glean indexes documents. Notion AI indexes documents. Atlassian Rovo indexes documents. Microsoft Copilot indexes documents.

The whole category of enterprise AI knowledge tools is built on the same fundamental architectural choice: treat each page, message, or file as a unit to be searched.

This worked for first generation AI search. It is wrong for second generation team AI.

This article explains why, and what to build instead.

Figure 01 (Pulse design system): Document graph vs process graph. Left panel: flat document index, with embedding search across blobs (no types, no relations, embedding similarity only; a page is a page is a page). Right panel: process graph of Decision · Commitment · Failure · Outcome, typed entities with explicit relationships (decisions link to their evidence; commitments link to their owners).

The document graph approach: how today’s AI tools work

When you ask Glean or Notion AI a question, here is what happens under the hood.

Your question gets converted into an embedding vector. The system searches its index for documents whose embeddings are semantically similar to your question. It pulls the top matching documents (or chunks of documents), passes them to a language model, and the language model synthesizes an answer with citations to the matched documents.

This pattern is called retrieval augmented generation (RAG). It is the dominant architecture for enterprise AI search. It works. For simple questions about specific facts in specific documents, RAG returns useful answers.
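The retrieval step described above can be sketched in a few lines. This is a minimal illustration, not how Glean or Notion AI is implemented: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the three sample documents are invented, and the final call to a language model is left out (only the prompt is assembled).

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query. No entity types involved."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Assemble the text that would be sent to the language model."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return f"Answer using only these sources and cite them by number.\n{context}\n\nQuestion: {query}"

# Hypothetical sample data: every artifact is just a string to be searched.
docs = [
    "slack: we should move to postgres for the reporting joins",
    "notion: migration plan from mongodb to postgres",
    "linear: fix login button alignment",
]
top = retrieve("why did we choose postgres", docs, k=2)
prompt = build_prompt("why did we choose postgres", top)
```

Note that nothing in this pipeline knows that the first two strings belong to the same decision. They are ranked independently, by similarity alone.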

But this approach treats every piece of information as fundamentally the same type: a document. A Slack message is a document. A Notion page is a document. A Linear ticket comment is a document. A meeting transcript is a document.

This flattening is the structural problem. Real team work does not consist of documents. It consists of decisions, commitments, disagreements, failures, and outcomes. These have different properties. They have lifecycles. They relate to each other in specific ways. When you flatten all of them into “documents to be searched,” you lose the structure that makes the information valuable in the first place.

Figure 02 (Pulse design system): Document graph architecture. User query (“What did we decide?”) → search index of uniform blobs (top K embedding similarity, no entity types) → LLM generates an answer with citations.

What gets lost in document graph thinking

Consider a real team scenario. Last quarter, your engineering team had a long debate about whether to migrate from MongoDB to Postgres. The debate happened in a Slack thread with 47 messages, evolved through three follow up conversations, got documented in a Notion page, and resulted in a Linear epic with subtasks.

In a document graph world, this entire decision exists as 50+ separate documents. Each of the 47 messages in the Slack thread is a document. The Notion page is a document. The Linear epic and each of its subtasks are documents. The team’s lived experience of “we made this decision” gets dissolved into a flat pile of artifacts.

When a new engineer joins four months later and asks “why did we choose Postgres?”, the document graph approach searches for “Postgres” across all these artifacts and tries to synthesize an answer. It might find the Notion page. It might find some Slack messages. The answer it produces is a best effort reconstruction of what the team must have decided.

What is lost:

The decision itself as a first class concept (with the rationale, the evidence considered, the decision makers, and the outcome over time)
The commitments that emerged from the decision (who promised what for the migration)
The disagreements that were raised and resolved (with the resolution recorded)
The failure cases the team explicitly chose to accept (knowing the tradeoffs)

These are not documents. They are structured concepts that the team understood at the time of the decision. Document graph architecture has no way to capture them as such. It can only capture the documents that mention them, and hope the language model can reconstruct the structure from the unstructured artifacts.

This works sometimes. It fails the rest of the time. And when it fails, the team does not notice the failure: they just get answers that feel plausible but miss something the team would have remembered.

The process graph approach

A process graph is a different architectural choice. Instead of indexing documents, the system extracts the structured concepts the team is actually working with: decisions, commitments, failures, features, risks, disagreements. Each of these is a first class entity type with its own properties, relationships, and lifecycle.

When a Slack thread happens, the process graph approach does not just index the messages. It reads the conversation and extracts:

The Decision entity (with rationale, evidence, decision makers, status, and outcome trail)
Any Commitment entities created (with owner, recipient, deadline, status)
Any Failure entities referenced (the failure cases considered, lessons applied)
The Disagreement entity if relevant (with positions, resolution, and history)

These entities are stored as structured data with explicit relationships to each other. A Decision links to its Evidence. A Commitment links to its Owner and to the Decision it implements. A Failure case links to similar past cases that informed the decision.
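One way to picture these typed entities is as plain data classes. The sketch below is an assumption about what such a schema could look like, not Pulse's actual data model: the field names follow the properties listed above, and the people, dates, and placeholder strings are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Evidence:
    source: str   # where it was extracted from, e.g. "slack" or "notion"
    summary: str

@dataclass
class Failure:
    description: str
    # Links to similar past cases that informed the decision.
    similar_past_cases: list["Failure"] = field(default_factory=list)

@dataclass
class Decision:
    title: str
    rationale: str
    decision_makers: list[str]
    status: str = "decided"
    evidence: list[Evidence] = field(default_factory=list)
    failures_considered: list[Failure] = field(default_factory=list)

@dataclass
class Commitment:
    owner: str
    recipient: str
    deadline: date
    implements: Decision   # explicit link back to the Decision it implements
    status: str = "open"

# Hypothetical instance for the MongoDB to Postgres scenario.
decision = Decision(
    title="Migrate from MongoDB to Postgres",
    rationale="(rationale extracted from the thread)",
    decision_makers=["alice", "bob"],
    evidence=[Evidence(source="notion", summary="(linked write-up)")],
    failures_considered=[Failure(description="(tradeoff the team accepted)")],
)
commitment = Commitment(
    owner="carol", recipient="alice",
    deadline=date(2025, 1, 31), implements=decision,
)
```

The point of the types is that a Commitment cannot exist without naming its owner, its deadline, and the Decision it implements. A document index has no equivalent constraint.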

When the new engineer asks “why did we choose Postgres?” four months later, the process graph approach returns the Decision entity directly. The rationale is structured. The evidence is linked. The decision makers are named. The commitments that emerged from the decision are tracked, with their current status. The failure cases considered are accessible.
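The lookup described above can be sketched as a walk over explicit relations rather than a similarity search. Everything here is illustrative: the entity ids, the relation names (`supported_by`, `implements`, `considered`), and the sample values are assumptions, and a real system would use a graph or relational store instead of in-memory dictionaries.

```python
# Hypothetical in-memory graph: typed entities plus (source, relation, target) edges.
entities = {
    "dec-1": {"type": "Decision", "topic": "postgres",
              "rationale": "(rationale recorded at decision time)",
              "decision_makers": ["alice", "bob"]},
    "ev-1": {"type": "Evidence", "summary": "(benchmark notes)"},
    "com-1": {"type": "Commitment", "owner": "carol", "status": "open"},
    "com-2": {"type": "Commitment", "owner": "dave", "status": "done"},
    "fail-1": {"type": "Failure", "summary": "(tradeoff the team accepted)"},
}
edges = [
    ("dec-1", "supported_by", "ev-1"),
    ("com-1", "implements", "dec-1"),
    ("com-2", "implements", "dec-1"),
    ("dec-1", "considered", "fail-1"),
]

def outgoing(src: str, relation: str) -> list[str]:
    """Ids of entities that src points to via the given relation."""
    return [dst for s, r, dst in edges if s == src and r == relation]

def incoming(dst: str, relation: str) -> list[str]:
    """Ids of entities that point to dst via the given relation."""
    return [s for s, r, d in edges if d == dst and r == relation]

def explain_decision(topic: str):
    """Return the Decision for a topic with its linked entities, or None."""
    for eid, entity in entities.items():
        if entity["type"] == "Decision" and entity["topic"] == topic:
            return {
                "rationale": entity["rationale"],
                "decision_makers": entity["decision_makers"],
                "evidence": [entities[i]["summary"] for i in outgoing(eid, "supported_by")],
                "commitments": {i: entities[i]["status"] for i in incoming(eid, "implements")},
                "failures_considered": [entities[i]["summary"] for i in outgoing(eid, "considered")],
            }
    return None

answer = explain_decision("postgres")
```

The answer is assembled from stored links, so the current status of each commitment comes back with the decision instead of being reconstructed from text.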

Figure 03 (Pulse design system): Process graph architecture. User query (“What did we decide?”) → typed entity graph (Decision · Commitment · Failure · Outcome, linked by explicit relations) → structured decision entity (rationale · evidence · decision makers · status).

What process graph architecture unlocks

Three capabilities emerge from this architectural choice that document graph systems cannot easily replicate.

Decision archaeology. When someone asks “what did we decide about X?”, you get the Decision entity, not a search result list. You see the rationale that was true at the time, the evidence considered, and how the outcome compares to the prediction. This is what the team’s memory should feel like, and what humans naturally do when they remember important decisions.
Counterfactual replay. Because decisions are structured with their inputs and outputs, you can ask “what would have changed if we had known X at the time?” The system can reason about the decision in light of new information. Document graph systems can only retrieve the original document; they cannot reason about it as a structured object.
Pattern extraction. Because the same entity types appear across many situations (every customer refund creates similar Decision and Commitment entities; every incident creates similar Failure entities), you can detect patterns across time and across teams. This becomes the foundation for auto extracted Skills, which we cover in the cornerstone on Skills compilation.
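Pattern extraction, the third capability, is the easiest to sketch, because typed entities can be grouped directly. The example below is a simplified illustration under assumed data: the `(entity_type, trigger, team)` records and the `min_teams` threshold are hypothetical, and a real system would compare far richer entity properties.

```python
# Hypothetical extracted entities: (entity_type, trigger, team).
records = [
    ("Decision", "customer_refund", "support"),
    ("Commitment", "customer_refund", "support"),
    ("Decision", "customer_refund", "billing"),
    ("Commitment", "customer_refund", "billing"),
    ("Failure", "incident", "platform"),
    ("Failure", "incident", "payments"),
    ("Decision", "vendor_choice", "platform"),
]

def recurring_patterns(records, min_teams: int = 2):
    """Find (trigger, entity_type) shapes that recur across at least min_teams teams."""
    teams_by_shape = {}
    for entity_type, trigger, team in records:
        teams_by_shape.setdefault((trigger, entity_type), set()).add(team)
    return {
        shape: sorted(teams)
        for shape, teams in teams_by_shape.items()
        if len(teams) >= min_teams
    }

patterns = recurring_patterns(records)
```

With a flat document index there is no `entity_type` or `trigger` field to group on, which is why the same analysis would first require reconstructing that structure from raw text.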

These capabilities are not native to document graph systems, and they are not features that can simply be bolted on. They depend on the underlying data model being structured as a process graph from the beginning.

Why incumbents cannot easily ship this

If process graph is so superior, why has Glean not shipped it? Why does every enterprise AI tool still index documents?

The answer is architectural lock in. Glean has built one of the most sophisticated document indexes in the industry. Its entire data layer, retrieval pipeline, query system, and security model are designed around the document as the primitive. Adding entity extraction on top of this is possible, but the entities would remain secondary to the documents; rebuilding around entities would mean rebuilding the product from scratch.

The same is true for Notion AI (which is structurally tied to Notion pages as the primitive), Microsoft Copilot (tied to Office documents and emails), and Atlassian Rovo (tied to Atlassian artifacts). Each of these products has a successful business built on document graph architecture. None of them can easily migrate.

This is the structural opportunity for new entrants. Pulse, and other teams building from scratch, can choose the process graph architecture from day one. We accept the slower initial build (entity extraction is harder than document indexing) in exchange for a permanent architectural advantage.

Figure 04 (Pulse design system): Comparison of document graph and process graph.

Dimension	Document graph	Process graph (Pulse)
Data primitive	Document blob	Decision · Commitment · Failure
Retrieval method	Embedding similarity	Typed entity lookup
Decision tracking	Implicit, reconstructed	Explicit, first class
Commitment tracking	None	Native (owner + deadline)
Pattern extraction	Difficult	Native
Migration cost	Already in place	Rebuild from scratch

What this means for buyers

If your team is evaluating AI knowledge tools, the question to ask vendors is not just “does this search our content?” The deeper question is “what is the data primitive?”

If the answer is “documents,” you are getting document graph architecture. This will produce reasonable answers to factual lookup questions (“what is our refund policy?”). It will produce mediocre answers to coordination questions (“why did we decide X?”). It will produce poor answers to pattern questions (“are there similar decisions in our history?”).

If the answer is “decisions, commitments, failures, and outcomes as typed entities,” you are getting process graph architecture. This will produce strong answers to coordination questions and enable capabilities (pattern extraction, counterfactual replay, calibrated decision tracking) that document graph systems structurally cannot provide.

Both architectures will continue to coexist for years. Document graph systems will get better at simulating process graph behavior using more capable language models. Process graph systems will mature and become the dominant pattern for team level AI. Eventually, the categories will merge: document graph systems will adopt entity extraction as a feature, and process graph systems will index documents as supplementary context.

But during the next several years of category formation, the architectural choice determines what is possible. The teams that pick the right architecture from the beginning will have a durable structural advantage.

Closing: why we made this bet

When we started building Pulse, we explicitly evaluated both approaches. Document graph would have been three months faster to ship. The demo would have looked similar. We would have been competitive on simple search questions on day one.

We chose process graph because it is the foundation for what comes next. Auto extracted Skills require the process graph to work. Calibrated decision tracking requires it. The compounding intelligence that makes team AI more useful over time requires it.

If your team runs on Linear, Notion, GitHub, and Slack, and you want AI that captures how your team actually works rather than just indexes the documents your team produces, the architecture matters.

Pulse is the first team AI built on a process graph foundation. The live demo at pulsehq.tech is walkable end to end without signup.