The pervasive “lost in the middle” phenomenon is a failure of semantic retrieval, not just context window capacity. While increasing token limits is tempting, standard Retrieval-Augmented Generation (RAG) pipelines depend on isolated chunk embeddings and generic vector similarity. As a result, they frequently bury critical technical dependencies deep within long prompts. If a system cannot reconstruct multi-hop relational paths during retrieval, it cannot deterministically ground the final inference, leading to “hallucinations in the middle.”
This article details a production-ready refactoring of the Unified Graph RAG architecture. We move beyond baseline RAG by treating context assembly as a software artifact with an auditable schema and strict lineage requirements. This approach consolidates full-text search (FTS), dense vectors (pgvector), and relational topologies (Apache AGE) within a single, ACID-compliant PostgreSQL engine.

Our core systems engineering hypothesis is that true data locality is the only way to enforce multi-model lineage. Fragmented database stacks, requiring synchronization across three infrastructure pieces, inevitably introduce distributed state drift, invalidating any audit trail required for regulatory compliance (e.g., EU AI Act, Article 12). By unifying storage, we replace probabilistic grounding with computable, deterministic verifiability.
This is the system architecture optimized for Markdown rendering:
graph TD
A[Raw Ingestion Document Stack] --> B[Globally-Aware Embedding Engine]
A --> C[Apply targeted Late Chunking matrices]
A --> D[PostgreSQL Apache AGE loader]
B --> E(PostgreSQL Multi-Engine Single Footprint)
C --> E
D --> E
E --> F[native Full-Text keyword Index]
E --> G[pgvector Dense Storage]
E --> H[Apache AGE Directed Graph]
G -.->|Article 12 Lineage Path| HThe Auditing and Observability Problem
Our baseline testing proved that consolidating storage footprint increased the $F_1$ verification score from $0.61$ to $0.89$. However, this improved score lacked an auditable system of record. When the inference agent—modeled here by the PydanticAI robot—occasionally referenced a low-weight graph edge instead of a primary document fact, traditional APM logs offered no trace of the conflicting pathway. This makes debugging distributed state drift, where vector updates lag behind relational node edits, operationally impossible.
To scale the PoC, we must audit which specific graph path or vector chunk supports any given grounded assertion. We achieve this by deploying a dedicated auditing schema alongside our core application tables, converting the context assembly process into a traceable software artifact.
Enforcing Deterministic Evidence Lineage
We refactored our unified context assembly engine to generate comprehensive, immutable Evidence Path Manifests for every inference run. The system wraps every retrieved context artifact whether a raw FTS document, a specific late-chunked vector or a 3-hop AGE dependency path in this audited schema.
The architectural flow is refactored to enforce that a grounded response cannot be generated without an accompanying, immutable manifest. This flowchart defines the dependency structure required for this verification.
The Auditing and Observability Problem
Our baseline testing (detailed in previous posts) proved that consolidating storage footprint increased the $F_1$ verification score from $0.61$ to $0.89$. However, this improved score lacked an auditable system of record. When the inference agent modeled here by the PydanticAI robot (from Image 1) occasional referenced a low-weight graph edge instead of a primary document fact, traditional APM logs offered no trace of the conflicting pathway. This makes debugging distributed state drift, where vector updates lag behind relational node edits, operationally impossible.
To scale the PoC, we must audit which specific graph path or vector chunk is responsible for any given grounded assertion. We achieve this by deploying a dedicated auditing schema alongside our core application tables, converting the context assembly process into a traceable software artifact.
graph TD
A[User Raw Query Q-987] --> B(Unified Query Orchestration Layer)
B --> C(Unified Multi-Engine SQL execution)
C -->|pgvector HNSW match| D[chunks table]
C -->|FTS match| E[documents table]
C -->|Apache AGE Cypher match| F[graph topology table]
D --> G(Weave auditable Context Manifest TID-X1)
E --> G
F --> G
G --> H[JSONB Evidence Path Manifest TID-X1]
%% Fixed Note for H
H -.-> NoteH["Note: Programmable
grounding compliance check"]
style NoteH fill:#fff3cd,stroke:#ffeba0,stroke-width:1px
H --> I[PydanticAI Grounded Agent]
%% Fixed Note for I
I -.-> NoteI["Note: Meter seen in
visual dashboards"]
style NoteI fill:#fff3cd,stroke:#ffeba0,stroke-width:1px
I --> J[Verified Response & Citations]
J -.->|hallucination detected| K(raise Article 15 inaccuracy flag)
Advanced Multi-Engine Architecture for Regulatory Compliance
The EU AI Act mandates strict traceability (Article 12) and transparency (Article 13) for high-risk systems. For multi-model context builders, a probabilistic citation (“hallucinated citation”) does not meet this standard. The context assembly process itself must be sufficiently transparent so that engineers can determine “the reason for the output” (Article 13).
Our architecture refactors this process into a deterministic systems engineering problem. The database itself becomes the central governance layer, housing traditional inverted indexes, multi-dimensional floats, and strict property graph topologies.
Single PHYSICAL Footprint: The Multi-Engine Schema
By leveraging PostgreSQL multi-engine architecture, true data locality is achieved. Queries can simultaneously execute keyword metric checks, dense vector distance checks, and exact relational graph path traversals without cross-network joins or coped distributed consistency lag.
The flowchart below traces the automated grounding verification path enforced in the unified schema.
graph TD
A[User Raw Query Q-987] --> B(Unified Query Orchestration Layer)
B --> C(Unified SQL query execution TID-X1)
subgraph PostgreSQL Consolidated Physical Boundary
C --> D[FTS Index]
C --> E[pgvector]
C --> F[Apache AGE]
end
D --> G(Generate Auditable decision tree)
E --> G
F --> G
G --> H[JSONB Evidence Path Manifest TID-X1]
%% Fixed Note for H
H -.-> NoteH["Note: Operates on Manifest TID-X1"]
style NoteH fill:#fff3cd,stroke:#ffeba0,stroke-width:1px
H --> I[PydanticAI Grounded Agent]
I -->|Article 15 verified response| J[Verified Response & Citations]
I -.->|grounding verification fails| K(SET verification_passed=FALSE Flag)
J --> L["Update Audit Trail (Annex IV Documentation)"]
K -.-> LGrounding as Verification and Citation Governance (Article 15)
High-risk systems must possess “a high level of… accuracy… through appropriate metrics” throughout their lifecycle. A generic LLM cannot fulfill this. We utilize the PydanticAI library to act as an automated verification loop. The agent is constrained by a strict system prompt: “Answer only from the provided retrieved evidence and graph context.”
This instruction converts the grounding requirement into a testable accuracy verification metric. We utilize the evidence path manifest to enforce governance: the agent can audit before output if its generated citations map back to factually valid retrieval paths stored in the manifest, programmatically stopping Terminology Misuse Hallucinations (e.g., using terms found in the context but connecting them with unverified paths).
Conclusion: The Path to Verified Compliance
Deploying Unified Graph RAG on PostgreSQL is not a shallow performance optimization; it is the construction of a deterministic data foundation required for regulatory compliance and deterministic grounding. For high-risk systems, black box context builders that cannot provide full observability and auditing are operationally unacceptable for 2026.
By consolidating all multi-model storage within an ACID-compliant physical boundary, we eliminate the distributed state drift that plague fragmented stacks. By enforcing a deterministic decision tree via auditable manifests, we convert grounding from an aspiration into a verifiable systems engineering output. The result is not just a Graph RAG pipeline; it is the immutable system of record required for high-risk AI governance.
The final system dependency diagram highlights how the multi-engine storage footprint feeds directly into the compliance auditing loop.
graph TD
A[Raw Document Stack] --> B[targeted Ingestion stack]
B --> C[PostgreSQL Multi-Engine Physical Boundary]
C -->|Compute Globes-aware embeddings| D[FTS Index]
C -->|targeted late chunking matrices| E[pgvector]
C -->|Apache AGE loader| F[Apache AGE Directed Graph]
G[Context Assembler & Audit Engine] --> E
G --> F
%% Fixed Note for G
G -.-> NoteG["Note: Weaving auditable lineage manifest..."]
style NoteG fill:#fff3cd,stroke:#ffeba0,stroke-width:1px
G --> H[JSONB Evidence Path Manifest TID-X1]
H --> I[EU AI Act Audit Log]
%% Fixed Note for H
H -.-> NoteH["Note: Operates on Manifest J"]
style NoteH fill:#fff3cd,stroke:#ffeba0,stroke-width:1px
H --> J[Grounded Agent PydanticAI]
J --> K[Verified Response & Citations]
J -.->|hallucination detected| L(raise Article 15 inaccuracy flag)
J -.-> I