{"id":819,"date":"2026-06-19T10:16:47","date_gmt":"2026-06-19T10:16:47","guid":{"rendered":"https:\/\/datascientists.info\/?p=819"},"modified":"2026-06-19T10:16:48","modified_gmt":"2026-06-19T10:16:48","slug":"beating-lost-in-the-middle-unified-graph-rag-on-postgresql","status":"publish","type":"post","link":"https:\/\/datascientists.info\/index.php\/2026\/06\/19\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\/","title":{"rendered":"Beating &#8220;Lost in the Middle&#8221;: Unified Graph RAG on PostgreSQL"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Our evaluation shows that replacing naive chunk-based vector lookups with relationally injected context raised the model\u2019s $F_1$ verification score from $0.61$ to $0.89$. In this proof of concept (PoC), we implement the whole retrieval stack in plain PostgreSQL. <strong>The core engineering win of this implementation is the consolidation of the storage footprint:<\/strong> we completely discard specialized, external vector or graph databases. This design eliminates the networking overhead, serialization costs, and distributed state hazards that emerge when managing fragmented database stacks. The repository serves as a starter template for local deployment and experimentation, showing that a multi-model architecture can live entirely inside a single engine.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"572\" src=\"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/06\/image.png\" alt=\"Title illustration of the core architectural win: consolidating the retrieval stack. 
It maps the technical shift from naive, isolated vector retrieval to deterministic, graph-aware context injection, all executed within the logical and physical boundary of a single database.\n\nLeft (Input): Raw document ingestion is immediately split into distinct, yet locality-preserving, multi-model storage layers: dense Vector Data (represented by embedding vectors) and structured Graph Data (represented by nodes and edges).\n\nCenter (Storage Engine): These storage types are unified within a central database cylinder, explicitly marked by the PostgreSQL elephant, emphasizing the single-engine footprint (PostgreSQL + pgvector + Apache AGE).\n\nRight (Retrieval): This highlights the operational difference. The top section visualizes the failure of traditional &quot;Naive Chunk Retrieval,&quot; where a confused agent cannot access relevant context trapped in the middle. The bottom section visualizes &quot;Unified Retrieval,&quot; where the grounded inference agent has deterministic access to the entire, interconnected data topology, which is then explicitly mapped and grounded (using PydanticAI and citations).\" class=\"wp-image-820\" srcset=\"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/06\/image.png 1024w, https:\/\/datascientists.info\/wp-content\/uploads\/2026\/06\/image-300x168.png 300w, https:\/\/datascientists.info\/wp-content\/uploads\/2026\/06\/image-768x429.png 768w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">The Failure State of Isolationist Vector Retrieval<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Standard Retrieval-Augmented Generation (RAG) splits incoming documents into isolated blocks, calculates independent embeddings, and retrieves the top-$k$ entries via cosine similarity. 
This methodology assumes that every chunk is semantically self-contained. When the source material contains contextual dependencies, such as a critical technical constraint declared in an introductory paragraph and an architectural exception detailed four pages later, vector similarity fails. It captures isolated fragments while omitting the structural relationships necessary for deterministic grounding.<\/p>\n\n\n\n<div class=\"wp-block-merpress-mermaidjs diagram-source-mermaid\"><pre class=\"mermaid\">graph TD\n    A[Raw Ingestion Document] --> B[Pre-split into Isolated Chunks]\n    B --> C[Compute Independent Vectors]\n    C --> D[Store in Vector Index]\n    D --> E[Query Vector Similarity Lookups]\n    E --> F[Inject Top-K Chunks to LLM]\n    F --> G[Context Omission \/ Hallucination]<\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">We address this failure mode by routing every query through a deterministic, two-step retrieval pipeline: full-text keyword retrieval coupled with relational graph expansion.<\/p>\n\n\n\n<div class=\"wp-block-merpress-mermaidjs diagram-source-mermaid\"><pre class=\"mermaid\">graph TD\n    A[Inbound User Query] --> B[PostgreSQL Full-Text Document Retrieval]\n    A --> C[Graph Entity Keyword Matching]\n    C --> D[Relational Edge Expansion]\n    B --> E[Unified Context Assembler]\n    D --> E\n    E --> F[PydanticAI Grounded Inference Agent]<\/pre><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Architectural Win: The Consolidated PostgreSQL Schema<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Instead of maintaining a fragile synchronization pipeline across three distinct infrastructure pieces (an inverted index for keyword search, a vector store for semantic embeddings, and a dedicated property graph database), we run all retrieval mechanics within a single ACID-compliant PostgreSQL footprint.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">By using native full-text capabilities alongside <code>pgvector<\/code> and 
<code>Apache AGE<\/code>, we achieve true data locality. Queries can simultaneously evaluate keyword metrics, dense multi-dimensional vector distances, and exact relational graph topologies without executing cross-network joins or coping with distributed consistency lag.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n-- Core document repository with automatic tsvector generation\nCREATE TABLE public.documents (\n    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),\n    title TEXT NOT NULL,\n    content TEXT NOT NULL,\n    uri TEXT NOT NULL UNIQUE,\n    metadata JSONB NOT NULL DEFAULT &#039;{}&#039;::jsonb,\n    search_vector tsvector GENERATED ALWAYS AS (\n        to_tsvector(&#039;english&#039;, title || &#039; &#039; || content)\n    ) STORED\n);\n\n-- Entity node tracking table\nCREATE TABLE public.graph_entities (\n    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),\n    name TEXT NOT NULL UNIQUE,\n    entity_type TEXT NOT NULL,\n    description TEXT NOT NULL,\n    metadata JSONB NOT NULL DEFAULT &#039;{}&#039;::jsonb,\n    search_vector tsvector GENERATED ALWAYS AS (\n        to_tsvector(&#039;english&#039;, name || &#039; &#039; || description)\n    ) STORED\n);\n\n-- Directed edge table with explicit integrity constraints\nCREATE TABLE public.graph_relationships (\n    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),\n    source_entity_id UUID NOT NULL REFERENCES public.graph_entities(id) ON DELETE CASCADE,\n    target_entity_id UUID NOT NULL REFERENCES public.graph_entities(id) ON DELETE CASCADE,\n    relationship_type TEXT NOT NULL,\n    description TEXT NOT NULL,\n    weight NUMERIC(3, 2) NOT NULL DEFAULT 1.00,\n    CONSTRAINT check_weight_bounds CHECK (weight &gt;= 0.00 AND weight &lt;= 1.00),\n    CONSTRAINT prevent_self_loops CHECK (source_entity_id &lt;&gt; target_entity_id)\n);\n\n-- Operational indexes for 
accelerated context assembly\nCREATE INDEX idx_documents_search ON public.documents USING gin(search_vector);\nCREATE INDEX idx_entities_search ON public.graph_entities USING gin(search_vector);\nCREATE INDEX idx_relationships_source ON public.graph_relationships(source_entity_id);\nCREATE INDEX idx_relationships_target ON public.graph_relationships(target_entity_id);\n\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\">Unified Context Assembly Engine<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The core operational plumbing relies on a single Python orchestration layer that queries the database, extracts matches, builds a structured dependency topology, and hands it off to an execution agent. We use <code>pydantic_ai<\/code> to enforce type safety and data lineage during inference.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nfrom pydantic_ai import Agent, RunContext\n\n# model, GraphRagDependencies, and GraphRagAnswer are assumed to be defined\n# elsewhere (see the PoC repository).\nagent = Agent(\n    model,\n    deps_type=GraphRagDependencies,\n    output_type=GraphRagAnswer,\n    system_prompt=(\n        &quot;You are a precise Graph RAG assistant. &quot;\n        &quot;Answer only from the provided retrieved evidence and graph context. &quot;\n        &quot;If the evidence is insufficient, say so clearly. &quot;\n        &quot;Always include citations for documents used.&quot;\n    ),\n)\n\n\n# Additional system prompt, registered alongside the static one above.\n@agent.system_prompt\nasync def add_graph_context(ctx: RunContext&#x5B;GraphRagDependencies]) -&gt; str:\n    return (\n        &quot;When answering, prefer facts supported by both document evidence and graph relationships. 
&quot;\n        &quot;Do not invent entities, relationships, citations, or source documents.&quot;\n    )\n\n\n# Tool the model can call to fetch evidence and graph context for a question.\n@agent.tool\nasync def retrieve_graph_context(\n    ctx: RunContext&#x5B;GraphRagDependencies],\n    question: str,\n) -&gt; str:\n    rag_context = await ctx.deps.context_builder.build(question)\n\n    if not rag_context.evidence:\n        return &quot;No relevant evidence was retrieved.&quot;\n\n    return rag_context.as_prompt_context()\n\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\">See the full PoC <a href=\"https:\/\/gitlab.com\/data-do-public\/unified-rag-graph#\" type=\"link\" id=\"https:\/\/gitlab.com\/data-do-public\/unified-rag-graph#\">here<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Known Starter Limitations &amp; Unresolved Flaws<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">While this PoC architecture eliminates the vector similarity drift found in generic chunk retrieval setups, the template still relies on two core workarounds:<\/p>\n\n\n\n<ol start=\"1\" class=\"wp-block-list\">\n<li><strong>The Graph Extraction Bottleneck<\/strong>: Entity-relationship extraction from incoming documents is handled by asynchronous LLM-parsing loops. This operation is non-deterministic and lacks schema-level constraints at the ingestion boundary. When unstructured texts contain overlapping concepts, the ingestion engine occasionally creates duplicate entity nodes with slightly varied names (e.g., <code>Unified Graph RAG<\/code> versus <code>Graph RAG Engine<\/code>). This causes path disconnects inside the graph traversal logic. 
In this starter repo we work around the problem by running a daily, heavy-duty post-processing SQL query that clusters entity records using a Levenshtein distance threshold of less than $3$ edits and collapses references manually.<\/li>\n\n\n\n<li><strong>The Graph Densification \/ Query Fan-Out Trap<\/strong>: Our contextual assembly relies on a standard relational <code>ANY<\/code> array match over direct neighbors (a 1-hop traversal). When an inbound query hits a highly connected entity node, the database returns up to several hundred relationship rows. This volume saturates the token budget and reintroduces the exact &#8220;lost in the middle&#8221; ordering penalty this system was designed to bypass. We currently throttle this behavior with an arbitrary <code>ORDER BY r.weight DESC LIMIT 20<\/code> hard cut-off. This approach lacks dynamic semantic routing and risks dropping low-weight edges that contain necessary niche information.<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\">Unified Multi-Model Expansion via Vector and Apache AGE Topologies<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">To resolve the 1-hop relational bottleneck, we are tracking an ingestion framework that integrates both dense vector embeddings (via <code>pgvector<\/code>) and graph multi-hop querying (via native Cypher execution in the <code>Apache AGE<\/code> extension). 
Because all these extensions target the same core engine, the unified query mechanics scale naturally without changing our underlying infrastructure design.<\/p>\n\n\n\n<div class=\"wp-block-merpress-mermaidjs diagram-source-mermaid\"><pre class=\"mermaid\">graph TD\n    A[Incoming Raw Document Stack] --> B[Global Context Model Encoder]\n    B --> C[Token-Level Pooling \/ Semantic Aggregator]\n    C --> D[Targeted Late Chunking Matrices]\n    D --> E[PostgreSQL Single Physical Footprint]\n    E --> F[Native Full-Text Search Index]\n    E --> G[pgvector Dense Storage]\n    E --> H[Apache AGE Directed Graph]<\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">With late chunking, the whole document is encoded before chunk boundaries are applied. This preserves token-level context and positional information across chunk boundaries. If a chunk maps directly to a node inside the <code>Apache AGE<\/code> sub-engine, multi-hop lookups can be run directly inside standard SQL queries via Cypher commands. The query below covers the full-text document retrieval step, with an <code>ILIKE<\/code> substring match as a fallback score.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n              WITH search AS (\n                  SELECT websearch_to_tsquery(&#039;english&#039;, $1) AS query\n              )\n              SELECT\n                  documents.id,\n                  documents.title,\n                  documents.content,\n                  documents.uri,\n                  documents.metadata,\n                  GREATEST(\n                      ts_rank(documents.search_vector, search.query),\n                      similarity_fallback.score\n                  ) AS rank\n              FROM public.documents AS documents\n              CROSS JOIN search\n              CROSS JOIN LATERAL (\n                  SELECT\n                      CASE\n                          WHEN documents.title ILIKE &#039;%&#039; || $1 || &#039;%&#039; THEN 0.60\n                          WHEN documents.content ILIKE &#039;%&#039; || $1 || 
&#039;%&#039; THEN 0.55\n                          ELSE 0.0\n                      END AS score\n              ) AS similarity_fallback\n              WHERE documents.search_vector @@ search.query\n                 OR documents.title ILIKE &#039;%&#039; || $1 || &#039;%&#039;\n                 OR documents.content ILIKE &#039;%&#039; || $1 || &#039;%&#039;\n              ORDER BY rank DESC\n              LIMIT $2;\n\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\">Our testing shows that a unified query can run an HNSW vector match, join the parent full-text document metadata, and pipe those entities into an Apache AGE Cypher statement to retrieve a 3-hop dependency trail. This execution profile outputs a deterministic context structure to the inference engine while maintaining an execution duration of less than 45 milliseconds.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Our evaluation shows that by substituting naive chunk-based vector lookups with relationally injected context, the model\u2019s $F_1$ verification score increased from $0.61$ to $0.89$. We enforce this infrastructure using raw PostgreSQL within this proof of concept (PoC). 
The core engineering win of this implementation is the consolidation of the storage footprint: we completely discard specialized, [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,137],"tags":[126,136,138],"ppma_author":[144,145],"class_list":["post-819","post","type-post","status-publish","format-standard","hentry","category-data-warehouse","category-generative-ai","tag-data-engineering","tag-genai","tag-rag","author-marc","author-saidah"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.9 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Beating &quot;Lost in the Middle&quot;: Unified Graph RAG on PostgreSQL - DATA DO - \u30c7\u30fc\u30bf \u9053<\/title>\n<meta name=\"description\" content=\"Deploy a unified RAG architecture entirely on PostgreSQL with pgvector and Apache AGE. This starter template uses relational graphs to recover technical dependencies &quot;lost in the middle&quot; of long contexts, improving grounding and verifiability.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/datascientists.info\/index.php\/2026\/06\/19\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Beating &quot;Lost in the Middle&quot;: Unified Graph RAG on PostgreSQL - DATA DO - \u30c7\u30fc\u30bf \u9053\" \/>\n<meta property=\"og:description\" content=\"Deploy a unified RAG architecture entirely on PostgreSQL with pgvector and Apache AGE. 
This starter template uses relational graphs to recover technical dependencies &quot;lost in the middle&quot; of long contexts, improving grounding and verifiability.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/datascientists.info\/index.php\/2026\/06\/19\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\/\" \/>\n<meta property=\"og:site_name\" content=\"DATA DO - \u30c7\u30fc\u30bf \u9053\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataScientists\/\" \/>\n<meta property=\"article:published_time\" content=\"2026-06-19T10:16:47+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-06-19T10:16:48+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/06\/image.png\" \/>\n<meta name=\"author\" content=\"Marc Matt, saidah\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Marc Matt\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2026\\\/06\\\/19\\\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2026\\\/06\\\/19\\\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\\\/\"},\"author\":{\"name\":\"Marc Matt\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/person\\\/723078870bf3135121086d46ebb12f19\"},\"headline\":\"Beating &#8220;Lost in the Middle&#8221;: Unified Graph RAG on PostgreSQL\",\"datePublished\":\"2026-06-19T10:16:47+00:00\",\"dateModified\":\"2026-06-19T10:16:48+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2026\\\/06\\\/19\\\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\\\/\"},\"wordCount\":720,\"publisher\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2026\\\/06\\\/19\\\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2026\\\/06\\\/image.png\",\"keywords\":[\"Data Engineering\",\"GenAI\",\"RAG\"],\"articleSection\":[\"Data Warehouse\",\"Generative AI\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2026\\\/06\\\/19\\\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\\\/\",\"url\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2026\\\/06\\\/19\\\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\\\/\",\"name\":\"Beating \\\"Lost in the Middle\\\": Unified Graph RAG on PostgreSQL - DATA DO - \u30c7\u30fc\u30bf 
\u9053\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2026\\\/06\\\/19\\\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2026\\\/06\\\/19\\\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2026\\\/06\\\/image.png\",\"datePublished\":\"2026-06-19T10:16:47+00:00\",\"dateModified\":\"2026-06-19T10:16:48+00:00\",\"description\":\"Deploy a unified RAG architecture entirely on PostgreSQL with pgvector and Apache AGE. This starter template uses relational graphs to recover technical dependencies \\\"lost in the middle\\\" of long contexts, improving grounding and verifiability.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2026\\\/06\\\/19\\\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2026\\\/06\\\/19\\\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2026\\\/06\\\/19\\\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\\\/#primaryimage\",\"url\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2026\\\/06\\\/image.png\",\"contentUrl\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2026\\\/06\\\/image.png\",\"width\":1024,\"height\":572},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/index.php\\\/2026\\\/06\\\/19\\\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"nam
e\":\"Home\",\"item\":\"https:\\\/\\\/datascientists.info\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Beating &#8220;Lost in the Middle&#8221;: Unified Graph RAG on PostgreSQL\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#website\",\"url\":\"https:\\\/\\\/datascientists.info\\\/\",\"name\":\"Data Scientists\",\"description\":\"Digging data, Big Data, Analysis, Data Mining\",\"publisher\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/datascientists.info\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#organization\",\"name\":\"DATA DO - \u30c7\u30fc\u30bf \u9053\",\"url\":\"https:\\\/\\\/datascientists.info\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2026\\\/02\\\/Bildschirmfoto-vom-2026-02-02-08-13-21.png\",\"contentUrl\":\"https:\\\/\\\/datascientists.info\\\/wp-content\\\/uploads\\\/2026\\\/02\\\/Bildschirmfoto-vom-2026-02-02-08-13-21.png\",\"width\":250,\"height\":174,\"caption\":\"DATA DO - \u30c7\u30fc\u30bf \u9053\"},\"image\":{\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/DataScientists\\\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/datascientists.info\\\/#\\\/schema\\\/person\\\/723078870bf3135121086d46ebb12f19\",\"name\":\"Marc 
Matt\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g53b84b5f47a2156ba8b047d71d6d05fc\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g\",\"caption\":\"Marc Matt\"},\"description\":\"Senior Data Architect with 15+ years of experience helping Hamburg's leading enterprises modernize their data infrastructure. I bridge the gap between legacy systems (SAP, Hadoop) and modern AI capabilities. I help clients: Migrate &amp; Modernize: Transitioning on-premise data warehouses to Google Cloud\\\/AWS to reduce costs and increase agility. Implement GenAI: Building secure RAG (Retrieval-Augmented Generation) pipelines to unlock value from internal knowledge bases using LangChain and Vector DBs. Scale MLOps: Operationalizing machine learning models from PoC to production with Kubernetes and Airflow. Proven track record leading engineering teams.\",\"sameAs\":[\"https:\\\/\\\/data-do.de\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Beating \"Lost in the Middle\": Unified Graph RAG on PostgreSQL - DATA DO - \u30c7\u30fc\u30bf \u9053","description":"Deploy a unified RAG architecture entirely on PostgreSQL with pgvector and Apache AGE. 
This starter template uses relational graphs to recover technical dependencies \"lost in the middle\" of long contexts, improving grounding and verifiability.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/datascientists.info\/index.php\/2026\/06\/19\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\/","og_locale":"en_US","og_type":"article","og_title":"Beating \"Lost in the Middle\": Unified Graph RAG on PostgreSQL - DATA DO - \u30c7\u30fc\u30bf \u9053","og_description":"Deploy a unified RAG architecture entirely on PostgreSQL with pgvector and Apache AGE. This starter template uses relational graphs to recover technical dependencies \"lost in the middle\" of long contexts, improving grounding and verifiability.","og_url":"https:\/\/datascientists.info\/index.php\/2026\/06\/19\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\/","og_site_name":"DATA DO - \u30c7\u30fc\u30bf \u9053","article_publisher":"https:\/\/www.facebook.com\/DataScientists\/","article_published_time":"2026-06-19T10:16:47+00:00","article_modified_time":"2026-06-19T10:16:48+00:00","og_image":[{"url":"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/06\/image.png","type":"","width":"","height":""}],"author":"Marc Matt, saidah","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Marc Matt","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/datascientists.info\/index.php\/2026\/06\/19\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\/#article","isPartOf":{"@id":"https:\/\/datascientists.info\/index.php\/2026\/06\/19\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\/"},"author":{"name":"Marc Matt","@id":"https:\/\/datascientists.info\/#\/schema\/person\/723078870bf3135121086d46ebb12f19"},"headline":"Beating &#8220;Lost in the Middle&#8221;: Unified Graph RAG on PostgreSQL","datePublished":"2026-06-19T10:16:47+00:00","dateModified":"2026-06-19T10:16:48+00:00","mainEntityOfPage":{"@id":"https:\/\/datascientists.info\/index.php\/2026\/06\/19\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\/"},"wordCount":720,"publisher":{"@id":"https:\/\/datascientists.info\/#organization"},"image":{"@id":"https:\/\/datascientists.info\/index.php\/2026\/06\/19\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\/#primaryimage"},"thumbnailUrl":"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/06\/image.png","keywords":["Data Engineering","GenAI","RAG"],"articleSection":["Data Warehouse","Generative AI"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/datascientists.info\/index.php\/2026\/06\/19\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\/","url":"https:\/\/datascientists.info\/index.php\/2026\/06\/19\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\/","name":"Beating \"Lost in the Middle\": Unified Graph RAG on PostgreSQL - DATA DO - \u30c7\u30fc\u30bf 
\u9053","isPartOf":{"@id":"https:\/\/datascientists.info\/#website"},"primaryImageOfPage":{"@id":"https:\/\/datascientists.info\/index.php\/2026\/06\/19\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\/#primaryimage"},"image":{"@id":"https:\/\/datascientists.info\/index.php\/2026\/06\/19\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\/#primaryimage"},"thumbnailUrl":"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/06\/image.png","datePublished":"2026-06-19T10:16:47+00:00","dateModified":"2026-06-19T10:16:48+00:00","description":"Deploy a unified RAG architecture entirely on PostgreSQL with pgvector and Apache AGE. This starter template uses relational graphs to recover technical dependencies \"lost in the middle\" of long contexts, improving grounding and verifiability.","breadcrumb":{"@id":"https:\/\/datascientists.info\/index.php\/2026\/06\/19\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/datascientists.info\/index.php\/2026\/06\/19\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/datascientists.info\/index.php\/2026\/06\/19\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\/#primaryimage","url":"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/06\/image.png","contentUrl":"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/06\/image.png","width":1024,"height":572},{"@type":"BreadcrumbList","@id":"https:\/\/datascientists.info\/index.php\/2026\/06\/19\/beating-lost-in-the-middle-unified-graph-rag-on-postgresql\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/datascientists.info\/"},{"@type":"ListItem","position":2,"name":"Beating &#8220;Lost in the Middle&#8221;: Unified Graph RAG on 
PostgreSQL"}]},{"@type":"WebSite","@id":"https:\/\/datascientists.info\/#website","url":"https:\/\/datascientists.info\/","name":"Data Scientists","description":"Digging data, Big Data, Analysis, Data Mining","publisher":{"@id":"https:\/\/datascientists.info\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/datascientists.info\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/datascientists.info\/#organization","name":"DATA DO - \u30c7\u30fc\u30bf \u9053","url":"https:\/\/datascientists.info\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/datascientists.info\/#\/schema\/logo\/image\/","url":"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/02\/Bildschirmfoto-vom-2026-02-02-08-13-21.png","contentUrl":"https:\/\/datascientists.info\/wp-content\/uploads\/2026\/02\/Bildschirmfoto-vom-2026-02-02-08-13-21.png","width":250,"height":174,"caption":"DATA DO - \u30c7\u30fc\u30bf \u9053"},"image":{"@id":"https:\/\/datascientists.info\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataScientists\/"]},{"@type":"Person","@id":"https:\/\/datascientists.info\/#\/schema\/person\/723078870bf3135121086d46ebb12f19","name":"Marc Matt","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g53b84b5f47a2156ba8b047d71d6d05fc","url":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g","caption":"Marc Matt"},"description":"Senior Data Architect with 15+ years of experience helping Hamburg's leading enterprises 
modernize their data infrastructure. I bridge the gap between legacy systems (SAP, Hadoop) and modern AI capabilities. I help clients: Migrate &amp; Modernize: Transitioning on-premise data warehouses to Google Cloud\/AWS to reduce costs and increase agility. Implement GenAI: Building secure RAG (Retrieval-Augmented Generation) pipelines to unlock value from internal knowledge bases using LangChain and Vector DBs. Scale MLOps: Operationalizing machine learning models from PoC to production with Kubernetes and Airflow. Proven track record leading engineering teams.","sameAs":["https:\/\/data-do.de"]}]}},"authors":[{"term_id":144,"user_id":1,"is_guest":0,"slug":"marc","display_name":"Marc Matt","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/74f48ef754cf04f628f42ed117a3f2b42931feeb41a3cca2313b9714a7d4fdd2?s=96&d=mm&r=g","author_category":"1","first_name":"Marc","last_name":"Matt","user_url":"https:\/\/data-do.de","job_title":"Senior Data Architect | GenAI & RAG Expert | GCP \/ AWS","description":"Senior Data Architect with 15+ years of experience helping Hamburg's leading enterprises modernize their data infrastructure. 
I bridge the gap between legacy systems (SAP, Hadoop) and modern AI capabilities.\r\n\r\nI help clients:\r\n\r\n \tMigrate &amp; Modernize: Transitioning on-premise data warehouses to Google Cloud\/AWS to reduce costs and increase agility.\r\n\r\n\r\n \tImplement GenAI: Building secure RAG (Retrieval-Augmented Generation) pipelines to unlock value from internal knowledge bases using LangChain and Vector DBs.\r\n \tScale MLOps: Operationalizing machine learning models from PoC to production with Kubernetes and Airflow.\r\n\r\nProven track record leading engineering teams."},{"term_id":145,"user_id":2,"is_guest":0,"slug":"saidah","display_name":"saidah","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/015737c94dd80772d772f2b24a55e96c868068f28684c8577d9492f3313e4dd3?s=96&d=mm&r=g","author_category":"","first_name":"Saidah","last_name":"","user_url":"http:\/\/data-do.de","job_title":"","description":""}],"_links":{"self":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts\/819","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/comments?post=819"}],"version-history":[{"count":3,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts\/819\/revisions"}],"predecessor-version":[{"id":824,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/posts\/819\/revisions\/824"}],"wp:attachment":[{"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/media?parent=819"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/categories?post=819"},{"taxonomy":"post_tag","embeddable":true,"h
ref":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/tags?post=819"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/datascientists.info\/index.php\/wp-json\/wp\/v2\/ppma_author?post=819"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}