Pinecone Just Demoted Vector Search. Here's the Knowledge Layer.
Most important takeaway
The “memory wars” are on: even vector database leader Pinecone is admitting vector search alone is insufficient for agentic work, while SAP, Google, Cloudflare, and Microsoft are pouring billions into knowledge layers that respect different data shapes (prose, structured documents, tables, graphs). If you’re building agents, don’t pick a database first: define the retrieval contract and the exact “bundle” of context your agent needs to do its job, then choose primitives that deliver that bundle.
Summary
Actionable insights for builders and operators:
- Stop treating vector search as the default answer. Classic RAG was designed for chatbot Q&A; agents run multi-step tasks and need assembled “operating context,” not just three semantically similar chunks. Pinecone itself is now shipping Nexus and a query language (NoQL) that carries intent, filters, access policy, provenance, response shape, confidence, and budget, not just similarity.
- Match the retrieval unit to the work. A chunk works for FAQs, a section for financial filings, a table for financial analysis, a customer record for support, a graph neighborhood for dependency reasoning, and a compiled brief for repeated workflows. Page Index’s tree approach (no embeddings, hierarchical document trees) hits 98.7% on FinanceBench precisely because it preserves structure.
- Recognize the four shapes of enterprise knowledge: fuzzy prose, long structured documents, tabular business data, and relationships/graphs. SAP’s >1B euro bet (Dreamy-O lakehouse plus Prior Labs’ tabular foundation model TabPFN) signals that most enterprise truth lives in governed tables, not PDFs — flattening tables to text loses meaning. Microsoft’s GraphRAG addresses the relational shape.
- Bigger context windows are not a fix. Chroma’s “context rot” research shows model performance degrades as the context grows and fills with irrelevant material. You need appropriate context, not maximum context, with provenance, authority, freshness, and permissions clearly marked.
- The three-step playbook for building an agent today:
  1. Don’t pick a database first. Pick the retrieval contract: what must the agent receive, in what form, to do its job reliably?
  2. Write down the bundle. For a refund agent: customer record, plan, region, product version, purchase history, refund policy, threshold, prior exceptions, current ticket, approved response language, and authorization scope — each field forces choices about source, governance, freshness, and fallback.
  3. Choose primitives that deliver the bundle. Vector search + document trees for prose, semantic layer + tabular models for governed business data, graphs for relational reasoning. Most real agents need a mix.
- Watch failure modes: compiled bundles go stale, graphs encode bad relationships, semantic layers become political fights over “source of truth,” and agents can promote their own prior inferences into “confirmed facts” that quietly degrade future runs. Also avoid overbuilding — a help-center bot doesn’t need GraphRAG plus document trees plus a semantic layer.
- Career and strategy angle: the engineers and teams who win here will be the ones who think about what their agent actually needs before going on a vendor shopping spree. Mine your own agent run logs — count retrieval calls before useful work starts, repeated source opens, token spend on raw context, redundant user questions, and rediscovery between runs. That telemetry tells you which memory primitives you actually need.
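The “contract first, then bundle” steps above can be sketched as typed structures. Everything here is a hypothetical illustration: `RetrievalContract` and `RefundBundle` are invented names, and the field list mirrors the dimensions the article attributes to Nexus/NoQL-style queries (intent, filters, access policy, provenance, response shape, confidence, budget), not Pinecone’s actual API.

```python
from dataclasses import dataclass

# Hypothetical retrieval contract: what the agent must receive, and in
# what form, before it starts work. Field names are illustrative only.
@dataclass
class RetrievalContract:
    intent: str                # what the agent is trying to accomplish
    filters: dict              # e.g. {"region": "EU", "plan": "pro"}
    access_policy: str         # who or what may read the underlying sources
    require_provenance: bool   # must every returned field carry a source?
    response_shape: str        # "record" | "section" | "table" | "graph"
    min_confidence: float      # below this, escalate instead of answering
    token_budget: int          # hard cap on context spent on retrieval

# The refund-agent bundle from the article: each field forces a decision
# about source, governance, freshness, and fallback.
@dataclass
class RefundBundle:
    customer_record: dict
    plan: str
    region: str
    product_version: str
    purchase_history: list
    refund_policy: str
    refund_threshold: float
    prior_exceptions: list
    current_ticket: dict
    approved_language: str
    authorization_scope: str

contract = RetrievalContract(
    intent="decide refund eligibility for the current ticket",
    filters={"region": "EU", "plan": "pro"},
    access_policy="support-agent-read",
    require_provenance=True,
    response_shape="record",
    min_confidence=0.8,
    token_budget=4000,
)
print(contract.response_shape)
```

Writing the contract down this way makes the gaps visible before any vendor choice: a field you cannot fill (say, `prior_exceptions`) tells you which source, governance, or fallback question is still open.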
Chapter Summaries
- Intro — The memory problem: Pinecone, SAP, Google, Cloudflare, and Microsoft are all racing to fix agent memory; rediscovery can eat up to 85% of agent compute.
- Definitions: RAG is just a retrieval loop; vector search is one common kind but not the only kind. No silver bullet — expect to combine multiple retrieval types.
- Why chatbot-era RAG fails agents: agents do work (open tickets, cross-reference contracts), not just Q&A. They need assembled bundles, not three relevant chunks.
- Pinecone’s Nexus + NoQL: retrieval interface should carry intent, filters, access policy, provenance, response shape, confidence, and budget — more than similarity.
- Page Index: hierarchical document trees preserve structure; chunking loses meaning in filings and contracts. 98.7% on FinanceBench without embeddings. Principle: retrieval unit must match the work.
- SAP’s bets — Dreamy-O and Prior Labs: lakehouse + semantic layer + governed tabular data plus tabular foundation models (TabPFN). Enterprise truth lives in tables, not PDFs.
- Microsoft GraphRAG: some knowledge is inherently relational; chunks and tables don’t carry it.
- Four shapes of knowledge: prose, structured documents, tables, graphs. The real choice is which shapes your agent needs.
- Context windows aren’t the fix: Chroma’s context rot research — appropriate context beats maximum context.
- Three-step build playbook: contract first, then bundle, then primitives.
- Failure modes and overbuilding warnings; mine your own agent logs to discover what you need.
- Closing: the winners will be teams who think before shopping; deeper resources (retrieval contract checklist, worked bundles for support/legal/finance/code review) on Substack.
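The log-mining advice above (count retrieval calls before useful work starts, repeated source opens, and token spend on raw context) can be sketched as a single pass over run logs. The event schema here is invented for illustration; adapt it to whatever your agent framework actually emits.

```python
from collections import Counter

# Hypothetical run log: one event dict per step. The event types
# ("retrieve", "open_source", "act") are assumptions, not a real format.
run_log = [
    {"type": "retrieve", "source": "kb/refund-policy", "tokens": 1200},
    {"type": "open_source", "source": "crm/customer/42", "tokens": 300},
    {"type": "open_source", "source": "crm/customer/42", "tokens": 300},
    {"type": "retrieve", "source": "kb/refund-policy", "tokens": 1200},
    {"type": "act", "action": "issue_refund"},
]

def mine_run(log):
    """Count the telemetry signals the article suggests watching."""
    # Retrieval calls before the first useful action.
    calls_before_work = 0
    for event in log:
        if event["type"] == "act":
            break
        if event["type"] in ("retrieve", "open_source"):
            calls_before_work += 1
    # Redundant opens of the same source within one run.
    opens = Counter(e["source"] for e in log
                    if e["type"] in ("retrieve", "open_source"))
    repeated_opens = sum(n - 1 for n in opens.values() if n > 1)
    # Total token spend on raw retrieved context.
    context_tokens = sum(e.get("tokens", 0) for e in log)
    return {
        "retrieval_calls_before_work": calls_before_work,
        "repeated_source_opens": repeated_opens,
        "tokens_on_raw_context": context_tokens,
    }

stats = mine_run(run_log)
print(stats)
```

High `repeated_source_opens` points toward a compiled brief or cached bundle; high `tokens_on_raw_context` suggests the retrieval unit is too coarse for the work.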