Name: Cortrix
Author: Cortrix

From Source Material to Semantic Records

Cortrix turns documents, code, workflow events, and retrieval evidence into semantic records built for agentic systems.

Figure: Cortrix architecture. An agent-native semantic storage engine with an SPC pipeline, a hybrid query engine with reranker, memory, a storage layer, and connector paths.

The diagram is a system overview of the semantic storage layer: ingestion, hybrid retrieval, memory, source context, and agent-facing connector paths.

Docker or PG Extension Path

Choose the docker-compose, standalone server, or pgCortrix PostgreSQL extension path based on the stack you already operate.

Feedback Signals for Review

Observability and feedback signals help teams inspect what supported an agent answer. Automatic learning from those signals is on the roadmap and is not part of the current release.

Scoped Connector Paths

Designed to connect with MCP, CLI workflows, HTTP upload, filesystem watchers, and framework adapters as each public integration matures.

Defining a New Category

Agent data workflows need more than isolated retrieval calls. They need a storage layer that can keep semantic context, memory, and traceability together.

Agent-Native

Designed around agent data workflows: semantic processing, memory, and traceability are treated as shared storage concerns rather than scattered integration code.

Semantic Storage

Documents are parsed, chunked, embedded, and indexed in a shared semantic layer so agent workflows can query meaning and source context together.

Built to Fit Existing Stacks

Designed to work with existing databases, tools, and agent workflows while giving retrieval, memory, and audit a shared semantic layer.

One Engine. Agent Data Foundations.

Cortrix organizes semantic ingestion, hybrid retrieval with reranking, interaction memory, and source-level context into a shared engine design for REST, MCP, SDK, and PostgreSQL-backed workflows.

Semantic Processing Chain (SPC)

Prepare common document types such as PDF, Word, and Markdown through the v1.0 SPC path: Docling parsing with OCR fallback, parent-child chunking, NER + summary enrichment, embedding, and indexing under launch verification.

Core
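As a rough illustration of the parent-child chunking step named above, the sketch below splits text into larger parent chunks and then splits each parent into smaller children that point back to it. It is not the Cortrix SPC implementation: the `Chunk` type, the word-count splitting, and the size parameters are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chunk:
    chunk_id: str
    text: str
    parent_id: Optional[str]  # None marks a parent chunk


def split_words(text: str, size: int) -> List[str]:
    """Split text into pieces of at most `size` whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def parent_child_chunks(text: str, parent_size: int = 200,
                        child_size: int = 50) -> List[Chunk]:
    """Return each parent chunk followed by its children, linked by parent_id.

    Sizes are hypothetical defaults, not values taken from Cortrix.
    """
    chunks: List[Chunk] = []
    for p_idx, parent_text in enumerate(split_words(text, parent_size)):
        parent_id = f"p{p_idx}"
        chunks.append(Chunk(parent_id, parent_text, None))
        for c_idx, child_text in enumerate(split_words(parent_text, child_size)):
            chunks.append(Chunk(f"{parent_id}-c{c_idx}", child_text, parent_id))
    return chunks
```

The usual motivation for this layout is that small children are embedded and matched precisely, while the linked parent supplies wider context to the agent once a child matches.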

Hybrid Query + Reranker

Vector similarity (P-HNSW with WAL persistence) and BM25 keyword search are fused via RRF, then precision-reranked with bge-reranker-v2-m3 for agent-facing retrieval.

Core
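To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion (RRF) over a vector ranking and a BM25 ranking. Each document scores the sum of 1 / (k + rank) across the lists that contain it. The function name and the constant k = 60 (a value commonly used with RRF) are assumptions for illustration; the source does not state which constant Cortrix uses, and the reranking step is not shown.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion.

    k is an assumed smoothing constant; larger values flatten the
    difference between top and lower ranks.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["a", "b", "c"]  # hypothetical vector-search order
bm25_hits = ["b", "d", "a"]    # hypothetical keyword-search order
fused = rrf_fuse([vector_hits, bm25_hits])
```

Because RRF uses only ranks, it does not need vector distances and BM25 scores to share a scale, which is one reason it is a common choice for hybrid retrieval.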

AI Interaction Memory

Persistent conversation memory gives agent workflows session management, LLM-based fact extraction, typed memory, decay, per-user isolation, transparency APIs, and inspectable audit context.

Core
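One way to picture typed memory with decay and per-user isolation is the sketch below. It assumes exponential half-life decay and a flat in-memory list. The source does not say which decay function or storage layout Cortrix uses, so the `Memory` fields, the `recall` function, and the one-day half-life are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Memory:
    user_id: str
    kind: str          # hypothetical type label, e.g. "fact" or "preference"
    text: str
    weight: float      # importance assigned when the memory was written
    created_at: float  # seconds since epoch


def decayed_weight(m: Memory, now: float, half_life: float) -> float:
    """Halve the weight for every `half_life` seconds of age (assumed model)."""
    age = max(0.0, now - m.created_at)
    return m.weight * 0.5 ** (age / half_life)


def recall(memories: List[Memory], user_id: str, now: float,
           half_life: float = 86400.0, limit: int = 5) -> List[Memory]:
    """Return one user's memories only, strongest decayed weight first."""
    own = [m for m in memories if m.user_id == user_id]
    own.sort(key=lambda m: decayed_weight(m, now, half_life), reverse=True)
    return own[:limit]
```

With a one-day half-life, a two-day-old memory of weight 1.0 decays to 0.25 and ranks below a fresh memory of weight 0.6, while memories belonging to other users never appear in the result.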

Agent Observability

Session, trace, and agent headers help teams inspect what happened inside an agent workflow. Retrieval feedback learning remains on the roadmap; Cortrix does not currently claim automatic self-learning.

Core

Source-Level Traceability

Cortrix targets inspectable provenance beyond call indexes. Answers can link back to contributing chunks, documents, and conversation turns through citation tracing and retrieval attribution.

Core

Namespaces + Cross-NS Query

Use namespaces to isolate data by project, team, or tenant. Cross-namespace query keeps scatter-gather and unified ranking semantics explicit across agent workflows.

Core
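The scatter-gather pattern described above can be sketched as follows: the same query is sent to each namespace in parallel, then all hits are merged into a single ranking. This is an illustration only. The per-namespace search callables, the `Hit` tuple, and the assumption that scores are comparable across namespaces are all hypothetical and are not taken from the Cortrix API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Hit = Tuple[str, str, float]  # (namespace, doc_id, score)
SearchFn = Callable[[str], List[Tuple[str, float]]]


def cross_namespace_query(search_fns: Dict[str, SearchFn], query: str,
                          top_k: int = 10) -> List[Hit]:
    """Scatter the query to every namespace, gather hits, rank them together."""

    def run(ns: str) -> List[Hit]:
        # Tag each hit with its namespace so provenance survives the merge.
        return [(ns, doc_id, score) for doc_id, score in search_fns[ns](query)]

    with ThreadPoolExecutor() as pool:
        per_namespace = list(pool.map(run, search_fns))

    merged = [hit for hits in per_namespace for hit in hits]
    # Unified ranking: assumes scores share a scale across namespaces.
    merged.sort(key=lambda h: h[2], reverse=True)
    return merged[:top_k]
```

If scores are not comparable across namespaces, a rank-based merge or a shared reranking pass over the gathered hits is a common alternative to sorting on raw scores.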

Retrieval Quality Foundations

Parent-child chunking is part of the release quality foundation. Contextual retrieval, multi-query patterns, retrieval grading, and hypothetical-question indexing are upcoming evaluation paths rather than current features.

Core

pgCortrix Extension

For PostgreSQL-backed applications, pgCortrix is the path for bringing semantic storage closer to existing PostgreSQL infrastructure while preserving the option to run Cortrix as a separate service.

Integration

MCP Server

The Cortrix MCP server exposes 29 core tools plus 2 admin tools over stdio, so IDE agents can search, upload, manage memory, check task status, and trigger controlled admin imports.

Integration

Local Embedding Path (bge-m3)

ONNX Runtime integration with bge-m3 is the local embedding path for teams that want semantic search without making every query depend on an external embedding service.

Local

Connector Paths

Cortrix meets agent tools through MCP, REST, Python SDK, framework adapters, HTTP upload, directory watch, DB import, and PostgreSQL extension paths.

Paths

Python SDK + REST + Web UI

pip install cortrix is the Python client path, alongside HTTP APIs and a local dashboard for document management, search, and AI chat.

Interface

Run the Local Path

Start the server, verify readiness, then query the demo namespace through HTTP or the Python SDK.

1. Start Cortrix

Terminal

docker run -d -p 8420:8420 \
  -v cortrix-data:/data \
  cortrix/cortrix-demo:v1.0

2. Check readiness

Terminal

curl http://localhost:8420/api/v1/system/health/ready
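Because the container can take a moment to come up, a script may want to poll the readiness endpoint rather than call it once. The sketch below does that with the Python standard library. It assumes the endpoint answers with HTTP 200 once the server is ready, which the source does not state; the helper names, retry count, and delay are likewise hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

READY_URL = "http://localhost:8420/api/v1/system/health/ready"


def http_probe(url: str) -> bool:
    """Return True if the URL answers with HTTP 200 (assumed ready signal)."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def wait_until_ready(url: str = READY_URL,
                     probe: Callable[[str], bool] = http_probe,
                     attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll until the probe succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_until_ready()` before the first query avoids racing the server during startup. The `probe` parameter is injectable so the loop can be exercised without a running container.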

3. Semantic search

Terminal

curl -X POST http://localhost:8420/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{"namespaces": ["demo"], "query": "What is Cortrix?", "top_k": 10}'

Or use the Python SDK:

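Terminal

pip install cortrix

Python

from cortrix import Cortrix

# Connect to the local server started in step 1.
client = Cortrix(base_url="http://localhost:8420")

# Search the demo namespace and request up to 10 results.
results = client.search("demo", "What is Cortrix?", top_k=10)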

Built for Real Agent Data Workflows

Cortrix focuses on the shared data layer behind agent workflows: documents, retrieval context, memory, feedback signals, and traceability that are often stitched together with custom glue code.

Agent Workflow Storage

Agent workflows need persistent semantic memory and inspectable source context. Cortrix gives builders a shared storage foundation for both, which they can evaluate through its public release paths.

Reviewable Agent Context

Source-level traceability helps teams inspect which documents, chunks, and turns contributed to an answer or workflow step.

Workflow Orchestration

Use Cortrix as a shared semantic layer that MCP, REST, Python SDK, and framework-based workflows can call without each agent owning a separate retrieval stack.

Developer Knowledge Workflows

Use Cortrix as a semantic layer around documents, workflow context, and agent memory while keeping existing databases and tools in place.

How Cortrix Compares

A scope-aware view of a shared semantic storage layer versus maintaining many glue-code paths.

Capability	Cortrix	Typical Glue-Code Stack
Document Ingestion	Current OSS: SPC pipeline foundation	Often handled through separate parser and chunker components
Embedding	Current OSS: local embedding path (bge-m3, ONNX)	Often handled through an external model or service
Vector Search	Current OSS: P-HNSW in-process path	Often delegated to a separate vector database
Keyword Search	Current OSS: FTS5 + BM25 path	May require a separate search component
Hybrid Fusion	Current OSS: RRF fusion path	Often implemented as custom orchestration code
Reranker	Current OSS: bge-reranker-v2-m3 path	May require an external service or custom step
Cross-Namespace Query	Current OSS: scatter-gather + unified ranking path	Often handled in client-side orchestration
Advanced RAG	Coming next: retrieval-quality patterns by published scope	Often handled by framework plugins or custom pipelines
AI Memory	Current OSS: typed memory and extraction path	Often handled as a separate memory service or custom store
Source-Level Traceability	Current OSS: content-level citation and attribution context	Scope varies by stack and tracing implementation
Agent Observability	Current OSS: session, trace, and feedback signals	Often requires separate instrumentation
Retrieval Feedback Learning	Roadmap: automatic learning from feedback signals	Usually implemented as a custom feedback loop
PostgreSQL Integration	Current OSS: pgCortrix extension path	Often runs as a separate service
Workflow Connectors	Current OSS: MCP, REST, Python SDK, and framework adapter paths	Usually framework-specific
MCP Server	Current OSS: MCP server path	Availability depends on the chosen stack
Scope Boundary	Current OSS: Current OSS / Coming next / Roadmap labels	Usually spread across separate docs
Deployment	Docker / server / PG extension paths	Often multiple services

Join the Community

Cortrix is open source and community-driven. We welcome feedback, docs fixes, implementation discussion, and reproducibility work from builders.
