ContextFS: A Distributed Type-Safe Memory System for AI
Enabling persistent, structured knowledge across AI tools and sessions through a formally-specified type system with 22 memory categories and sub-50ms query latency.
Enabling Persistent, Structured Knowledge Across AI Tools and Sessions
Download: PDF | LaTeX Source
Abstract
We present ContextFS, a novel distributed, type-safe memory system designed for artificial intelligence applications. As AI assistants become increasingly integrated into software development workflows, the ephemeral nature of their context windows poses significant challenges for maintaining coherent, long-term knowledge.
ContextFS addresses this limitation through a unified memory layer that persists across tools, repositories, and sessions while enforcing type safety through a formal grammar based on dependent type theory.
The Memory Problem
Consider a software engineer working with an AI assistant on a large codebase. In the morning session, the assistant learns:
- The project uses JWT tokens with RS256 signing for authentication
- Database connections should be pooled with a maximum of 20 connections
- The team prefers functional programming patterns over object-oriented approaches
By afternoon, after a session restart, all of this knowledge is lost.
This illustrates three fundamental challenges:
- Temporal Discontinuity: Knowledge doesn't persist across sessions
- Cross-Tool Fragmentation: Different AI tools maintain separate context stores
- Structural Ambiguity: Untyped memories lack semantic categorization
Type System
ContextFS implements a type system with 22 memory categories. The tables below list 11 of them: five core types and six extended types.
Core Types
| Type | Description | Structured Data |
|---|---|---|
| FACT | Static facts, configurations | Optional |
| DECISION | Architectural decisions | {decision, rationale, alternatives[]} |
| PROCEDURAL | Step-by-step workflows | {steps[], prerequisites[]} |
| ERROR | Runtime errors, solutions | {error_type, message, resolution} |
| CODE | Code snippets, patterns | Optional |
Extended Types
| Type | Description |
|---|---|
| API | Endpoint definitions |
| SCHEMA | Data models |
| TEST | Test cases |
| CONFIG | Environment configs |
| WORKFLOW | Multi-step workflows |
| AGENT_RUN | LLM execution records |
Formal Specification
The type system is formalized using dependent type theory:
Memory := (id: UUID, content: String, type: MemoryType,
           structured_data: Schema(type), embedding: Vector[384])
Schema : MemoryType → Type
Schema(DECISION) = {decision: String, rationale: String, alternatives: List[String]}
Schema(ERROR) = {error_type: String, message: String, resolution: String}
Schema(PROCEDURAL) = {steps: List[String], prerequisites: Option[List[String]]}
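To make the dependency of `structured_data` on `type` concrete, here is a minimal Python sketch that checks it at construction time. It is an illustration under stated assumptions, not the ContextFS implementation: the names `SCHEMAS` and `validate_structured_data` are hypothetical, only the five core types are modeled, and the embedding is made optional so the example runs without an embedding model.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional
from uuid import UUID, uuid4


class MemoryType(Enum):
    # Only the five core types from the table above; the rest are omitted.
    FACT = "fact"
    DECISION = "decision"
    PROCEDURAL = "procedural"
    ERROR = "error"
    CODE = "code"


def _is_str(value) -> bool:
    return isinstance(value, str)


def _is_str_list(value) -> bool:
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


# Schema : MemoryType -> Type, written as field name -> (check, required).
# Hypothetical encoding of the three Schema(...) definitions above.
SCHEMAS = {
    MemoryType.DECISION: {
        "decision": (_is_str, True),
        "rationale": (_is_str, True),
        "alternatives": (_is_str_list, True),
    },
    MemoryType.ERROR: {
        "error_type": (_is_str, True),
        "message": (_is_str, True),
        "resolution": (_is_str, True),
    },
    MemoryType.PROCEDURAL: {
        "steps": (_is_str_list, True),
        "prerequisites": (_is_str_list, False),  # Option[List[String]]
    },
}


def validate_structured_data(memory_type: MemoryType, data: Optional[dict]) -> None:
    """Raise ValueError or TypeError if `data` does not fit the schema for `memory_type`."""
    schema = SCHEMAS.get(memory_type)
    if schema is None:
        return  # FACT and CODE: structured data is optional and unconstrained here
    if not isinstance(data, dict):
        raise TypeError(f"{memory_type.name} requires structured data")
    unknown = set(data) - set(schema)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    for name, (check, required) in schema.items():
        value = data.get(name)
        if value is None:
            if required:
                raise ValueError(f"missing field: {name}")
            continue
        if not check(value):
            raise TypeError(f"field {name!r} has the wrong type")


@dataclass(frozen=True)
class Memory:
    content: str
    type: MemoryType
    structured_data: Optional[dict] = None
    embedding: Optional[list] = None  # assumption: optional here, Vector[384] in the spec
    id: UUID = field(default_factory=uuid4)

    def __post_init__(self) -> None:
        validate_structured_data(self.type, self.structured_data)
        if self.embedding is not None and len(self.embedding) != 384:
            raise ValueError("embedding must have 384 dimensions")
```

With this sketch, constructing a `Memory` of type `DECISION` without a `rationale` raises a `ValueError`, while a `FACT` with no structured data is accepted. Python can only enforce this at runtime; a dependently typed language would reject the ill-formed record at compile time.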
Architecture
┌─────────────────────────────────────────────────────────────┐
│                       AI Tools Layer                        │
│   ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐    │
│   │  Claude  │  │   GPT    │  │  Gemini  │  │  Cursor  │    │
│   │   Code   │  │   Chat   │  │   CLI    │  │   IDE    │    │
│   └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘    │
│        └─────────────┴─────────────┴─────────────┘          │
│                             │                               │
│                        MCP Protocol                         │
│                             │                               │
│           ┌─────────────────┴─────────────────┐             │
│           │          ContextFS Core           │             │
│           │  ┌──────────┐   ┌──────────────┐  │             │
│           │  │Type      │   │Hybrid Search │  │             │
│           │  │Validation│   │(Vector+BM25) │  │             │
│           │  └──────────┘   └──────────────┘  │             │
│           └─────────────────┬─────────────────┘             │
│                             │                               │
│           ┌─────────────────┴─────────────────┐             │
│           │           Storage Layer           │             │
│           │  ┌──────────┐   ┌──────────────┐  │             │
│           │  │  SQLite  │   │   ChromaDB   │  │             │
│           │  │ (Schema) │   │ (Embeddings) │  │             │
│           │  └──────────┘   └──────────────┘  │             │
│           └───────────────────────────────────┘             │
└─────────────────────────────────────────────────────────────┘
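As an illustration of how the SQLite side of the storage layer could serve BM25-ranked keyword queries, the sketch below uses SQLite's FTS5 extension through Python's standard `sqlite3` module. The table name `memory_fts`, its columns, and both function names are hypothetical, and the sketch assumes the local SQLite build includes FTS5. The source does not describe the actual ContextFS schema.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS5 index from (memory_id, content) pairs."""
    conn = sqlite3.connect(":memory:")
    # memory_id is stored but not tokenized; only content is searchable.
    conn.execute(
        "CREATE VIRTUAL TABLE memory_fts USING fts5(memory_id UNINDEXED, content)"
    )
    conn.executemany(
        "INSERT INTO memory_fts (memory_id, content) VALUES (?, ?)", rows
    )
    return conn


def keyword_search(conn, query, limit=10):
    """Return memory ids matching any query term, best BM25 match first."""
    terms = query.split()
    if not terms:
        return []
    # Quote each term so punctuation is not parsed as FTS5 query syntax,
    # and join with OR so a memory matching any term is a candidate.
    match_expr = " OR ".join('"' + t.replace('"', '""') + '"' for t in terms)
    cursor = conn.execute(
        "SELECT memory_id FROM memory_fts "
        "WHERE memory_fts MATCH ? "
        "ORDER BY bm25(memory_fts) "  # lower bm25() means a better match
        "LIMIT ?",
        (match_expr, limit),
    )
    return [row[0] for row in cursor]
```

Joining terms with `OR` favors recall, which suits a keyword stage whose results are later fused with a semantic ranking. A stricter variant could join with `AND` instead.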
Performance
Evaluation across real-world codebases demonstrates:
| Metric | Result |
|---|---|
| Query Latency | < 50ms |
| Collection Size | > 10,000 memories |
| Embedding Generation | Local (no API) |
| Type Validation | < 1ms |
Hybrid Search
ContextFS combines semantic and keyword search:
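def hybrid_search(query: str, limit: int = 10):
    """Combine semantic and keyword rankings for a query.

    `chromadb`, `sqlite`, `encode`, and `merge_results` are placeholders
    that are not defined in this excerpt.
    """
    # Semantic search via embeddings; over-fetch so fusion has candidates
    semantic_results = chromadb.query(
        query_embeddings=encode(query),
        n_results=limit * 2,
    )
    # BM25 keyword search
    keyword_results = sqlite.fts_search(query, limit * 2)
    # Reciprocal Rank Fusion of the two ranked lists, truncated to `limit`
    return merge_results(semantic_results, keyword_results, limit)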
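The `merge_results` step is not shown in the source. A minimal sketch of Reciprocal Rank Fusion follows, assuming both inputs are lists of memory ids ordered best first with no duplicates inside a list. The smoothing constant `k = 60` is a commonly used default and is an assumption here, as is the tie-break by id.

```python
from collections import defaultdict


def merge_results(semantic_ids, keyword_ids, limit, k=60):
    """Fuse two ranked id lists with Reciprocal Rank Fusion.

    Each id scores the sum of 1 / (k + rank) over the lists it appears in,
    with ranks starting at 1. Ids found by both searches rise to the top.
    """
    scores = defaultdict(float)
    for ranking in (semantic_ids, keyword_ids):
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id so the output is deterministic.
    ordered = sorted(scores, key=lambda memory_id: (-scores[memory_id], memory_id))
    return ordered[:limit]
```

Because the fusion uses only rank positions, it needs no calibration between embedding distances and BM25 scores, which live on unrelated scales.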
Conclusion
ContextFS represents a foundational step toward giving AI systems persistent, structured memory capabilities that mirror human cognitive patterns while maintaining the rigor of formal type systems.
Citation:
@article{long2026contextfs,
  title={ContextFS: A Distributed Type-Safe Memory System for AI},
  author={Long, Matthew},
  journal={YonedaAI Research},
  year={2026}
}