ContextFS: A Distributed Type-Safe Memory System for AI

Enabling Persistent, Structured Knowledge Across AI Tools and Sessions

Download: PDF | LaTeX Source

Abstract

We present ContextFS, a distributed, type-safe memory system for artificial intelligence applications. As AI assistants become more deeply integrated into software development workflows, their ephemeral context windows make it difficult to maintain coherent, long-term knowledge.

ContextFS addresses this limitation through a unified memory layer that persists across tools, repositories, and sessions while enforcing type safety through a formal grammar based on dependent type theory.

The Memory Problem

Consider a software engineer working with an AI assistant on a large codebase. In the morning session, the assistant learns:

The project uses JWT tokens with RS256 signing for authentication
Database connections should be pooled with a maximum of 20 connections
The team prefers functional programming patterns over object-oriented approaches

By afternoon, after a session restart, all of this knowledge is lost.

This illustrates three fundamental challenges:

Temporal Discontinuity: Knowledge doesn't persist across sessions
Cross-Tool Fragmentation: Different AI tools maintain separate context stores
Structural Ambiguity: Untyped memories lack semantic categorization

Type System

ContextFS implements a type system with 22 memory categories. The tables below list eleven of them: five core types and six extended types.

Core Types

Type	Description	Structured Data
`FACT`	Static facts, configurations	Optional
`DECISION`	Architectural decisions	`{decision, rationale, alternatives[]}`
`PROCEDURAL`	Step-by-step workflows	`{steps[], prerequisites[]}`
`ERROR`	Runtime errors, solutions	`{error_type, message, resolution}`
`CODE`	Code snippets, patterns	Optional

Extended Types

Type	Description
`API`	Endpoint definitions
`SCHEMA`	Data models
`TEST`	Test cases
`CONFIG`	Environment configs
`WORKFLOW`	Multi-step workflows
`AGENT_RUN`	LLM execution records

Formal Specification

The type system is formalized using dependent type theory:

Memory := (id: UUID, content: String, type: MemoryType,
           structured_data: Schema(type), embedding: Vector[384])

Schema : MemoryType → Type
Schema(DECISION) = {decision: String, rationale: String, alternatives: List[String]}
Schema(ERROR) = {error_type: String, message: String, resolution: String}
Schema(PROCEDURAL) = {steps: List[String], prerequisites: Option[List[String]]}
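
The mapping Schema : MemoryType → Type can be approximated at runtime in a language without dependent types. The following is a minimal Python sketch, not the ContextFS implementation: the names `SCHEMAS`, `validate`, and `_matches` are hypothetical, the enum covers only the five core types, and types without a listed schema (FACT, CODE) are assumed to accept any structured data or none.

```python
from enum import Enum


class MemoryType(Enum):
    FACT = "fact"
    DECISION = "decision"
    PROCEDURAL = "procedural"
    ERROR = "error"
    CODE = "code"


# Runtime stand-in for Schema(type): field name -> (kind, required).
# "str" is a string; "list[str]" is a list of strings.
SCHEMAS = {
    MemoryType.DECISION: {
        "decision": ("str", True),
        "rationale": ("str", True),
        "alternatives": ("list[str]", True),
    },
    MemoryType.ERROR: {
        "error_type": ("str", True),
        "message": ("str", True),
        "resolution": ("str", True),
    },
    MemoryType.PROCEDURAL: {
        "steps": ("list[str]", True),
        "prerequisites": ("list[str]", False),  # Option[List[String]]
    },
}


def _matches(value, kind):
    """Check a single value against one of the two supported kinds."""
    if kind == "str":
        return isinstance(value, str)
    if kind == "list[str]":
        return isinstance(value, list) and all(isinstance(v, str) for v in value)
    raise ValueError(f"unknown kind: {kind}")


def validate(memory_type, structured_data):
    """Return a list of problems; an empty list means the data is valid."""
    schema = SCHEMAS.get(memory_type)
    if schema is None:
        # Assumption: types without a schema place no constraint on the data.
        return []
    if structured_data is None:
        return [f"{memory_type.name} requires structured_data"]
    problems = []
    for name, (kind, required) in schema.items():
        if structured_data.get(name) is None:
            if required:
                problems.append(f"missing field: {name}")
        elif not _matches(structured_data[name], kind):
            problems.append(f"field {name} should be {kind}")
    for name in structured_data:
        if name not in schema:
            problems.append(f"unexpected field: {name}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every schema violation in a single pass. A statically typed host language could move part of this check to compile time.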

Architecture

┌─────────────────────────────────────────────────────────────┐
│                     AI Tools Layer                          │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
│  │  Claude  │  │   GPT    │  │  Gemini  │  │  Cursor  │   │
│  │   Code   │  │   Chat   │  │   CLI    │  │   IDE    │   │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘   │
│       └─────────────┴─────────────┴─────────────┘          │
│                         │                                   │
│                    MCP Protocol                            │
│                         │                                   │
│       ┌─────────────────┴─────────────────┐                │
│       │         ContextFS Core            │                │
│       │  ┌──────────┐  ┌──────────────┐  │                │
│       │  │Type      │  │Hybrid Search │  │                │
│       │  │Validation│  │(Vector+BM25) │  │                │
│       │  └──────────┘  └──────────────┘  │                │
│       └─────────────────┬─────────────────┘                │
│                         │                                   │
│       ┌─────────────────┴─────────────────┐                │
│       │           Storage Layer            │                │
│       │  ┌──────────┐  ┌──────────────┐  │                │
│       │  │ SQLite   │  │  ChromaDB    │  │                │
│       │  │ (Schema) │  │ (Embeddings) │  │                │
│       │  └──────────┘  └──────────────┘  │                │
│       └───────────────────────────────────┘                │
└─────────────────────────────────────────────────────────────┘
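
The split between the two stores in the diagram can be illustrated with a small sketch. It is written under stated assumptions and does not reproduce the ContextFS storage code: the `MemoryStore` class, the table layout, and the column names are hypothetical, and a plain dictionary stands in for ChromaDB so the example runs with the standard library alone.

```python
import json
import sqlite3
import uuid


class MemoryStore:
    """Toy storage layer: SQLite holds typed rows, a dict holds embeddings."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE memories ("
            " id TEXT PRIMARY KEY,"
            " type TEXT NOT NULL,"
            " content TEXT NOT NULL,"
            " structured_data TEXT)"  # JSON text, NULL when absent
        )
        # Placeholder for the embedding store (ChromaDB in the diagram).
        self.vectors = {}

    def save(self, memory_type, content, embedding, structured_data=None):
        """Write one memory to both stores and return its id."""
        memory_id = str(uuid.uuid4())
        payload = None if structured_data is None else json.dumps(structured_data)
        with self.db:  # commits on success, rolls back on error
            self.db.execute(
                "INSERT INTO memories VALUES (?, ?, ?, ?)",
                (memory_id, memory_type, content, payload),
            )
        self.vectors[memory_id] = list(embedding)
        return memory_id

    def load(self, memory_id):
        """Reassemble a memory from both stores, or return None if unknown."""
        row = self.db.execute(
            "SELECT type, content, structured_data FROM memories WHERE id = ?",
            (memory_id,),
        ).fetchone()
        if row is None:
            return None
        memory_type, content, payload = row
        return {
            "id": memory_id,
            "type": memory_type,
            "content": content,
            "structured_data": None if payload is None else json.loads(payload),
            "embedding": self.vectors.get(memory_id),
        }
```

Both stores are keyed by the same id, so a hit from either search path can be resolved to the full typed record.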

Performance

Evaluation across real-world codebases demonstrates:

Metric	Result
Query Latency	< 50 ms
Collection Size	> 10,000 memories
Embedding Generation	Local (no API)
Type Validation	< 1 ms

Hybrid Search

ContextFS combines semantic and keyword search:

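def hybrid_search(query: str, limit: int = 10):
    # `chromadb`, `sqlite`, `encode`, and `merge_results` are handles and
    # helpers that are not defined in this excerpt.

    # Semantic search via embeddings. Each backend is asked for 2x `limit`
    # candidates, leaving room for fusion to rerank.
    semantic_results = chromadb.query(
        query_embeddings=encode(query),
        n_results=limit * 2,
    )

    # BM25 keyword search
    keyword_results = sqlite.fts_search(query, limit * 2)

    # Reciprocal Rank Fusion of the two ranked lists, cut to `limit`
    return merge_results(semantic_results, keyword_results, limit)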
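
The listing above does not show `merge_results`. As an illustration of the Reciprocal Rank Fusion step it names, the sketch below scores each id by the sum of 1 / (k + rank) over the lists in which it appears. The function name and signature are hypothetical, each input is assumed to be a best-first list of unique memory ids, and k = 60 is a commonly used default, not a value taken from ContextFS.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, limit=10, k=60):
    """Fuse several best-first rankings of ids into one list of ids."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top item of a list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ordered[:limit]]
```

Because only ranks are used, the cosine distances from the vector search and the BM25 scores from the keyword search never need to be put on a common scale.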

Conclusion

ContextFS represents a foundational step toward giving AI systems persistent, structured memory capabilities that mirror human cognitive patterns while maintaining the rigor of formal type systems.

Citation:

@article{long2026contextfs,
  title={ContextFS: A Distributed Type-Safe Memory System for AI},
  author={Long, Matthew},
  journal={YonedaAI Research},
  year={2026}
}
