How Vector Search Powers Semantic Memory
A deep dive into how ContextFS uses ChromaDB and sentence-transformers to enable semantic search across your memories—finding relevant context even when you don't remember the exact words.
Traditional keyword search has a fundamental problem: you need to know the exact words to find what you're looking for. But human memory doesn't work that way. We remember concepts, not keywords.
ContextFS uses vector search to enable semantic memory—finding relevant context based on meaning, not just matching words.
The Problem with Keyword Search
Imagine you saved a memory about using "JWT tokens for authentication." Later, you search for "login security"—a keyword search returns nothing, even though the concepts are closely related.
This is the vocabulary mismatch problem, and it's why traditional search often fails for knowledge management.
How Vector Search Works
Vector search solves this by converting text into numerical representations called embeddings. These embeddings capture semantic meaning, so similar concepts end up close together in vector space.
Step 1: Generate Embeddings
When you save a memory, ContextFS generates an embedding using sentence-transformers:
from sentence_transformers import SentenceTransformer

# all-MiniLM-L6-v2 produces 384-dimensional embeddings
model = SentenceTransformer('all-MiniLM-L6-v2')
embedding = model.encode("Use JWT tokens for authentication")
Step 2: Store in Vector Database
The embedding is stored in ChromaDB alongside the original text:
import chromadb

# An in-memory client keeps this example self-contained; ContextFS persists to disk
client = chromadb.Client()
collection = client.get_or_create_collection("memories")
collection.add(
    documents=["Use JWT tokens for authentication"],
    embeddings=[embedding],
    ids=["memory-123"],
)
Step 3: Search by Similarity
When you search, your query is also converted to an embedding, and ChromaDB finds the closest matches:
query_embedding = model.encode("login security")

results = collection.query(
    query_embeddings=[query_embedding],
    n_results=5,
)
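The returned dictionary holds parallel lists, grouped per query. Assuming the memory above was stored, you can read the matches back like this:

# Index 0 selects the results for our single query
for doc_id, doc in zip(results["ids"][0], results["documents"][0]):
    print(doc_id, doc)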
Even though "login security" shares no words with "JWT tokens for authentication," the embeddings are similar because the concepts are related.
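You can check this intuition directly with the same model. The following is an illustrative sketch, not ContextFS code: encode both phrases plus an unrelated one, then compare them with cosine similarity.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('all-MiniLM-L6-v2')
a = model.encode("Use JWT tokens for authentication")
b = model.encode("login security")
c = model.encode("chocolate cake recipe")

# Cosine similarity is higher for semantically related text
print(util.cos_sim(a, b))  # related concepts: noticeably higher
print(util.cos_sim(a, c))  # unrelated concepts: lower

The exact scores depend on the model version, but the related pair consistently scores higher.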
Hybrid Search: Best of Both Worlds
Pure vector search isn't perfect. Sometimes you want exact keyword matches. ContextFS uses hybrid search that combines:
- Vector similarity for semantic matching
- BM25 keyword search for exact matches
- Reciprocal Rank Fusion to merge results
This gives you the best of both approaches; a sketch of the fusion step is shown below.
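Here is a minimal Reciprocal Rank Fusion sketch. This is the standard formulation of RRF, not necessarily the exact code ContextFS ships: each document scores the sum of 1/(k + rank) across the ranked lists it appears in, with k = 60 as the conventional default.

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge several ranked lists of document ids into one ranking."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from vector search and BM25
vector_hits = ["memory-123", "memory-7", "memory-42"]
bm25_hits = ["memory-42", "memory-123", "memory-99"]
print(reciprocal_rank_fusion([vector_hits, bm25_hits]))

Because RRF uses only ranks, it sidesteps the problem of normalizing incomparable scores from BM25 and cosine similarity.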
Performance Optimizations
Vector search can be slow with large datasets. ContextFS uses several optimizations:
Local Embeddings
Embeddings are generated locally using sentence-transformers—no API calls needed. The all-MiniLM-L6-v2 model is small (80MB) and fast.
ChromaDB with SQLite
ChromaDB persists data locally, keeping metadata in SQLite and vectors in an HNSW index for fast approximate nearest-neighbor search.
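Setting this up looks roughly like the following; the storage path and collection name are illustrative, and "hnsw:space" selects the distance metric the index uses.

import chromadb

# Persist vectors and metadata on disk (path is illustrative)
client = chromadb.PersistentClient(path="contextfs-db")
collection = client.get_or_create_collection(
    name="memories",
    metadata={"hnsw:space": "cosine"},
)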
Caching
Query embeddings are cached to avoid regenerating them for repeated searches.
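A simple version of this can be built with functools.lru_cache; the function name and cache size here are illustrative, not ContextFS internals.

from functools import lru_cache
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')

@lru_cache(maxsize=1024)  # illustrative cache size
def embed_query(text: str):
    """Encode a query once and reuse the result for repeats."""
    return model.encode(text)

embed_query("login security")  # computes the embedding
embed_query("login security")  # served from the cache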
The Result
With these techniques, ContextFS achieves:
- Sub-50ms queries for typical memory collections
- Semantic matching that finds related concepts
- Zero API dependency for privacy and offline use
Try It Yourself
Install ContextFS and experience semantic memory:
pip install contextfs
# Save some memories
contextfs save "Use PostgreSQL for the database" -t decision
contextfs save "React with TypeScript for the frontend" -t decision
# Semantic search finds related memories
contextfs search "what database should I use"
contextfs search "frontend tech stack"
Want to learn more? Check out our documentation or source code.