How Vector Search Powers Semantic Memory
A deep dive into how ContextFS uses ChromaDB and sentence-transformers to enable semantic search across your memories—finding relevant context even when you don't remember the exact words.
Traditional keyword search has a fundamental problem: you need to know the exact words to find what you're looking for. But human memory doesn't work that way. We remember concepts, not keywords.
ContextFS uses vector search to enable semantic memory—finding relevant context based on meaning, not just matching words.
The Problem with Keyword Search
Imagine you saved a memory about using "JWT tokens for authentication." Later, you search for "login security"—a keyword search returns nothing, even though the concepts are closely related.
This is the vocabulary mismatch problem, and it's why traditional search often fails for knowledge management.
How Vector Search Works
Vector search solves this by converting text into numerical representations called embeddings. These embeddings capture semantic meaning, so similar concepts end up close together in vector space.
Step 1: Generate Embeddings
When you save a memory, ContextFS generates an embedding using sentence-transformers:
from sentence_transformers import SentenceTransformer

# all-MiniLM-L6-v2 produces 384-dimensional embeddings
model = SentenceTransformer('all-MiniLM-L6-v2')
embedding = model.encode("Use JWT tokens for authentication")
Step 2: Store in Vector Database
The embedding is stored in ChromaDB alongside the original text:
import chromadb

# An in-memory client keeps this example self-contained; ContextFS persists to disk
client = chromadb.Client()
collection = client.get_or_create_collection("memories")
collection.add(
    documents=["Use JWT tokens for authentication"],
    embeddings=[embedding],
    ids=["memory-123"],
)
Step 3: Search by Similarity
When you search, your query is also converted to an embedding, and ChromaDB finds the closest matches:
query_embedding = model.encode("login security")

results = collection.query(
    query_embeddings=[query_embedding],
    n_results=5,
)
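The returned dictionary holds parallel lists, grouped per query. Assuming the memory above was stored, you can read the matches back like this:

# Index 0 selects the results for our single query
for doc_id, doc in zip(results["ids"][0], results["documents"][0]):
    print(doc_id, doc)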
Even though "login security" shares no words with "JWT tokens for authentication," the embeddings are similar because the concepts are related.
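You can check this intuition directly with the same model. The following is an illustrative sketch, not ContextFS code: encode both phrases plus an unrelated one, then compare them with cosine similarity.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('all-MiniLM-L6-v2')
a = model.encode("Use JWT tokens for authentication")
b = model.encode("login security")
c = model.encode("chocolate cake recipe")

# Cosine similarity is higher for semantically related text
print(util.cos_sim(a, b))  # related concepts: noticeably higher
print(util.cos_sim(a, c))  # unrelated concepts: lower

The exact scores depend on the model version, but the related pair consistently scores higher.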
Hybrid Search: Best of Both Worlds
Pure vector search isn't perfect. Sometimes you want exact keyword matches. ContextFS uses hybrid search that combines:
- Vector similarity for semantic matching
- BM25 keyword search for exact matches
- Reciprocal Rank Fusion to merge results
This gives you the best of both approaches; a sketch of the fusion step is shown below.
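Here is a minimal Reciprocal Rank Fusion sketch. This is the standard formulation of RRF, not necessarily the exact code ContextFS ships: each document scores the sum of 1/(k + rank) across the ranked lists it appears in, with k = 60 as the conventional default.

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge several ranked lists of document ids into one ranking."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from vector search and BM25
vector_hits = ["memory-123", "memory-7", "memory-42"]
bm25_hits = ["memory-42", "memory-123", "memory-99"]
print(reciprocal_rank_fusion([vector_hits, bm25_hits]))

Because RRF uses only ranks, it sidesteps the problem of normalizing incomparable scores from BM25 and cosine similarity.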
Performance Optimizations
Vector search can be slow with large datasets. ContextFS uses several optimizations:
Local Embeddings
Embeddings are generated locally using sentence-transformers—no API calls needed. The all-MiniLM-L6-v2 model is small (80MB) and fast.
ChromaDB with SQLite
ChromaDB persists data locally, keeping metadata in SQLite and vectors in an HNSW index for fast approximate nearest-neighbor search.
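Setting this up looks roughly like the following; the storage path and collection name are illustrative, and "hnsw:space" selects the distance metric the index uses.

import chromadb

# Persist vectors and metadata on disk (path is illustrative)
client = chromadb.PersistentClient(path="contextfs-db")
collection = client.get_or_create_collection(
    name="memories",
    metadata={"hnsw:space": "cosine"},
)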
Caching
Query embeddings are cached to avoid regenerating them for repeated searches.
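A simple version of this can be built with functools.lru_cache; the function name and cache size here are illustrative, not ContextFS internals.

from functools import lru_cache
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')

@lru_cache(maxsize=1024)  # illustrative cache size
def embed_query(text: str):
    """Encode a query once and reuse the result for repeats."""
    return model.encode(text)

embed_query("login security")  # computes the embedding
embed_query("login security")  # served from the cache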
The Result
With these techniques, ContextFS achieves:
- Sub-50ms queries for typical memory collections
- Semantic matching that finds related concepts
- Zero API dependency for privacy and offline use
Try It Yourself
Install ContextFS and experience semantic memory:
pip install contextfs
# Save some memories
contextfs save "Use PostgreSQL for the database" -t decision
contextfs save "React with TypeScript for the frontend" -t decision
# Semantic search finds related memories
contextfs search "what database should I use"
contextfs search "frontend tech stack"
Want to learn more? Check out our documentation or source code.