northerndev 2 hours ago

I've been building local agents and found debugging the RAG retrieval step frustrating. I often couldn't tell why the LLM was pulling specific context chunks, and console logging vector arrays didn't help.

I built this tool as a standalone 'memory server' that sits on top of PostgreSQL with the pgvector extension. I wanted to avoid running a separate, specialized vector DB for smaller projects.
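For context, retrieval over pgvector is plain SQL. A nearest-neighbor query looks roughly like this (the table and column names here are my illustration, not necessarily what the tool uses internally):

```sql
-- Hypothetical schema: chunks(id, content, embedding vector(1536), created_at)
-- <=> is pgvector's cosine-distance operator; smaller distance = more similar.
SELECT id, content, 1 - (embedding <=> $1) AS similarity
FROM chunks
ORDER BY embedding <=> $1
LIMIT 5;
```

The dashboard essentially surfaces the rows and scores this kind of query returns, instead of leaving them buried in application logs.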

The main feature is the visualizer dashboard: it shows the retrieval process in real time, displaying the raw chunks, their similarity scores, and how 'recency decay' influences the final ranking.
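To make the recency-decay idea concrete, here is a minimal sketch of how such a ranking could work. The exponential decay and 30-day half-life are my assumptions for illustration, not the tool's actual defaults:

```typescript
// Sketch of recency-decayed ranking. The decay curve and half-life
// below are illustrative assumptions, not the project's real config.
interface Chunk {
  content: string;
  similarity: number; // cosine similarity from pgvector, in [0, 1]
  ageDays: number;    // days since the chunk was stored
}

// Exponential decay: weight halves every `halfLifeDays` days.
function recencyWeight(ageDays: number, halfLifeDays = 30): number {
  return Math.pow(0.5, ageDays / halfLifeDays);
}

// Final score = similarity discounted by age; sort descending.
function rank(chunks: Chunk[]): Chunk[] {
  return [...chunks].sort(
    (a, b) =>
      b.similarity * recencyWeight(b.ageDays) -
      a.similarity * recencyWeight(a.ageDays)
  );
}

// A slightly less similar but fresh chunk can outrank a stale one:
const ranked = rank([
  { content: "old", similarity: 0.9, ageDays: 90 },
  { content: "new", similarity: 0.8, ageDays: 1 },
]);
console.log(ranked[0].content); // "new"
```

Seeing exactly this kind of score adjustment per chunk, rather than just the final context, is what the dashboard is for.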

The backend is Node.js/TypeScript with Prisma, and everything runs via Docker Compose.

Current limitation: the default config relies on OpenAI for embedding generation. Adding local embedding support via Ollama is the next priority, so the entire stack can run fully offline.

The code is MIT licensed.