Tagged “rag”

1 post

Designing a RAG and embeddings backend
RAG demos are easy; RAG in production is hard, and the reason is always the same: retrieval, not generation, is the bottleneck. Here's how I designed the embeddings backend for a multi-agent system where the retrieval layer is the difference between agents that remember and agents that hallucinate.
Jun 7, 2026 | 6 min read