Retrieval Augmented Generation
· 4 min read
Notes on retrieval augmented generation (RAG) from Anthropic's course on Claude.

Retrieval Augmented Generation
Break large documents into chunks so that only the most relevant pieces are passed to the LLM. This reduces token traffic, latency, and cost: we augment the generation of text with a (pre-generation) retrieval step.

The LLM can then focus on the most relevant content and handle large documents at scale, as well as multiple documents at once.
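The flow described above can be sketched in a few lines. This is a minimal illustration, not a real implementation: the helper names are hypothetical, and scoring chunks by word overlap stands in for an actual search mechanism such as embedding similarity.

```python
# Hypothetical sketch of the RAG flow: retrieve the most relevant chunks,
# then prepend them to the prompt before generation.

def score(query: str, chunk: str) -> int:
    # Crude relevance signal: count words shared between query and chunk.
    return len(set(query.lower().split()) & set(chunk.lower().split()))

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Keep only the k most relevant chunks instead of the whole document.
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    # Augment generation with the retrieved context.
    context = "\n\n".join(retrieve(query, chunks))
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"
```

A real system would swap the word-overlap score for a vector search, but the shape of the pipeline is the same.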
Challenges with RAG:
- Requires preprocessing to chunk docs
- Need a search mechanism to find relevant chunks
- You may miss relevant chunks
- Choosing a chunking strategy is itself a design decision
Chunking Strategies

Size Based
Downsides:
- Can cut off text mid-sentence
- Each chunk lacks surrounding context
Overlapping adjacent chunks, so each repeats some text from its neighbours, can mitigate both issues.
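A minimal sketch of size-based chunking with overlap (the sizes are illustrative; real chunkers usually count tokens rather than characters):

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    # Each chunk starts `overlap` characters before the previous one ended,
    # so sentences cut at a boundary still appear whole in a neighbour.
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]
```

With `overlap=0` this degrades to plain fixed-size chunking, which is where the mid-sentence cutoff problem comes from.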

Structure Based
Using document structure such as headers, paragraphs, and sections. Works well for well-formatted documents like Markdown (maybe not mine).
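For Markdown, structure-based chunking can be as simple as splitting at headers so each chunk is one section. A rough line-based sketch (not a full Markdown parser; it ignores edge cases like `#` inside code fences):

```python
def chunk_by_headers(markdown: str) -> list[str]:
    # Start a new chunk whenever a header line ("#", "##", ...) appears.
    chunks, current = [], []
    for line in markdown.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Each chunk keeps its header, so unlike size-based chunks it carries its own context.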
