Annotations and LLMs
Notes and ideas on annotations and LLMs: using annotations in conjunction with LLM dev tooling, as well as generating annotation processors with LLMs.
I rarely write about caches and caching, so I thought I'd cover some basics on LLM caching. Covers inference and prompt caching.
Notes on the TensorZero LLM gateway. Covers templates, schemas, feedback, retries, evals, DICL, MIPRO, and model/prompt/inference optimization.
Some basics on Ollama. Includes details on quantization, vector DBs, model storage, model formats and Modelfiles.
Comparisons of OpenAI's service offering with Anthropic's. Includes context windows, rate limits and model optimization.
My notes on the design of Anthropic's APIs and some general design considerations for provider-based APIs and SDKs. Covers rate limiting, service tiers, SSE flow and some of the REST API endpoints.
Notes on "CAPO: Cost Aware Prompt Optimization" (June 2025) from the Munich Center for Machine Learning.
Notes on AI/LLM guardrails and safety patterns from a book on "Agentic Design Patterns" by one of Google's Distinguished Engineers, Antonio Gulli.
Notes on workflows and agents from Anthropic's course on Claude. Covers the evaluator-optimizer pattern, chaining, routing and parallelization.
Notes on Claude Code and Computer Use from Anthropic's course on Claude. Covers workflow patterns.