AI Engineer - Curated Playlist

June 27, 2026 · 10 min read

Sanjeev Sarda

High End Engineering

Here's a selection of interesting YouTube videos from the AI Engineer series.

AI Engineer

Diligent declaration: AI assisted article.

Building and Architecting AI Agents

Defying Gravity - Kevin Hou, Google DeepMind
- Why we built Google Antigravity, and a discussion of the future of agentic IDEs with Gemini 3.
How We Build Effective Agents: Barry Zhang, Anthropic
- Insights and strategies from Anthropic on how to architect and implement highly effective AI agents.
12-Factor Agents: Patterns of reliable LLM applications — Dex Horthy, HumanLayer
- An exploration of reliable patterns and the "12-factor" methodology applied to building LLM applications and agents.
Harness Engineering: How to Build Software When Humans Steer, Agents Execute — Ryan Lopopolo, OpenAI
- Guidelines and methodologies from OpenAI for constructing software systems where human developers guide the goals while autonomous agents execute the tasks.
The Multi-Agent Architecture That Actually Ships — Luke Alvoeiro, Factory
- Everyone's building multi-agent systems, but nobody agrees on how.
Architecting Agent Memory: Principles, Patterns, and Best Practices — Richmond Alake, MongoDB
- In the rapidly evolving landscape of agentic systems, memory management has emerged as a key pillar for building intelligent, context-aware AI Agents.
Building AI Agents that actually automate Knowledge Work - Jerry Liu, LlamaIndex
- Agents are all the rage in 2025, and every single B2B SaaS startup/incumbent promises AI agents that can "automate work" in some way. But how do you actually build this? The answer is twofold: 1.
3 ingredients for building reliable enterprise agents - Harrison Chase, LangChain/LangGraph
- It's easy to build a prototype of an agent, but hard to put an agent in production - especially in an enterprise setting.
Rise of the AI Architect — Clay Bavor, Cofounder, Sierra w/ Alessio Fanelli
- As the number of consumer-facing AI products grows, the most forward-leaning enterprises have created a new role: the AI Architect.
From Chaos to Choreography: Multi-Agent Orchestration Patterns That Actually Work — Sandipan Bhaumik
- One AI agent is a feature. Fifty agents is a distributed systems problem nobody's discussing.
Why (Senior) Engineers Struggle to Build AI Agents — Philipp Schmid, Google DeepMind
- A deleteItem endpoint is obvious to the developer who built it. An agent only sees the function schema and docstring.
Scaling Agents for Gen AI Products - Anju Kambadur, Bloomberg Head of AI Engineering
- Architectural lessons and practical advice from Bloomberg on scaling generative AI agent systems for diverse enterprise data products.
How we solved Context Management in Agents — Sally-Ann Delucia
- The naive solution is truncation. The obvious solution is summarization.
Ralph Loops: Build Dumb AI Loops That Ship — Chris Parsons, Cherrypick
- Dumb loops beat clever workflows. Most teams building with AI agents reach for multi-agent orchestration, planning graphs, and elaborate tool chains. Then they spend months debugging them.
Agents in Production: How OpenGov Built and Scaled OG Assist - Gabe De Mesa, OpenGov
- Come and learn about building AI Agents in production. Learn hands-on directly with the AI Agents team from OpenGov which powers AI workflows across thousands of state and local governments.
The Production AI Playbook: Deploying Agents at Enterprise Scale — Sandipan Bhaumik, Databricks
- A retail bank spent £85,000 over six months on a chatbot PoC that could not reach production. No one could explain why it was failing.
Your Attention Is the Bottleneck, Not Your Agents — Zack Proser, WorkOS
- Simon Willison fires up four parallel agents and is wiped out by 11am.
Building Agent Interfaces: Lessons from Chrome DevTools (MCP) for Agents — Michael Hablich, Google
- Chrome DevTools MCP shipped with one tool: debug_webpage. Agents failed silently because they couldn't compose behaviors.
Why your agents need decision traces, not just documents — Zach Blumenfeld, Neo4j
- A knowledge base tells a financial analyst agent the risk factors.
Context Graphs for Explainable, Decision-Aware AI Agents — Andreas Kollegger & Zaid Zaim, Neo4j
- Prescribing drug X is correct 99% of the time for symptom Y. For the 1% where it is fatal, statistical reasoning does not help you.

AI Models and Research

Everything I Learned Training Frontier Small Models — Maxime Labonne, Liquid AI
- A new class of small models is emerging with the ability to reliably follow instructions and call tools while running on-device under 1 GB of memory.
How Google DeepMind is researching the next Frontier of AI for Gemini — Raia Hadsell, VP of Research
- In this presentation, Raia Hadsell, VP of Research at Google DeepMind and AI Ambassador for the United Kingdom, opens AIE Europe and explores what's open in Frontier AI and the future of intelligen...
Gemma 4 Deep Dive — Cassidy Hardin, Researcher, Google DeepMind
- Open models are getting smaller, faster, and far more capable.
Sovereign Escape Velocity: Ownership w Open Models — Gus Martins, & Ian Ballantyne, Google DeepMind
- Gemma 4's 31B model sits fourth on the LM Arena open model leaderboard. The models around it are at least twice as large; some are 20 times larger. It runs on a single GPU.
Let's go Bananas with GenMedia — Guillaume Vernade, Google DeepMind
- Guillaume Vernade from Google DeepMind takes a public domain book and runs it through the full gen media stack live.

Developer Productivity and Workflows

No Vibes Allowed: Solving Hard Problems in Complex Codebases – Dex Horthy, HumanLayer
- It seems pretty well-accepted that AI coding tools struggle with real production codebases.
Does AI Actually Boost Developer Productivity? (100k Devs Study) - Yegor Denisov-Blanch, Stanford
- Forget vendor hype: Is AI actually boosting developer productivity, or just shifting bottlenecks? Stop guessing.
Agentic Engineering: Working With AI, Not Just Using It — Brendan O'Leary
- Coding agents are quickly moving from novelty to necessity, but most teams are still stuck between demos that feel magical and systems that break down in real-world engineering environments.
Moving away from Agile: What's Next – Martin Harrysson & Natasha Maniar, McKinsey & Company
- Most enterprises are not capturing much value from AI in software dev to date (at least relative to the potential).
AI Engineering at Jane Street - John Crepezzi
- Programmers using mainstream languages enjoy a wealth of intelligent coding assistants and tools.
AI changes Nothing — Dax Raad, OpenCode
- Everyone says AI changes everything. Dax Raad argues that when it comes to building a winning product, AI changes nothing.
Dispatch from the Future: building an AI-native Company – Dan Shipper, Every, AI & I
- The central thesis is that there is a "10x difference" between an organization where 90% of engineers use AI versus one where 100% do.
Making Codebases Agent Ready – Eno Reyes, Factory AI
- Agents are eating software engineering. Yet teams deploying these tools face mixed results.
Spec-Driven Development: Agentic Coding at FAANG Scale and Quality — Al Harris, Amazon Kiro
- In the AI coding era, we have powerful tools, but tools still require honing to work effectively.
The Agent Development Life Cycle — Zack Reneau-Wedeen, Sierra
- Compared to traditional software, LLMs are creative, flexible, unpredictable, expensive, and slow. A new kind of software demands a new approach to development.
Can you prove AI ROI in Software Eng? (Stanford 120k Devs Study) – Yegor Denisov-Blanch, Stanford
- You’re investing millions in AI for software engineering.
BDD, ADR, PRD, WTF: Capturing Decisions for Humans and AI Alike — Michal Cichra, Safe Intelligence
- "One thing harder than reading AI code is reading AI tests.

LLM Capabilities and RAG

Building Production-Ready RAG Applications: Jerry Liu
- Large Language Models (LLMs) are starting to revolutionize how users can search for, interact with, and generate new content.
RAG Agents in Prod: 10 Lessons We Learned — Douwe Kiela, creator of RAG
- The latest generation of LLMs is demonstrating impressive test time reasoning capabilities.
RAG is dead, right?? — Kuba Rogut, Turbopuffer
- Cursor added semantic search and measured a 24% increase in answer accuracy on their composer model, a 2.6% gain in code retention in large codebases, and a 2.2% drop in dissatisfied user requests.
Personalization in the Era of LLMs - Shivam Verma, Spotify
- Spotify represents Ariana Grande and Bruno Mars as sequences of six tokens. The first two are shared because both are pop artists. The remaining tokens diverge to capture what makes each distinct.

Coding Agents in Practice

How Claude Code Works - Jared Zoneraich, PromptLayer
- Deep dive into what we have independently figured out about the architecture and implementation of Claude's code generation capabilities. Not officially endorsed by Anthropic.
The Art & Science of Benchmarking Agents — Vincent Chen, Snorkel AI
- ARC AGI 3 launched a few weeks before this talk with every task human solvable and frontier models under 1%.
SWE-rebench: Lessons from Evaluating Coding Agents — Ibragim Badertdinov, Nebius
- Claude Code solved SWE rebench tasks by reading git history to find the solution patch. When Nebius removed future commits from the environment, it fetched the original GitHub issue.
How Lovable self-improves every hour — Benjamin Verbeek, Lovable
- Within the first hour of launching the vent tool, the agent filed 20 complaints about a silent file copy failure. The team checked: the tool worked fine.
A Piece of Pi: Embedding The OpenClaw Coding Agent In Your Product — Matthias Luebken, Tavon
- OpenClaw feels like it's learning: it discovers capabilities, stitches tools together, builds solutions it wasn't explicitly taught.

UI, Web and Interaction

Beyond Components: Designing Generative UI for MCP Apps — Ruben Casas, Postman
- Ruben Casas from Postman prompted a model to rewrite his blog. It built a search box with a blur animation and accessibility out of the box, without being asked.
tldraw.computer - Steve Ruiz, tldraw
- Learn about tldraw's latest experiments with AI on an infinite canvas.
Paperclip: Open Source Human Control Plane for AI Labor — Dotta Bippa
- Curator note: Dotta is anonymous, so we asked him to submit with just an avatar. He did amazing! Paperclip enables open source orchestration for zero-human companies.
The agent-ready web: Simplify user actions with WebMCP — Tara Agyemang, Google
- Buying two concert tickets costs an AI agent the entire DOM, the accessibility tree, a screenshot, pixel coordinate math, and then a click that might miss because an ad just loaded and shifted the ...
Building Interactive UIs in VS Code with MCP Apps — Marlene Mhangami & Liam Hampton, GitHub
- The demo profiles a Go app running bubble sort and Fibonacci and the result renders as an interactive flame graph directly inside the VS Code chat window. Not a link. Not a text summary.
