#local-inference

inspiration
Embeddings Are Not Optional

Every RAG pipeline, semantic search index, and similarity feature runs on embeddings. The generation model gets the credit. The embedding model does the work.
May 31, 2026
booklet
OpenCode with Local Models: Pointing Your Coding Agent at Your Own Inference

OpenCode is a terminal-first AI coding agent. It expects cloud APIs by default. This booklet shows how to wire it to Ollama, vLLM, or any OpenAI-compatible local endpoint — and what breaks when you do.
May 27, 2026
