<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>SourceShift</title><description>Engineering notes from the SourceShift team — bug post-mortems, LLM gateway scars, and the occasional working theory. Drafted from real production fires at blog.sourceshift.io.</description><link>https://blog.sourceshift.io/</link><item><title>The chapter that forgot why it existed</title><link>https://blog.sourceshift.io/p/the-chapter-that-forgot-why-it-existed/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/the-chapter-that-forgot-why-it-existed/</guid><description>When an LLM agent generates text without a world model, it forgets its own goal mid-task. The fix is not more context.</description><pubDate>Mon, 08 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Three LLM judges, but really 1.5: why a same-family panel collapses to noise</title><link>https://blog.sourceshift.io/p/three-llm-judges-but-really-1-5-why-a-same-family-panel-collapses-to-noise/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/three-llm-judges-but-really-1-5-why-a-same-family-panel-collapses-to-noise/</guid><description>I needed to settle a disagreement between two LLM reviews of the same design doc. The clean answer was a 3-judge panel. The honest answer is that the panel I built is one rubric-design move away from being a beautifully-instrumented yes-man.</description><pubDate>Thu, 04 Jun 2026 20:00:00 GMT</pubDate></item><item><title>MiniMax-M3: The tier-2 coder that found its niche</title><link>https://blog.sourceshift.io/p/minimax-m3-the-tier-2-coder-that-found-its-niche/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/minimax-m3-the-tier-2-coder-that-found-its-niche/</guid><description>A frontier-coding model with a 1M context window, a new sparse attention mechanism, and a $0.30/M price tag landed on June 1st. We routed five real epics through it the same day. Here is what stuck.</description><pubDate>Mon, 01 Jun 2026 20:00:00 GMT</pubDate></item><item><title>How we built our user-profile system — the canonical six-layer pattern behind every personalized LLM call</title><link>https://blog.sourceshift.io/p/how-we-built-our-user-profile-system-the-canonical-six-layer-pattern-behind-every-personalized-llm-call/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/how-we-built-our-user-profile-system-the-canonical-six-layer-pattern-behind-every-personalized-llm-call/</guid><description>A natural-language paragraph the AI silently reads on every call. Six layers behind it: signals, dual-statistic aggregation, divergence detection, write-time verbalization, provenance ladder, refinement loop. Grounded in four 2025 papers that converged on the same shape.</description><pubDate>Wed, 27 May 2026 00:00:00 GMT</pubDate></item><item><title>We Ran a 3-Source Bug Hunt. Then We Realised Our Validators Were All Claude.</title><link>https://blog.sourceshift.io/p/we-ran-a-3-source-bug-hunt-then-we-realised-our-validators-were-all-claude/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/we-ran-a-3-source-bug-hunt-then-we-realised-our-validators-were-all-claude/</guid><description>Multi-agent code review converged on a confident verdict. The literature had a name for why we should not believe it.</description><pubDate>Tue, 26 May 2026 00:00:00 GMT</pubDate></item><item><title>Why one agent isn&apos;t enough to find your bugs</title><link>https://blog.sourceshift.io/p/why-one-agent-isnt-enough-to-find-your-bugs/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/why-one-agent-isnt-enough-to-find-your-bugs/</guid><description>Four specialists at ρ ≤ 0.25 beat one generalist by 40 percentage points. Our five-agent swarm hit the wall anyway. Here is what the papers actually require.</description><pubDate>Tue, 26 May 2026 00:00:00 GMT</pubDate></item><item><title>The agent is not a transaction</title><link>https://blog.sourceshift.io/p/the-agent-is-not-a-transaction/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/the-agent-is-not-a-transaction/</guid><description>Pause, resume, and mid-flight steering for long-horizon agent runs. The 2026 literature just named the stream paradigm we built by hand.</description><pubDate>Mon, 18 May 2026 19:30:00 GMT</pubDate></item><item><title>Piaget for prompt agents: why our long-form memory borrows from constructivist psychology</title><link>https://blog.sourceshift.io/p/piaget-for-prompt-agents-why-our-long-form-memory-borrows-from-constructivist-psychology/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/piaget-for-prompt-agents-why-our-long-form-memory-borrows-from-constructivist-psychology/</guid><description>Composing CAM + CAMEL + FadeMem so a book-writing agent has structured memory, healthy decay, and no quiet bias amplification.</description><pubDate>Sat, 16 May 2026 16:00:00 GMT</pubDate></item><item><title>Subagents as a context-budget primitive</title><link>https://blog.sourceshift.io/p/subagents-as-a-context-budget-primitive/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/subagents-as-a-context-budget-primitive/</guid><description>A subagent is not a workflow node. It is a budget envelope. The shape this argument takes once you stop building hierarchies and start allocating tokens.</description><pubDate>Fri, 15 May 2026 08:00:00 GMT</pubDate></item><item><title>Two prompt frameworks, one runtime: how we adopted BAML without giving up our cost ledger</title><link>https://blog.sourceshift.io/p/two-prompt-frameworks-one-runtime-how-we-adopted-baml-without-giving-up-our-cost-ledger/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/two-prompt-frameworks-one-runtime-how-we-adopted-baml-without-giving-up-our-cost-ledger/</guid><description>BAML wants to own the wire. Our harness already does. We ran them side-by-side in &quot;modular mode&quot;: BAML for render and parse, the harness for resolution, telemetry, and cost. Here is why and how — and why the 2026 burden-allocation literature says it was the principled choice, not a pragmatic compromise.</description><pubDate>Wed, 13 May 2026 10:23:39 GMT</pubDate></item><item><title>What 170 papers agreed on about deep research agents</title><link>https://blog.sourceshift.io/p/what-170-papers-agreed-on-about-deep-research-agents/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/what-170-papers-agreed-on-about-deep-research-agents/</guid><description>Five surveys, one consensus shape: a four-stage pipeline, three taxonomy splits, six recurring failure modes. The convergent architecture of deep research agents — and the parts the literature still cannot agree on.</description><pubDate>Sun, 10 May 2026 12:00:00 GMT</pubDate></item><item><title>Mini-ork: A year of autonomous parallel feature delivery on a solo-founder codebase</title><link>https://blog.sourceshift.io/p/mini-ork-a-year-of-autonomous-parallel-feature-delivery-on-a-solo-founder-codebase/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/mini-ork-a-year-of-autonomous-parallel-feature-delivery-on-a-solo-founder-codebase/</guid><description>How a small orchestration loop wrapped around Claude Code grew into a multi-track delivery system with measurable cost, reliability, and throughput wins — anchored in the 2026 multi-agent literature.</description><pubDate>Mon, 04 May 2026 14:10:32 GMT</pubDate></item><item><title>Probe before dispatch: the routing pattern we built without knowing it had a name</title><link>https://blog.sourceshift.io/p/probe-before-dispatch-the-routing-pattern-we-built-without-knowing-it-had-a-name/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/probe-before-dispatch-the-routing-pattern-we-built-without-knowing-it-had-a-name/</guid><description>Five months of manual probing turned into seven shipped features. The literature had a name for what we were doing. We missed it for half a year.</description><pubDate>Sat, 02 May 2026 09:00:00 GMT</pubDate></item><item><title>Our prompt canary was lying to us</title><link>https://blog.sourceshift.io/p/our-prompt-canary-was-lying-to-us/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/our-prompt-canary-was-lying-to-us/</guid><description>A 5% A/B that hid a 1.8× cost regression, and the two 2026 papers that named the fix: multi-objective Thompson sampling with a calibration gate.</description><pubDate>Fri, 24 Apr 2026 12:43:05 GMT</pubDate></item><item><title>The paper that proved our 5 lines of code were optimal</title><link>https://blog.sourceshift.io/p/the-paper-that-proved-our-5-lines-of-code-were-optimal/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/the-paper-that-proved-our-5-lines-of-code-were-optimal/</guid><description>A 2025 paper formalized our 8-tier style profile chain as a laminar matroid. The greedy resolver we already had was the right answer.</description><pubDate>Wed, 22 Apr 2026 10:45:36 GMT</pubDate></item><item><title>We stopped treating context like application logic</title><link>https://blog.sourceshift.io/p/we-stopped-treating-context-like-application-logic/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/we-stopped-treating-context-like-application-logic/</guid><description>Six tables, three plug-in layers, one compose call. The substrate every block-shaped feature on LibWit now plugs into — and the reversibility lens that decided what to lock on day one.</description><pubDate>Wed, 15 Apr 2026 10:00:00 GMT</pubDate></item><item><title>Our prompts stopped being code</title><link>https://blog.sourceshift.io/p/our-prompts-stopped-being-code/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/our-prompts-stopped-being-code/</guid><description>How we built a prompt harness — registration, version control, four-tier resolution, execution ledger, feedback loop — and what the 2026 literature has been calling the same shape.</description><pubDate>Mon, 05 Jan 2026 17:10:40 GMT</pubDate></item><item><title>The simplest survivable form of chat memory</title><link>https://blog.sourceshift.io/p/the-simplest-survivable-form-of-chat-memory/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/the-simplest-survivable-form-of-chat-memory/</guid><description>Two prompts, one Postgres column, a 6-message threshold. How our chat sessions keep coherence past the context window without hierarchical buffers or vector search.</description><pubDate>Sun, 04 Jan 2026 19:00:00 GMT</pubDate></item><item><title>We built attractor-basin memory before the paper named the problem</title><link>https://blog.sourceshift.io/p/we-built-attractor-basin-memory-before-the-paper-named-the-problem/</link><guid isPermaLink="true">https://blog.sourceshift.io/p/we-built-attractor-basin-memory-before-the-paper-named-the-problem/</guid><description>Mid-2025: context management for LLM agents was a vector DB plus message-history glue. We built ContextNest&apos;s attractor-basin substrate as the organization layer that was missing — and the 2026 paper that later named the failure mode we had been heading off makes the bet legible.</description><pubDate>Sun, 20 Jul 2025 09:14:38 GMT</pubDate></item></channel></rss>