source·shift
← all posts

The paper that proved our 5 lines of code were optimal

Every time a value can come from more than one place, an argument starts about which place wins. Settings resolvers, feature flag stacks, permission cascades, multi-tier configuration overrides: someone on the team has to remember the precedence order and explain it at every standup. The if-else chain that encodes the order is correct, but nobody can name why it is correct.

If you have shipped any agent with multiple sources of any conditioning signal (system prompts, tool palettes, memory retrieval policies, per-call configuration), you have this problem.

Inside LibWit, our version of it looks like an eight-tier chain that picks one style profile per LLM call: block override, category override, parent-chain inheritance, document setting, user global, a semantically similar profile from elsewhere in the corpus, a curated exemplar, the system default. For a year we ran it as the hand-rolled fallback chain you would expect, and every time someone proposed adding a tier the argument restarted about where it fit. Then a December 2025 paper proved the cascade was already optimal, and named the algorithm we had been writing without noticing.

Where this problem shows up in agentic AI

The “many candidates, one winner, complicated precedence” problem is everywhere once you start looking. The names are different but the shape doesn’t change.

System concernThe tiers that competeA framework where you’ve seen it
System prompt assemblydeveloper instructions, user instructions, conversation-pinned context, current tool-call context, runtime personaAnthropic Claude API’s developer/user message hierarchy; OpenAI Assistants v2’s assistant + thread + run instructions
Tool palette selectionglobally enabled tools, role-restricted tools, user-disabled tools, capability-matched tools, scope-limited tools per turnCrewAI’s role+tool binding; AutoGen’s tool registries; LangGraph node-level tools
Memory retrieval policyepisodic memory, semantic memory, skill memory, conversation buffer, scratchpad notes, knowledge graphLetta (formerly MemGPT)‘s memory blocks; Mem0’s hierarchical stores; Cognee’s tiered memory
Per-call LLM configurationmodel choice, temperature, max tokens, response format, safety mode (each can come from per-task, per-user, per-tenant, or system default)OpenAI Assistants v2’s instruction cascade; LangChain RunnableConfig merging
Safety filter stackconstitutional rules, developer-supplied policies, user opt-outs, jurisdiction-specific filters, post-hoc moderationAnthropic’s Constitutional AI layers; Llama Guard / NeMo Guardrails composition
Multi-agent persona routingrole-restricted dispatch, capability-required dispatch, cost-tier preference, fallback agentCrewAI’s hierarchical process; AutoGen’s GroupChat; LangGraph supervisor patterns

The pain reads the same way in all six. Adding a new layer kicks off an argument about where it sits in the precedence order. Edge cases produce surprising winners (“why did the assistant-level instruction lose to the run-level one?”). Adding observability is a project, because nothing in the cascade is named; the if-else chain doesn’t know it’s a cascade. The cascade pattern is universal and universally unnamed, which means every team has reached for a solution independently, and the range of answers is wide enough to be worth mapping.

How others handle it today

A quick survey of approaches in the wild, ordered roughly by formality.

ApproachWhere you’ll see itStrengthWeakness
Ad-hoc if-else cascademost production agentstrivial to write, easy to readevery change is an argument; no proof the order is correct; no observability of which branch wins
Dict-merge with right-overrides-leftLangChain RunnableConfig, OpenAI Assistants instruction mergeone line of code; works for additive fieldswrong for any field where the “more specific” source isn’t always the right winner; silently loses information
Per-field reducersLangGraph state reducers; Redux-style storesexplicit semantics per field; testablecombinatorial complexity as fields grow; no shared framework for “this field’s reducer is a precedence chain”
Importance + recency rankingLetta memory blocks; many RAG ranking pipelinesadapts to access patterns; one knob to tuneheuristic, no optimality story; sensitive to embedding noise as the store grows
Dense retrieval over the unionMem0, vanilla RAGuniform code path; scales with embeddingsprecedence dissolves into similarity; can’t express “user-level beats document-level even at lower cosine”
DAG-based config compositionDagger, Nix, Bazel rulesformal, reproducible, type-safeheavyweight; only worth it when config is itself the product
Constitutional layersAnthropic Claude’s safety stackstrict precedence is the point; transparent to the userdomain-specific; not a general resolution framework

The structure, a laminar matroid, has been sitting there unnamed: most of the agentic-AI world is still running ad-hoc if-else cascades, and even the frameworks that are explicit about precedence (Letta, the Anthropic safety stack) don’t cite the underlying mathematics.

The paper that names it

The setup Platnick, Alirezaie, and Rahnama describe in Structured Personalization: Modeling Constraints as Matroids for Data-Minimal LLM Agents is the same one we had. Personalizing an agent means picking which user-specific facts to condition on. The utility of more facts diminishes (this is the submodular property, where each new fact helps less than the last). But the choice is constrained: some facts depend on others, some categories cap how many can be selected, some hierarchies allow at most one child per parent.

The paper’s contribution is the recognition that this whole space of constraints (quotas, hierarchies, dependencies) has a name. It’s a laminar matroid, a structure mathematicians have been studying since the 1930s. Once you recognize that, three things follow:

  1. Greedy works. Sort the candidates by marginal utility, pick the first one that doesn’t violate any constraint, repeat. Plain greedy gives you at least 1/2 of the optimal. The continuous-greedy variant gives you 1 − 1/e ≈ 63.2%.
  2. The proof is older than most of CS. Edmonds wrote the matroid-greedy theorem in the 1960s, and submodular maximization is a well-trodden field. We didn’t have to prove anything about our resolver; we just had to recognize what it was.
  3. The constraint is the load-bearing part of the design, not the algorithm. Most engineering effort should go into modeling the constraints (which tier can override which, which fallbacks are valid) rather than into clever resolution logic.

The third point is the one that changed how we think about the resolver.

How we applied it: 8 tiers, in priority order

We took the framing and pointed it at the chain we already had, which by then had grown to eight tiers. The priorities are integers; lower wins. The flowchart below maps these priorities to color-coded paths.

flowchart TD
  REQ([style request]) --> PARALLEL[parallel lookups<br/>5 base tiers]
  PARALLEL --> T1[1. block override]
  PARALLEL --> T2[2. category override]
  PARALLEL --> T3[3. parent-chain walk]
  PARALLEL --> T4[4. document setting]
  PARALLEL --> T5[5. user global]

  T1 & T2 & T3 & T4 & T5 --> ANY{any non-null?}
  ANY -- yes --> PICK[priority-sort, pick first]
  ANY -- no --> LAZY[lazy fallback path<br/>embed selected text]
  LAZY --> T6[6. semantic neighbor<br/>user's own past profile]
  LAZY --> T7[7. curated exemplar]
  T6 & T7 --> GATE{exemplar similarity<br/>>= 0.55?}
  GATE -- yes --> PICK
  GATE -- no --> T8[8. system default]
  T8 --> PICK
  PICK --> OUT([resolved profile])
  PICK --> METRIC[(record tier_win,<br/>resolution_duration)]

  classDef gate fill:#7a5a1f,stroke:#fff,color:#fff
  classDef alloc fill:#1f3a7a,stroke:#fff,color:#fff
  classDef serve fill:#1f5e3a,stroke:#fff,color:#fff
  classDef store fill:#4a1f7a,stroke:#fff,color:#fff
  class ANY,GATE gate
  class PARALLEL,PICK,LAZY alloc
  class REQ,OUT serve
  class T1,T2,T3,T4,T5,T6,T7,T8 store
  class METRIC alloc

The blue is the algorithm. The yellow is the constraint logic (which the paper says is the load-bearing part). The purple is the candidate pool. The green is what the caller actually receives.

Two things in this diagram weren’t there before the matroid framing made them obvious:

With five independent lookups and a single lazy gate holding the two expensive tiers, the candidate pool already knows its own shape, and when the constraints are that clean, the selection code is almost nothing.

The algorithm fits in five lines

Once the tier chain is modeled correctly, the resolver is the smallest function in the file. Sort by priority. Walk the sorted list. Return the first non-null profile. Mark it selected for the audit trail.

function resolveGreedy(tiers, purpose) {
  const sorted = [...tiers].sort((a, b) => a.priority - b.priority);
  for (const tier of sorted) {
    if (tier.profile !== null) {
      tier.selected = true;
      return tier.profile;
    }
  }
  return getDefaultProfileForPurpose(purpose);
}

Under a laminar matroid (at most one profile per tier, with strict precedence) greedy is provably optimal; the paper’s general-case constant-factor bound collapses to “exactly optimal” for our trivial-cardinality case.

What we got out of the framing

The framing changed the conversations around the resolver, not the code; the 5-line function above was almost word-for-word what we already had.

The tier win rates answered one class of questions the framing opened up; the richer machinery in the paper, continuous greedy over a submodular objective, addresses a class we haven’t needed yet.

What this didn’t solve

One thing we expected the matroid framing to give us, that it didn’t.

What we expected was the headline bound, but our resolver picks exactly one profile (the trivial cardinality case) so the paper’s (1 − 1/e) continuous-greedy bound collapses to “highest-priority non-null wins”; the framing was the contribution, not the math. The richer machinery the paper develops (compiling a user’s knowledge graph into macro-facets, running continuous greedy over the matroid) is not what we do, and we don’t need it for this surface. If we ever want to compose multiple profile fragments instead of picking one, the paper is there.

The detection fingerprint

If your service has a precedence chain with at-most-one-per-tier semantics (a settings resolver, a feature-flag override stack, a permissions cascade, a config merge with strict overrides), you probably have a laminar matroid.

The thing that catches it is the conversation pattern: every time someone proposes adding a new layer to the chain, the discussion is about precedence ordering and override semantics, not about retrieval. That’s the matroid asking to be named. Once you name it, the algorithm stays small and the constraint model gets the attention it deserves.

The five lines were always the right amount of code. We just didn’t have the vocabulary to say so.

Comments

Sign in with GitHub to leave a comment. Threads live on SourceShift/blog-comments — moderated.