The Comparator Trap

When the right competitor makes the collapse worse — and what actually rescues it

May 28, 2026

Quick orientation for anyone new to this work. AI assistants — ChatGPT, Claude, Gemini, DeepSeek — are becoming the default tool consumers use to compare brands. When the AI compares two brands for someone, it has to pick the comparison frame: which dimensions matter, which features get weighed, which competitor counts as “similar enough.” The frame the AI picks shapes everything that follows. As more buying decisions get routed through AI, the choice of comparison frame stops being a research nuance and becomes a market force. This article is about what happens when you try to fix that frame by handing the AI a more structurally accurate competitor — and what the data showed when we actually tried it.

In April we published an analysis of the Category Label Trap: AI shopping agents reach for the dominant Western brand whenever they encounter a category label, and the structure of that dominant brand silently overwrites the structure of the brand they were supposed to evaluate. VkusVill is not Whole Foods. Calbee is not Frito-Lay. Roshen is not Cadbury. The category template wins.

The article ended on a prescription: pair your brand with a structurally correct comparator instead of the obvious one. Tell the agent that VkusVill is closer to Trader Joe’s than to Whole Foods. Tell it that Calbee is closer to a same-culture Japanese snack maker than to Frito-Lay. Tell it that Roshen is closer to a multi-category US confectionery conglomerate than to a chocolate-only British brand. The intuition was clean: a corrective comparator should pull the AI’s perception toward the structurally correct reference class.

We tested it. The intuition was wrong.

What the corrective experiment actually showed

Run 10 of the R15 study was a within-focal-brand paired comparison. For each of three brands (VkusVill, Calbee, Roshen) we asked seven AI models to evaluate two pair conditions: the original Anglophone-template pair and the corrective same-class pair. Each cell ran three times, for 126 API calls in total (3 brands × 2 pair conditions × 7 models × 3 runs). The outcome measure was the cross-model mean Dimensional Collapse Index (DCI): the share of the brand’s perceived weight that the AI assigns to the Economic and Semiotic dimensions at the expense of the other six (Narrative, Ideological, Cultural, Temporal, Experiential, and Social).

Lower DCI means the AI is preserving more of the brand’s actual dimensional richness. Higher DCI means the AI is flattening the brand toward “price tier and label recognition” — the metameric default we documented in the original R15 paper.
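To make the metric concrete, here is a minimal sketch of how a DCI of this kind could be computed from one model’s eight dimension weights. It assumes DCI is expressed on a 0 to 100 scale as the plain share of total weight on Economic and Semiotic; the function names and the flat example profile are illustrative and are not taken from the R15 codebase, which may normalize differently.

```python
# Sketch only: assumes DCI = 100 * (Economic + Semiotic) / (sum of all eight weights).
DIMENSIONS = (
    "Semiotic", "Narrative", "Ideological", "Experiential",
    "Social", "Economic", "Cultural", "Temporal",
)
COLLAPSE_DIMENSIONS = ("Economic", "Semiotic")


def dci(weights):
    """Share of total dimension weight that sits on the two collapse dimensions."""
    total = sum(weights[d] for d in DIMENSIONS)
    if total <= 0:
        raise ValueError("dimension weights must sum to a positive number")
    collapsed = sum(weights[d] for d in COLLAPSE_DIMENSIONS)
    return 100.0 * collapsed / total


def cross_model_mean_dci(profiles):
    """Average the per-model DCI values (one weight profile per model)."""
    return sum(dci(p) for p in profiles) / len(profiles)


# Hypothetical profile: equal weight on every dimension.
flat_profile = {d: 12.5 for d in DIMENSIONS}
flat_dci = dci(flat_profile)  # two of eight equal dimensions
```

Under this definition a perfectly even profile sits at 25, so any reading above 25 means Economic and Semiotic together carry more than their even share.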

Table 1: Corrective comparator experiment.

Two of the three brands (Calbee, Roshen) showed essentially no change. The corrective comparator was neither better nor worse than the original: for these brands, DCI moved by only a fraction of a point between the two pair conditions. Either the AI’s perception of these brands is already locked into a default shape that the comparator cannot perturb, or the structural alternatives we picked were not different enough from the originals in dimensions the AI tracks.

VkusVill went the other way. The corrective comparator did not help; it made the collapse measurably worse. Pairing VkusVill with Trader Joe’s (the structurally correct US analog) instead of Whole Foods (the structurally incorrect default) raised the cross-model mean DCI by 7.4 points. That is the largest single-pair comparator effect we have seen in any R15 condition.

Worse, the increase was concentrated on the dimension that matters most for VkusVill’s actual identity. The Ideological dimension — VkusVill’s positioning on clean food, supplier relationships, ingredient transparency, and consumer trust — collapsed by an additional 5.95 points when we swapped the comparator from Whole Foods to Trader Joe’s. The brand looked less differentiable on its core distinguishing dimension when paired with the supposedly more correct competitor.

The intuition that drove the recommendation in the Category Label Trap article was that comparators are baselines. Replace the wrong baseline with the right baseline, and the brand’s profile recovers. The data says comparators are not baselines. They are normalizers. The AI does not just position the focal brand relative to the comparator; it also redistributes dimension weight away from dimensions where the comparator is strong. Trader Joe’s has a vivid Ideological identity in AI training data: the whimsical staff, the no-advertising principle, the eco-claims. When the AI weights VkusVill against Trader Joe’s, it is implicitly saying “Trader Joe’s already owns Ideological in this category neighborhood, so VkusVill cannot be the Ideological one.” The Ideological weight flows out of VkusVill and into the comparator.

This is the comparator trap. Picking the structurally more accurate competitor can hurt the focal brand on its strongest dimension because the AI cannot simultaneously assign that dimension to both brands in a paired comparison. The dimension is a finite resource within the comparison frame, and the comparator with stronger discourse density wins.

What rescues the dimension

The natural follow-up question: if a comparator swap does not work, what does? The original Category Label Trap article argued that the deeper fix is structured Brand Function metadata, a machine-readable document that describes the brand’s dimensional commitments directly rather than positioning it relative to a competitor. The article asserted this. We had not tested it.

The follow-up experiment tests it. We constructed a Brand Function specification for VkusVill — a structured JSON document with explicit dimensional claims about the brand’s product model, cultural positioning, customer base, ideological commitments, economic positioning, and operational architecture. We then ran four conditions across seven models, three runs each:

Table 2: Brand Function specification rescue.

The pattern is striking on the Ideological dimension and mixed on total DCI. Total DCI does not return to the no-spec baseline of 28.9. It stabilizes around 33 to 34 with the spec applied, slightly above baseline regardless of which comparator the spec is paired with. The Brand Function specification does not undo the general comparator-induced collapse.

But the Ideological dimension specifically — the dimension that the corrective comparator destroyed — recovers fully. With the Trader Joe’s comparator and no spec, the Ideological weight collapses to 8.4. With the Trader Joe’s comparator and the spec, it recovers to 14.8 — back to the original baseline level. The spec rescues the dimension that the corrective comparator alone takes away.

The interpretation is that the Brand Function specification gives the AI an explicit dimensional claim that overrides the implicit comparator-driven redistribution. When the spec says VkusVill’s Ideological positioning is grounded in clean-food sourcing and small-farm relationships, the AI does not have to choose between assigning Ideological to VkusVill or to Trader Joe’s. The spec asserts the dimensional content directly, and the AI honors the assertion at the dimension where the spec is most concrete. Total DCI stays elevated because the spec does not address every dimension with equal density, but the targeted dimension recovers.

This is a more useful prescription than the original article gave. The spec is not a magic wand that resets the collapse. It is a targeted instrument that rescues the specific dimensions where it makes concrete claims. A practitioner who wants to protect a brand’s distinctive Ideological identity from comparator-induced collapse needs to put explicit Ideological content in the specification — not just relative positioning, but first-order claims about the brand’s actual ideological commitments.
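As an illustration of what “explicit Ideological content” might look like in practice, the sketch below builds a skeleton specification and flags the sections that still make no concrete claim. The field names, the placeholder convention, and both helper functions are hypothetical; the article names the six areas the VkusVill spec covers but not its actual schema, and the only filled-in claims are the ones the article itself attributes to the brand.

```python
import json

PLACEHOLDER = "TODO"  # hypothetical marker for a section with no concrete claim yet

# Hypothetical schema: one key per area the article says the spec covers.
spec = {
    "brand": "VkusVill",
    "product_model": PLACEHOLDER,
    "cultural_positioning": PLACEHOLDER,
    "customer_base": PLACEHOLDER,
    "ideological_commitments": [
        "clean-food sourcing",
        "small-farm supplier relationships",
        "ingredient transparency",
    ],
    "economic_positioning": PLACEHOLDER,
    "operational_architecture": PLACEHOLDER,
}


def missing_sections(spec):
    """Sections that are empty or still placeholders, i.e. make no first-order claim."""
    return sorted(k for k, v in spec.items() if v == PLACEHOLDER or not v)


def render_prompt(spec, comparator):
    """Wrap the spec in an evaluation prompt (the wording here is an assumption)."""
    body = json.dumps(spec, indent=2, ensure_ascii=False)
    return (
        f"Brand Function specification:\n{body}\n\n"
        f"Evaluate {spec['brand']} against {comparator} on each dimension."
    )


todo = missing_sections(spec)
prompt = render_prompt(spec, "Trader Joe's")
```

The `missing_sections` check mirrors the finding above: a section left as a placeholder is a dimension the spec cannot be expected to rescue.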

We also ran the same test with native Russian-language instructions wrapped around the same English-language specification. The Russian conditions showed no consistent benefit over English: total DCI was within noise of the English conditions, and Ideological recovery was the same. This is consistent with the broader R15 finding that native-language prompting does not produce a detectable aggregate effect on dimensional collapse (H10, 58/121 model-pair combinations positive, mean reduction near zero, sign test p = .716). The collapse is structural, not linguistic. The fix has to operate on the specification, not on the language envelope.
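For readers who want to check the H10 figure, a two-sided exact sign test on 58 positives out of 121 can be sketched with the standard library alone. This assumes the 121 model-pair combinations contain no ties and that the test is two-sided against a 50/50 null; the function name is illustrative, and the paper’s own procedure may differ in detail (for example, a normal approximation).

```python
from math import comb


def sign_test_p(positives, n):
    """Two-sided exact sign test against the null that positive and negative are equally likely."""
    k = min(positives, n - positives)
    # Probability of a result at least as extreme in one tail under Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


p_value = sign_test_p(58, 121)
print(f"sign test p = {p_value:.3f}")
```

The printed value can be compared directly against the reported p = .716. Either way the conclusion is the same: 58 of 121 is indistinguishable from a coin flip.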

Comparator-robust vs comparator-sensitive brands

The other useful finding from Run 10 is the asymmetry between VkusVill and the other two focal brands. Calbee and Roshen are comparator-robust: switching their comparator from the wrong default to the structurally correct alternative produces essentially no change in DCI. VkusVill is comparator-sensitive: the same swap moves DCI by 7.4 points and reshapes the dimensional profile substantially.

The mechanism is probably discourse density. VkusVill has a clearly anchored mapping in the AI’s training data: there is enough “VkusVill is the Russian Whole Foods” content out there to give the AI a strong default association. When we tell the AI to compare VkusVill to Trader Joe’s instead, we are pulling against an active anchor, and the competition between the two brands for dimensional content kicks in. The brand is exposed to the comparator effect because the AI’s prior is strong enough to be perturbed.

Calbee and Roshen do not have equally strong default anchors. Calbee in AI training data is “Japanese snack maker” with relatively diffuse comparator content. Roshen is “Ukrainian confectionery” with even thinner comparator content. The AI’s perception of these brands settles into a category-default pattern that does not depend much on which comparator we name. Switching from Frito-Lay to Koikeya or from Cadbury to Hershey does not perturb a perception that was already operating in default-category mode.

The practical implication is uncomfortable: brands that are less visible in AI training data may be more stable in comparator-paired evaluations, because they are operating in an under-anchored regime where the comparator does not have enough discourse density to perturb the default. Brands that are more visible — that have explicit positioning in AI training data — may be more vulnerable to comparator-induced collapse, because the AI has enough material to redistribute when a new comparator is named. Visibility is not protective. In some regimes, visibility makes you perturbable.

This aligns with another finding from the same R15 dataset. The Mongolian-language follow-up test on APU Chinggis showed that re-prompting in Mongolian instead of English reduced collapse for every model in the panel — average reduction 4.9 DCI points across eight models. APU Chinggis is the opposite of VkusVill on the visibility axis: thin English-language discourse, relatively rich Mongolian-language discourse. When we shift the prompt language to access the richer discourse layer, dimensional content the foreign context could not reach becomes available. Same mechanism. Different direction. Both confirm that AI brand perception is governed by which discourse layer the prompt activates, not by which competitor is on paper the closer match.

What practitioners should actually do

The naive prescription “pair your brand with the correct competitor” does not work in the comparator-sensitive regime. The corrected prescription has three components, in priority order.

First, write a Brand Function specification with explicit dimensional content on the dimensions you want to protect. Not relative positioning (“we are closer to X than to Y”), but first-order claims about what your brand actually is on each dimension. The specification rescues the dimensions where it is concrete; it does not rescue dimensions it does not address.

Second, if you must include a comparator in the specification, pick one that is weak on the dimensions you want your brand to own. Pairing your brand with a competitor that is strong on Ideological will pull Ideological out of your brand. Pairing your brand with a competitor that is dimensionally flat (generic, unspecified, low in discourse density on the dimensions you care about) will not.

Third, accept that some dimensions of your brand are not rescuable through prompt-level intervention alone. If the AI’s training corpus does not contain dimensional content about your brand on a particular axis, no prompt and no spec can manufacture that content out of thin air. The fix at that point is to publish enough discourse — long-form content, structured metadata, third-party coverage — that the next training cycle picks up dimensional material the current models cannot access. AI brand perception is downstream of training discourse density. Specifications can patch the immediate evaluation; they cannot substitute for the underlying corpus.
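The second component above can be read as a simple selection rule. The sketch below is one possible way to express it, assuming you already have some per-dimension estimate of how strongly each candidate comparator is associated with each dimension. The article does not define such a score, so the 0 to 1 scale, the candidate names, and the numbers are all hypothetical.

```python
def pick_comparator(candidates, protected):
    """Return the candidate with the least total strength on the protected dimensions."""
    def overlap(name):
        profile = candidates[name]
        # Dimensions missing from a profile are treated as zero strength.
        return sum(profile.get(dim, 0.0) for dim in protected)

    return min(candidates, key=overlap)


# Hypothetical strength estimates; not measured values.
candidates = {
    "comparator_a": {"Ideological": 0.8, "Economic": 0.3},
    "comparator_b": {"Ideological": 0.2, "Economic": 0.6},
}

choice = pick_comparator(candidates, protected=["Ideological"])
```

With these made-up numbers the rule prefers `comparator_b`, the candidate that leaves the Ideological dimension uncontested, even though it is stronger on Economic.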

These are smaller and less heroic claims than the original Category Label Trap article made. They are also the claims the data actually supports. The comparator trap is real: the obvious fix to the obvious problem makes the problem worse on the dimension that matters most. The specification rescue is real but partial: it works on the dimensions where it speaks concretely, and only there. The visibility paradox is real: more discourse can make a brand more perturbable, not more stable.

There is a satisfying version of this article in which a clean experimental result confirms the theory and the prescription holds. This is not that article. It is an article about an experimental result that contradicted our prescription, a follow-up that partially recovered the situation, and a refined prescription that is honest about where it works and where it does not. The science is more useful than the slogan.

Methodology

Run 10 (corrective comparators): 126 API calls, six brand pairs, seven models, three runs per cell. Models: Claude Sonnet 4.6, GPT-4o-mini, Gemini 2.5 Flash, DeepSeek V3, Qwen3 30B (local), Gemma 4 27B (local), YandexGPT 5 Pro. Cost ~$0.30.

VkusVill Brand Function specification follow-up: 147 API calls, seven conditions, seven models, three runs per cell, mix of English and Russian-language instructions. Cost ~$0.30.

Both experiments append directly to the R15 dataset (run5_crosscultural.jsonl and run10_corrective.jsonl in the public repository).

Full methodology, source code, and per-call records: Zharnikov (2026v), “Dimensional Collapse in AI-Mediated Brand Perception: Large Language Models as Metameric Observers.” Pre-print: doi.org/10.5281/zenodo.19422427. Pre-registered hypotheses for the main study; Run 10 and the spec test are exploratory follow-ups conducted after Category Label Trap was drafted.

Spectral Branding
