# Receipts: Specificity Is Cross-Generator (at density)

The content-specificity effect is not a quirk of one model. The clean 2x2
specificity experiment, rerun on a second generator family, produces nearly the
same density-normalized effect size (Hedges g 1.636 vs 1.651).

| Generator | specificity effect at density | source |
|:----------|:------------------------------|:-------|
| xAI (grok-4-1-fast) | Hedges g **1.651** | [catching-your-own-overclaim](/receipts/catching-your-own-overclaim) |
| Gemini Flash (gemini-3-flash-preview) | Cohen d **1.669** / Hedges g **1.636** | this kit |

Same design both times: a 2x2 (specificity present/absent, quality demands
present/absent), 10 runs per cell, the same Northvane strategic-analysis task,
density normalization instead of a length cap.

## The one thing that matters here: "at density"

Raw scores do not show this cleanly. On Gemini Flash the raw specificity effect
is only d=0.67, because quality demands produce longer outputs and inflate the
raw marker count (the same length confound the xAI experiment hit). Normalizing
to markers-per-1k-words removes the confound and the specificity effect lands at
d=1.67. So the cross-generator claim is specifically: **specificity at density
is cross-generator.** It is not a raw-score claim.
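
A minimal sketch of the density normalization described above, assuming words are counted by whitespace splitting (the kit's actual tokenization may differ). The function name `markers_per_1k_words` and the example counts are hypothetical illustrations, not data from this kit.

```python
def markers_per_1k_words(marker_count: int, text: str) -> float:
    """Return markers per 1,000 words.

    Assumption: a "word" is any whitespace-separated token.
    """
    n_words = len(text.split())
    if n_words == 0:
        raise ValueError("cannot compute density of an empty output")
    return 1000.0 * marker_count / n_words


# Hypothetical outputs: the longer one has more markers in raw terms,
# but fewer markers per 1k words once length is accounted for.
short_density = markers_per_1k_words(12, "word " * 400)
long_density = markers_per_1k_words(18, "word " * 900)
```

In this made-up case the raw count favors the longer output (18 vs 12) while the density favors the shorter one, which is the kind of length confound the normalization is meant to remove.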

## Scope

- Cross-generator means **xAI and Gemini Flash**. Gemini Pro was inconclusive
  (outputs truncated to about 60 words), so it counts as neither a confirmation
  nor a null.
- This kit holds the Gemini Flash side (40 raw outputs, the computed analysis).
  The xAI side is its own published receipt, linked above.

## Recompute

```
python3 script.py
```

Expect Gemini Flash specificity at density: Cohen d 1.67, Hedges g 1.64.
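
For orientation, here is a sketch of how the two statistics relate. It is not a copy of `script.py`: it assumes a pooled-standard-deviation Cohen d, the common approximate small-sample correction for Hedges g, and 20 outputs per specificity level (10 runs per cell, pooled across the two quality-demand cells). The function names are hypothetical.

```python
import math
from statistics import mean, variance


def cohen_d(a, b):
    """Cohen d using the pooled sample standard deviation."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def hedges_g(d, n1, n2):
    """Apply the approximate small-sample correction J = 1 - 3 / (4*df - 1)."""
    df = n1 + n2 - 2
    return d * (1.0 - 3.0 / (4.0 * df - 1.0))


# Consistency check on the reported figures, assuming 20 vs 20 outputs.
g_from_reported_d = hedges_g(1.669, 20, 20)
```

Under those assumptions, correcting the reported d of 1.669 gives a value that rounds to the reported g of 1.636.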

## Limits

- 10 runs per cell (40 outputs), one generator per kit. Treat the result as
  directional: CIs exclude zero but are wide.
- Programmatic marker scoring, not a domain-expert quality judgment. The
  companion xAI receipt records that a blind domain expert could not distinguish
  specific from generic outputs on quality, only on verifiable form.
- One task, one domain (the fictional Northvane scenario). March 2026 models.
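
To see why 10 runs per cell gives wide intervals, the sketch below uses a common large-sample normal approximation for the standard error of d. This is an assumption for illustration only: the kit's own CI method is not stated here and may differ (for example, a bootstrap). The function name `approx_d_ci` and the 20-vs-20 group sizes are likewise assumptions.

```python
import math


def approx_d_ci(d, n1, n2, z=1.96):
    """Approximate 95% CI for a standardized mean difference.

    Uses SE(d) ~ sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))),
    a normal approximation that is rough at small sample sizes.
    """
    se = math.sqrt((n1 + n2) / (n1 * n2) + d * d / (2.0 * (n1 + n2)))
    return d - z * se, d + z * se


# Reported Gemini Flash d at density, assuming 20 outputs per specificity level.
ci_low, ci_high = approx_d_ci(1.669, 20, 20)
```

Under this approximation the interval stays above zero but spans more than one full unit of d, which matches the "directional" framing above.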
