# Agent Report — Perturbation Rank Needs Denominator Gating
**Date**: 2026-05-06 19:41  
**Piano**: 66  
**Tension explored**: META + BOUNDARY  
observables_registry: 1.0.0-2026-05-06  
observables_used: [SR, SR2, L1, L2, triple_var]

## Claim Under Test
The valid next test from cycle 06:25 was a replicate-and-size curve for perturbation effective rank, with observable definitions versioned. This run asks:

> Does the second perturbation axis stabilize as sample size grows, or is rank inflated when retention denominators are weak relative to full-shuffle baselines?

## Experiment
Tool created: `tools/exp_perturbation_rank_size_curve.py`

Atomic perimeter:
- domains: prime-gap windows, prime-shuffle controls, iid Poisson spacings, independent GUE spacings;
- sample sizes: 128, 256, 512, 1024, 2048 gaps;
- replicates/windows: 8 per domain-size point;
- perturbations: `adjacent_swap`, `block_shuffle`, `large_gap_only`, `uniform`;
- alpha grid: 0.1, 0.3, 0.5, 0.7, 0.9;
- trials per perturbation-alpha: 8;
- full-shuffle baselines: 16;
- canonical observables imported from `tools/observables_registry.py`;
- denominator gate: an observable counts as stable only when its z-score against the full-shuffle baselines satisfies `abs(original - shuffle_mean) / shuffle_std >= 2`.
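
The gate in the last item can be sketched as follows. This is a minimal illustration, not the code in `tools/exp_perturbation_rank_size_curve.py`: the function name `denominator_gate`, the use of the sample standard deviation (`ddof=1`), and the handling of a zero-variance baseline are all assumptions.

```python
import numpy as np

def denominator_gate(original, shuffle_values, z_min=2.0):
    """Return (z, is_stable) for one observable (hypothetical helper).

    original: observable value on the unperturbed sequence.
    shuffle_values: observable values on the full-shuffle baselines.
    """
    shuffle_values = np.asarray(shuffle_values, dtype=float)
    shuffle_mean = shuffle_values.mean()
    # Assumption: sample standard deviation; the report does not specify ddof.
    shuffle_std = shuffle_values.std(ddof=1)
    if shuffle_std == 0.0:
        # Degenerate baseline: z is undefined, so treat the observable as weak.
        return float("nan"), False
    z = (original - shuffle_mean) / shuffle_std
    return float(z), bool(abs(z) >= z_min)

# Toy baseline values, chosen only to exercise both branches of the gate.
baseline = [0.0, 1.0, 2.0, 3.0, 4.0]
z_strong, stable_strong = denominator_gate(6.0, baseline)  # far from baseline
z_weak, stable_weak = denominator_gate(3.0, baseline)      # inside baseline spread
print(stable_strong, stable_weak)  # True False
```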

The script reports two ranks:
- `rank_all`: PCA effective rank using all five canonical observables;
- `stable_rank`: PCA effective rank after dropping observables whose original-vs-shuffle denominator is weak.
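
The two ranks can be sketched as below. The report does not state which effective-rank definition the script uses, so this sketch assumes the entropy-based one (the exponential of the Shannon entropy of the normalized PCA eigenvalues) and centers the columns without rescaling them. The function names are hypothetical. Note that `stable_rank` here follows the report's usage (rank after dropping weak observables), not the numerical-linear-algebra quantity of the same name.

```python
import numpy as np

def effective_rank(X):
    """Entropy-based effective rank of X (rows = samples, cols = observables).

    Returns (rank, pc2_share). Assumption: columns are centered, not standardized.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    # Squared singular values are proportional to the PCA eigenvalues.
    eig = np.linalg.svd(Xc, compute_uv=False) ** 2
    total = eig.sum()
    if total == 0.0:
        # Constant data: no variance to decompose.
        return 1.0, 0.0
    p = eig / total
    nonzero = p[p > 0]
    rank = float(np.exp(-(nonzero * np.log(nonzero)).sum()))
    pc2_share = float(p[1]) if p.size > 1 else 0.0
    return rank, pc2_share

def stable_rank(X, stable_mask):
    """Effective rank restricted to observables that passed the denominator gate."""
    X = np.asarray(X, dtype=float)
    mask = np.asarray(stable_mask, dtype=bool)
    if not mask.any():
        # No observable survives the gate: the rank is undefined.
        return float("nan"), float("nan")
    return effective_rank(X[:, mask])

# Two uncorrelated, equal-variance columns give rank 2; dropping one gives rank 1.
X_demo = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
rank_all_demo, pc2_demo = effective_rank(X_demo)
rank_stable_demo, _ = stable_rank(X_demo, [True, False])
```

Under this definition a matrix whose variance lies along a single direction has rank 1, and the rank rises toward the number of columns as the variance spreads evenly.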

## Results

### Size Curve Summary

| Domain | N (gaps) | rank_all | PC2 variance share | mean weak obs (of 5) | stable_rank |
|---|---:|---:|---:|---:|---:|
| primes_windows | 128 | 1.789 ± 0.469 | 0.155 | 4.50 | 1.382 |
| primes_windows | 256 | 1.947 ± 0.645 | 0.174 | 4.75 | 1.262 |
| primes_windows | 512 | 1.892 ± 0.372 | 0.142 | 2.88 | 1.310 |
| primes_windows | 1024 | 1.679 ± 0.409 | 0.117 | 1.62 | 1.415 |
| primes_windows | 2048 | 1.442 ± 0.213 | 0.081 | 0.75 | 1.462 |
| prime_shuffle_control | 2048 | 1.797 ± 0.375 | 0.134 | 3.62 | 1.428 |
| poisson | 2048 | 1.952 ± 0.499 | 0.175 | 4.62 | 1.036 |
| gue | 128 | 1.703 ± 0.348 | 0.126 | 2.38 | 1.226 |
| gue | 256 | 1.913 ± 0.453 | 0.164 | 2.25 | 1.141 |
| gue | 512 | 1.542 ± 0.313 | 0.111 | 1.88 | 1.162 |
| gue | 1024 | 1.551 ± 0.395 | 0.105 | 1.88 | 1.157 |
| gue | 2048 | 1.234 ± 0.224 | 0.046 | 2.00 | 1.111 |

### Observable Stability

At GUE N=2048, `SR`, `L1`, and `triple_var` are stable in all 8 replicates; `SR2` and `L2` are stable in 0 of 8. Mean absolute z-scores: `SR=8.38`, `SR2=0.67`, `L1=11.58`, `L2=0.89`, `triple_var=11.66`.

At primes N=2048, `SR`, `L1`, and `triple_var` are stable in all 8 windows; `SR2` is stable in 7 of 8; `L2` is stable in 3 of 8. Mean absolute z-scores: `SR=5.19`, `SR2=2.63`, `L1=3.96`, `L2=1.78`, `triple_var=4.37`.
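
As a consistency check, the per-observable stable counts quoted for N=2048 can be converted into the mean weak-observable count shown in the size-curve table. The sketch assumes that column is the number of gate failures averaged over the 8 replicates; the helper name is hypothetical.

```python
# Stable-replicate counts (out of 8) as stated in the report for N=2048.
N_REPLICATES = 8
stable_counts = {
    "gue": {"SR": 8, "SR2": 0, "L1": 8, "L2": 0, "triple_var": 8},
    "primes_windows": {"SR": 8, "SR2": 7, "L1": 8, "L2": 3, "triple_var": 8},
}

def mean_weak_count(counts, n_replicates=N_REPLICATES):
    """Average number of weak observables per replicate (assumed table definition)."""
    weak_total = sum(n_replicates - c for c in counts.values())
    return weak_total / n_replicates

weak = {domain: mean_weak_count(c) for domain, c in stable_counts.items()}
print(weak)  # {'gue': 2.0, 'primes_windows': 0.75}
```

Both values match the N=2048 rows of the table (2.00 for GUE, 0.75 for prime windows).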

Poisson and prime-shuffle controls keep a high `rank_all` while most of their observables are weak. At Poisson N=2048, `rank_all=1.952` but `stable_rank=1.036`, and 4.62 of 5 observables are weak on average. This control falsifies reading `rank_all` alone as a structural claim.

## Findings

1. **Perturbation rank is not interpretable without denominator gating.** In this perimeter, Poisson and prime-shuffle controls can show `rank_all` near 1.8-2.0. Because their original-vs-shuffle denominators are mostly weak, that rank is a retention-normalization artifact unless the stable-observable screen also supports it.

2. **GUE does not show a stable second axis on canonical observables up to N=2048.** GUE `rank_all` falls from its peak of 1.913 at N=256 to 1.234 at N=2048; the PC2 variance share falls from 16.4% to 4.6%. After denominator gating, GUE `stable_rank` stays between 1.11 and 1.23 at every tested size.

3. **The old L2-driven sign-flip should not be promoted without a denominator check.** Under canonical observables, GUE `L2` is weak relative to shuffle at every tested size and is stable in 0/8 replicates at N >= 512. This does not prove every L2 sign effect is false; it restricts such effects to local/sample-specific observations unless the denominator survives.

4. **Primes become better conditioned with N, but not more multi-axis.** Prime windows gain stable observables as N grows: the mean weak count drops from 4.75 at N=256 to 0.75 at N=2048. The effective rank does not grow with this conditioning: `rank_all` falls from 1.947 to 1.442 over the same range, and `stable_rank` stays between 1.26 and 1.46, ending at 1.462.
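
The mechanism behind finding 1 can be illustrated with a toy model. It is a sketch under stated assumptions, not the lab's script: retention is taken to be `(perturbed - shuffle_mean) / (original - shuffle_mean)`, every observable follows one shared latent coordinate `1 - alpha`, trial noise has unit standard deviation, and the denominator is a fixed number expressed in units of that noise. The effective rank is the entropy-based definition, which is also an assumption.

```python
import numpy as np

def entropy_rank(X):
    """Entropy-based effective rank of the centered matrix X (assumed definition)."""
    Xc = X - X.mean(axis=0)
    eig = np.linalg.svd(Xc, compute_uv=False) ** 2
    p = eig / eig.sum()
    p = p[p > 0]
    return float(np.exp(-(p * np.log(p)).sum()))

def toy_retention_matrix(denominator, rng, n_obs=5, n_perturbations=4):
    """Rows = perturbation-alpha points, cols = observables (toy model only)."""
    alphas = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
    # One shared latent coordinate: retention decays linearly with alpha.
    latent = np.tile(1.0 - alphas, n_perturbations)
    noise = rng.normal(0.0, 1.0, size=(latent.size, n_obs))
    numerator = denominator * latent[:, None] + noise
    # Dividing by a weak denominator amplifies the independent noise per column.
    return numerator / denominator

rng = np.random.default_rng(0)
rank_strong = entropy_rank(toy_retention_matrix(50.0, rng))  # |z| well above 2
rank_weak = entropy_rank(toy_retention_matrix(0.5, rng))     # |z| below 2
print(f"strong denominator: {rank_strong:.2f}, weak denominator: {rank_weak:.2f}")
```

The data has exactly one true coordinate in both cases, so any rank well above 1 in the weak-denominator case comes from normalization noise alone.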

## Verdict
**CONSTRAINT on META + BOUNDARY**: perturbation dimensionality must be reported as:

> effective rank + PC2 + observable registry version + original-vs-shuffle z gate per observable.
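
One way to carry these four fields together is a small record type. The sketch below is hypothetical: the class `RankReport` and its field names are not part of the lab's tooling. It is filled with the GUE N=2048 values from this report. Applying the threshold to mean absolute z-scores is a simplification, since the report gates each replicate separately; for this row the two agree.

```python
from dataclasses import dataclass, asdict
import json

@dataclass(frozen=True)
class RankReport:
    """Hypothetical container for a gated perturbation-rank result."""
    domain: str
    n_gaps: int
    effective_rank: float
    pc2_share: float
    registry_version: str
    z_by_observable: dict  # observable name -> mean |z| against shuffle
    z_min: float = 2.0

    def weak_observables(self):
        """Observables whose mean |z| falls below the gate threshold."""
        return sorted(k for k, z in self.z_by_observable.items() if abs(z) < self.z_min)

report = RankReport(
    domain="gue",
    n_gaps=2048,
    effective_rank=1.234,
    pc2_share=0.046,
    registry_version="1.0.0-2026-05-06",
    z_by_observable={"SR": 8.38, "SR2": 0.67, "L1": 11.58, "L2": 0.89, "triple_var": 11.66},
)
print(report.weak_observables())  # ['L2', 'SR2']
serialized = json.dumps(asdict(report), indent=2)
```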

The cycle 03:30 "second GUE axis" remains restricted by cycle 06:25 and is further narrowed here: under canonical observables and the tested size curve, the stable statement is not "GUE has a second perturbation axis"; it is:

> all-observable perturbation rank can inflate in weak-denominator regimes; after denominator gating, GUE and primes are both close to one perturbation coordinate in this perimeter, while Poisson/shuffle controls show why ungated rank is not structural evidence.

## Consecutio
What opens now: the lab can keep using perturbation rank, but only as a gated observable. The next useful movement is not more PCA; it is an operator-level denominator map: for each observable, identify the perturbation/domain/scale region where `original - shuffle` is a real signal rather than a noisy divisor.

## Auto-audit: 5 lenses
- **L1 hard constraint vs bias**: no zero/always claim. "Weak" means `abs(z) < 2` in the declared gate, not absence of signal.
- **L2 quantity vs ratio**: retention ratios are not read alone; raw denominator z-scores are reported first.
- **L3 no silent patching**: the 03:30 claim is explicitly restricted; it is not renamed as confirmed.
- **L4 edge cases**: short-GUE and low-N effects are isolated by size. The N=2048 perimeter is stated, not generalized.
- **L5 re-discovery**: PCA rank inflation from noisy normalization is a standard statistical risk. This is a lab constraint on method, not a new RMT result.

## Files
- Script: `tools/exp_perturbation_rank_size_curve.py`
- Data: `tools/data/perturbation_rank_size_curve.json`
- Report: `tools/data/reports/agent_20260506_1941.md`
