Verifiability

Falsifiable model claims for the synthetic stress dataset.

Synthetic stress scenarios are evaluated against their intended statistical properties. The snapshot below shows where model behavior aligns with — and where it deviates from — the operational claims published in the methodology.

datasets evaluated

50 total

falsifiable claims

off-spec (transparent)

License

CC-BY 4.0

Snapshot generated

2026-05-05

Raw JSON

/verifiability_snapshot.json

Replicas per dataset

Aggregated p95 and tail metrics are sample-sensitive at n=50; treat them as indicative rather than precise.

Off-band calibrations & methodology limits

The following profile-asset combinations sit outside their methodology-expected conformance bands as of the most recent snapshot. We list them here rather than quietly retrying calibration loops, because the deviations carry methodological information rather than being pure simulator deficiencies.

Two distinct phenomena, addressed honestly

Our operational gating definitions (SHARP_CRASH: ≥20% rolling-30d DD AND ≥1.5× crash-window vol; VOL_EXPANSION: median ≥1.5× baseline AND ≥2 distinct windows ≥1.5×) are intentionally asset-uniform. The off-band findings reveal two distinct phenomena that we report separately rather than collapsing into a single “asymmetry” narrative:

Identifiability limits. For BTC, the methodology thresholds do not separate regimes in practice: even the 4 historical BTC stress anchors (Crypto Winter 2018, Luna 2022, FTX 2022, ATH 2017) fail to satisfy VOL_EXPANSION empirically (see verification table below). The relative-baseline definition, applied to an asset whose baseline is already 0.80 annualised, fails to single out a separable regime: real BTC stress events land between 0.56× and 1.14× baseline in 6-month statistics, well short of the 1.5× gate. This is a definition-level limit, not a simulator calibration failure.
Conditional asymmetry. For other off-band combinations (e.g. WTI low_vol_grind), the simulator produces conditional distributions in which some replicas meet the gating definition, but at lower-than-expected frequency under fixed methodology thresholds. This is a calibration trade-off rather than a definitional limit.
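The two gating definitions quoted above can be written as simple predicates. This is a minimal sketch, not the production classifier: the function names, the argument layout, and the reading that both 1.5× multiples are taken relative to the asset's baseline vol are assumptions made for illustration.

```python
from statistics import median

VOL_MULT = 1.5   # both definitions use a 1.5x volatility multiple
DD_MIN = 0.20    # SHARP_CRASH: rolling-30d drawdown of at least 20%


def is_sharp_crash(rolling_30d_max_dd, crash_window_vol, baseline_vol):
    """>=20% rolling-30d DD AND crash-window vol >= 1.5x baseline (assumed reference)."""
    deep_enough = abs(rolling_30d_max_dd) >= DD_MIN
    vol_spike = crash_window_vol >= VOL_MULT * baseline_vol
    return deep_enough and vol_spike


def is_vol_expansion(window_vols, baseline_vol):
    """Median window vol >= 1.5x baseline AND at least 2 windows >= 1.5x baseline."""
    threshold = VOL_MULT * baseline_vol
    windows_above = sum(1 for v in window_vols if v >= threshold)
    return median(window_vols) >= threshold and windows_above >= 2


# Median BTC sharp-crash replica from the card further down, against a 0.80 baseline:
# the drawdown leg passes, the vol leg (0.4513 < 1.20) does not.
print(is_sharp_crash(-0.2458, 0.4513, 0.80))
```

Because the thresholds are fixed multiples of the baseline, a high-baseline asset such as BTC needs a much larger absolute vol to pass the same gate, which is the identifiability issue described above.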

Why we keep the thresholds uniform. Industry risk frameworks vary in approach: Sharpe and Sortino ratios normalise by the asset's own total or downside volatility; Basel uses asset-class risk weights; RiskMetrics estimates each asset's volatility with an EWMA. Asset-relative thresholds for our failure-mode classifier would give us higher conformance rates per asset, but at the cost of definitional invariance. We prioritise definitional invariance over calibration optimality so that the FM-bucket assignment of any individual replica is independent of the asset on which it was generated. The trade-off is conscious and disclosed; it is not an unavoidable property of the methodology.

A complementary diagnostic on the roadmap: we plan to add a percentile-of-own-vol annotation alongside the uniform classification (e.g. “Crypto Winter realized vol = 80th percentile of historical BTC 6-month windows”), so users can see both the asset-uniform classification and the asset-relative severity. This is not a redefinition of the score — it is a second column of information.
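One way such a percentile-of-own-vol annotation could be computed is sketched below. The function name and the sample history are hypothetical; the real annotation would use the asset's actual distribution of 6-month realized vols, which is not reproduced here.

```python
def percentile_of_own_vol(case_vol, historical_window_vols):
    """Share (0-100) of the asset's historical windows with vol <= case_vol."""
    if not historical_window_vols:
        raise ValueError("need at least one historical window")
    at_or_below = sum(1 for v in historical_window_vols if v <= case_vol)
    return 100.0 * at_or_below / len(historical_window_vols)


# Made-up history of ten 6-month realized vols, for illustration only.
history = [0.35, 0.45, 0.50, 0.60, 0.65, 0.70, 0.75, 0.80, 0.90, 1.10]
print(percentile_of_own_vol(0.83, history))  # 80.0 on this made-up history
```

The annotation sits next to the uniform classification: the FM bucket stays asset-independent, while the percentile conveys how unusual the case is for that asset.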

Empirical identifiability verification (BTC)

The strongest test of an FM definition is whether known historical stress events themselves satisfy it. Below: realized vol of BTC's 4 empirical stress anchors, under the same metric pipeline used for synthetic replicas.

BTC empirical case	Total return	Max DD	Realized vol	× baseline	VOL_EXPANSION-met
BTC ATH 2017	+216%	−35%	0.92	1.14×	no
Crypto Winter 2018	−53%	−66%	0.83	1.03×	no
Luna 2022	−51%	−55%	0.60	0.74×	no
FTX 2022	+28%	−26%	0.45	0.56×	no

Reading. All four BTC stress anchors fall below the 1.5× baseline threshold (1.20 absolute) required for VOL_EXPANSION classification, even though their absolute realized vols (0.45–0.92 annualised) and drawdowns (−26% to −66%) are unambiguously stress-grade. The methodology definition simply does not separate “BTC stress” from “BTC normal” given a 0.80 baseline. Our synthetic VOL_EXPANSION × BTC probe at 0% conformance is therefore a downstream symptom of the definitional limit, not a primary failure of the simulator.
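The check behind the table can be reproduced in a few lines. The sketch below uses the realized vols from the table and the rounded 0.80 baseline quoted in the text, so the computed multiples can differ from the published column in the second decimal; the variable and function names are illustrative.

```python
BASELINE = 0.80              # BTC annualised baseline vol, as quoted in the text
THRESHOLD = 1.5 * BASELINE   # absolute vol needed for VOL_EXPANSION

# Realized vols of the four empirical anchors, copied from the table.
anchors = {
    "BTC ATH 2017": 0.92,
    "Crypto Winter 2018": 0.83,
    "Luna 2022": 0.60,
    "FTX 2022": 0.45,
}


def vol_multiples(realized, baseline=BASELINE):
    """Realized vol expressed as a multiple of the asset baseline."""
    return {name: vol / baseline for name, vol in realized.items()}


multiples = vol_multiples(anchors)
met = [name for name, m in multiples.items() if m >= 1.5]

for name, m in multiples.items():
    print(f"{name}: {m:.2f}x baseline")
print("anchors meeting the 1.5x gate:", met)
```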

Structural off-band combinations (5)

Conformance falls outside the expected band even when accounting for sampling uncertainty (Wilson 95% CI strictly below band). These are not closed by further calibration loops at fixed methodology thresholds.

Profile × Asset	Primary FM	Conformance (k/n)	Wilson 95% CI	Expected	Source of mismatch
`vol_expansion_setup_synthetic × BTC`	VOL_EXPANSION	0% (0/50)	[0.00, 0.07]	60–90%	Definitional limit (see verification table); BTC empirical anchors themselves do not satisfy the gating
`sharp_crash_setup_synthetic × BTC`	SHARP_CRASH	0% (0/50)	[0.00, 0.07]	20–50%	Same identifiability issue: only 1 of 4 BTC empirical anchors meets gating; threshold is not separating
`liquidity_stress_setup_synthetic × BTC`	LIQUIDITY_STRESS	0% (0/50)	[0.00, 0.07]	60–90%	Identifiability — 0 of 4 BTC empirical anchors meets the conjunction of vol+DD+window criteria
`low_vol_grind × WTI`	VOL_COMPRESSION	10% (5/50)	[0.04, 0.21]	60–90%	Calibration limit (not identifiability): WTI carries structural regime-shift risk; sustained 12-mo low-vol persistence is rare in any conditioning
`low_vol_grind × QQQ`	VOL_COMPRESSION	46% (23/50)	[0.33, 0.60]	60–90%	Calibration limit: QQQ tech-vol baseline is volatile; sustained 0.4–0.7×-baseline regime difficult to enforce. Wilson upper bound (0.596 before rounding) falls just below the band's lower edge
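The intervals in these tables, and the split between structural and sampling-marginal cases, can be reproduced with a short sketch. It assumes the standard Wilson score interval with z = 1.96 and no continuity correction, which matches the rounded intervals reported here. The function names and the classification rule (interval entirely outside the band means structural) are an illustrative reading of the text, not the project's code.

```python
from math import sqrt


def wilson_ci(k, n, z=1.96):
    """Wilson score interval for k successes out of n trials."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def classify(k, n, band_low, band_high):
    """Label a conformance rate relative to its expected band."""
    p = k / n
    if band_low <= p <= band_high:
        return "in-band"
    lo, hi = wilson_ci(k, n)
    if hi < band_low or lo > band_high:
        return "structural"          # interval lies entirely outside the band
    return "sampling-marginal"       # interval overlaps the band


print(wilson_ci(5, 50))
print(classify(23, 50, 0.60, 0.90))  # same 23/50 count, 60-90% band
print(classify(23, 50, 0.50, 0.80))  # same 23/50 count, 50-80% band
```

The same 23/50 count lands in different categories depending on the band: its upper bound is about 0.596, which rounds to 0.60 yet stays below a 60–90% band, while it overlaps a 50–80% band.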

Implication for robustness scores: These BTC and WTI/QQQ synthetic stress probes contribute to the spectrum-coverage of their failure-mode buckets but with reduced weight (per-bucket shrinkage when n_in_bucket < 10). The empirical historical anchors (Luna 2022, FTX 2022, BTC Crypto Winter 2018; WTI Saudi-War 2014, Negative-Pricing 2020) carry the primary information for these failure modes on these assets — the synthetic probes serve as auxiliary spectrum-fillers, not as primary evidence. For BTC specifically, this is forced by definitional limits of the FM gating (above), not by simulator deficiency.

Sampling-marginal off-band combinations (4)

For these combinations the Wilson 95% CI overlaps the expected band — we cannot statistically distinguish them from in-band conformance at n=50. Note that replicas within a profile-asset combination are not strictly i.i.d. (they share antecedent-phase parameters, calibration template, and observer biases), so the effective sample size is somewhat below 50 and the true CI is likely wider than reported. The Wilson bound here is therefore a lower bound on uncertainty.

Profile × Asset	Primary FM	Conformance (k/n)	Wilson 95% CI	Expected	CI & band
`vol_expansion_setup_synthetic × QQQ`	VOL_EXPANSION	58% (29/50)	[0.44, 0.71]	60–90%	CI overlaps band
`whipsaw_synthetic × BTC`	WHIPSAW	46% (23/50)	[0.33, 0.60]	50–80%	CI overlaps band
`whipsaw_synthetic × ETH`	WHIPSAW	44% (22/50)	[0.31, 0.58]	50–80%	CI overlaps band
`demand_destruction × WTI`	TREND_DOWN	32% (16/50)	[0.21, 0.46]	40–80%	CI overlaps band

Resolution path

BTC × 3 setup-profiles (definitional limit): we do not retry calibration. The empirical-identifiability table above shows the FM gating is not separating for BTC at the uniform threshold. For BTC the trust-layer relies primarily on empirical historical anchors (Crypto Winter, Luna, FTX); synthetic probes provide auxiliary spectrum-coverage with reduced weight via shrinkage.
WTI low_vol_grind + QQQ low_vol_grind (calibration limit): no further re-tuning at fixed thresholds. These probes contribute partial spectrum-coverage; empirical low-vol periods (SPY 2017, GOLD 2014) carry the primary VOL_COMPRESSION information.
4 sampling-marginal cases: re-evaluation on a future n=200 generation pass. If the Wilson CI tightens and no longer overlaps the expected band, we will reclassify as structural.
Methodology revisions: only after anchor validation against historical stress events. Asset-relative threshold tuning is rejected because it would break definitional invariance, not because it would break comparability per se. The complementary percentile-of-own-vol diagnostic is on the roadmap as an additive disclosure, not a redefinition.

Edge case: unclassified replicas

In our current per-FM-bucket score (V2), replicas that fail every FM gating definition (sub-threshold) are excluded from the composite. This means the composite is conditioned on “identifiable regime” rather than on the full outcome distribution. We disclose this as a known bias toward stress-relevant outcomes; the planned correction is to assign sub-threshold replicas to a BASELINE bucket so they enter the composite at neutral weight. This is tracked alongside the V2 production activation work.

Methodology limitations & open issues

Identifiability per asset. The FM gating definitions are not guaranteed to be separating across all assets (BTC is the demonstrated case). Identifiability is verified case-by-case via the empirical-anchor table.
Power of test at n=50. With p̂ ≈ 0.5, the Wilson 95% CI half-width is ~13pp (~14pp under the normal approximation). Small effect sizes (≤14pp from band) are not statistically separable from sampling noise. Larger n is needed to resolve marginal cases definitively.
Multiple testing. 32 datasets × 1–3 FM tags ≈ 60–80 implicit hypothesis tests. We do not apply multiplicity correction to the conformance bands. At α=0.05 nominal one would expect 3–4 false-positive off-band findings under no-effect; we report 9 off-band, of which 5 are structural and 4 sampling-marginal — roughly consistent with the expected false-positive rate among the marginal subset.
Window-mixing in 6-month case slices. Stress events concentrated within a sub-window of the case (e.g. Volmageddon 2018 was a 1-day VIX spike inside a 6-month case window) get diluted in aggregate statistics. We chose 6-month windows for trading-strategy realism, not for FM identifiability — this is a conscious trade-off.
CoT calibration drift. Per-asset observer weights and biases are derived from CFTC Commitments of Traders (CoT) reports over 2006–2026. Calibration is point-in-time; we do not currently track whether the simulator's behaviour drifts as we add later CoT data.
Anchor selection bias. The 31 empirical historical anchors were selected for stress severity and event-type coverage, not by random sampling of all 6-month windows. Selection favours canonical events (Lehman, Dotcom, COVID) over the full distribution of stress regimes that history actually produced.
No out-of-sample walk-forward validation. Robustness scores are computed in-sample relative to the curated case catalog. Strategies are not validated on held-out post-catalog periods. This is on the roadmap but not yet implemented.
Replica non-independence. Within a profile-asset combination, the 50 replicas share antecedent-phase parameters and calibration templates. Effective sample size is below 50 in any analysis that assumes i.i.d. replicas. Wilson CIs reported here are lower bounds on true uncertainty.
Score definition-dependence. Any robustness score that aggregates over FM-classified replicas inherits a structural dependence on the FM definitions themselves: refining a definition (e.g. splitting SHARP_CRASH into mild and deep variants) can shift a strategy's score even if its trading behaviour is unchanged. This is a closed-loop risk for any FM-bucketed metric. The roadmap addresses it via two changes: (i) a two-step aggregation that conserves run-level mass before projecting onto FM-buckets, and (ii) a definition-independent score component (raw drawdown distribution across all replicas, no FM-bucketing) that anchors the composite to a non-classifier-derived quantity. Currently the production composite uses the v1 path (no FM-bucketing in composite), which sidesteps the issue at the cost of not exposing per-FM-bucket detail in the score itself; the per-FM-bucket detail is reported separately in the regime-performance breakdown.
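Two of the quantities in the list above, the CI half-width at a given n and the expected number of false positives under multiple testing, are easy to recompute. The sketch below assumes the Wilson score interval with z = 1.96 and treats the implicit tests as independent, which is a simplification; the function names are illustrative.

```python
from math import sqrt


def wilson_half_width(p_hat, n, z=1.96):
    """Half-width of the Wilson score interval at proportion p_hat."""
    denom = 1 + z * z / n
    return z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom


def expected_false_positives(n_tests, alpha=0.05):
    """Expected off-band findings if every test were a true null (independence assumed)."""
    return n_tests * alpha


# Half-width in percentage points at the current and the planned replica counts.
for n in (50, 200):
    print(n, round(100 * wilson_half_width(0.5, n), 1), "pp")

# Range of expected false positives for 60 to 80 implicit tests.
print(expected_false_positives(60), expected_false_positives(80))
```

Moving from n=50 to n=200 roughly halves the half-width, which is why the marginal cases are deferred to a larger generation pass rather than re-tuned now.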

Per-dataset claim validation

Each card lists the synthetic profile, the failure-mode it claims to represent, and the result of comparing the claimed operational properties against the median across 50 Monte Carlo replicas. Expand a card for full claim text, measured values, and aggregated metric distributions.
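The comparison each card reports can be sketched as a band check on the replica median. The function below is an illustration whose messages mirror the wording used in the cards; its name and signature are hypothetical, and one-sided claims are modelled by leaving a bound as None.

```python
def validate_claim(median, lower=None, upper=None):
    """Compare a measured median against a claimed band; bounds may be one-sided."""
    if lower is not None and median < lower:
        return ("off-spec",
                f"Measured median ({median:.4f}) is below the claimed lower bound ({lower:.4f}).")
    if upper is not None and median > upper:
        return ("off-spec",
                f"Measured median ({median:.4f}) is above the claimed upper bound ({upper:.4f}).")
    return ("in-band", "within claimed range")


# Slow-crash GOLD total return: claimed band is -60% to -25%.
print(validate_claim(-0.2158, lower=-0.60, upper=-0.25))
# Hyperinflation-boost GOLD realized vol: one-sided claim, at least 0.195.
print(validate_claim(0.1449, lower=0.195))
```

Note that this check is on the median across replicas, whereas the conformance tables above count replicas individually, so a profile can be in-band here and still show a low per-replica conformance rate.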

Demand Destruction · WTI · TREND_DOWN + VOL_EXPANSION

2 of 2 claims off-spec · 50 replicas · 126d stress + 250d pre-period

Imposed conditions for sustained downward trajectory with vol expansion (oil-specific).

Claim validation

Total return over case (TREND_DOWN aspect) · off-spec

Claim: deep decline, methodology −25% to −60%

Measured median: -0.1071

Measured median (-0.1071) is above the claimed upper bound (-0.2000). The simulator does not produce regime characteristics within the methodology operational band for this profile-asset combination — calibration limit. Resolution path: re-tune the profile (preferred) or revise the regime definition (only after anchor validation).

Realized volatility (VOL_EXPANSION aspect) · off-spec

Claim: ≥ 1.5× WTI baseline (threshold ~0.45); methodology gating

Measured median: 0.2821

Measured median (0.2821) is below the claimed lower bound (0.4500). The simulator does not produce regime characteristics within the methodology operational band for this profile-asset combination — calibration limit. Resolution path: re-tune the profile (preferred) or revise the regime definition (only after anchor validation).

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.3638	-0.2664	-0.1071	0.0271	0.1907
realized_vol_annualized	0.2484	0.2676	0.2821	0.3114	0.3574
max_drawdown	-0.4131	-0.3498	-0.2299	-0.1827	-0.1228
autocorrelation_lag1	-0.1587	-0.0668	0.0208	0.0651	0.1460
kurtosis	-0.5368	-0.3600	-0.1769	0.1396	0.5214
skewness	-0.2876	-0.1341	-0.0188	0.1147	0.2986
tail_p1	-0.0542	-0.0463	-0.0414	-0.0376	-0.0332
tail_p99	0.0310	0.0333	0.0386	0.0435	0.0510
sign_change_frequency	0.4274	0.4677	0.5000	0.5242	0.5573
vol_of_vol	0.0254	0.0361	0.0420	0.0506	0.0617
avg_run_length	1.7835	1.8939	1.9841	2.1186	2.3148
rolling_30d_max_dd	-0.2610	-0.2182	-0.1821	-0.1465	-0.1160
crash_window_vol	0.2106	0.2545	0.2853	0.3242	0.3823
retracement_from_trough	0.0000	0.0317	0.1540	0.4467	1.1719
sign_changes_5pct_count	4.0000	6.2500	9.0000	10.0000	12.0000
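The p5 to p95 columns in these tables summarize one metric across the 50 replicas. A minimal sketch of that aggregation follows, assuming linear-interpolated percentiles (the same rule as NumPy's default); this assumption is consistent with fractional values such as 6.2500 appearing for integer-valued counts, but the actual pipeline is not shown in this document. The function names are illustrative.

```python
def percentile(values, q):
    """Linear-interpolated percentile, q in [0, 100]."""
    xs = sorted(values)
    if not xs:
        raise ValueError("need at least one value")
    pos = (len(xs) - 1) * q / 100.0   # fractional index into the sorted data
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def summarize(values):
    """Five-number summary in the column order used by the tables."""
    labels = {5: "p5", 25: "p25", 50: "median", 75: "p75", 95: "p95"}
    return {label: percentile(values, q) for q, label in labels.items()}


# Illustrative input: 50 made-up replica values (the integers 1..50).
print(summarize(range(1, 51)))
```

With only 50 values, p5 and p95 each interpolate between just two order statistics near the edge of the sample, which is why the tail columns are flagged as indicative rather than precise.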

Source file: demand_destruction_wti.json

Pre-period regime: demand_weakness

Sample seeds (first 5): 7308, 7309, 7310, 7311, 7312

Hyperinflation Boost · GOLD · TREND_UP + VOL_EXPANSION

1 of 2 claims off-spec · 50 replicas · 126d stress + 250d pre-period

Imposed conditions for persistent upward advance with elevated volatility (gold-specific).

Claim validation

Total return over case (TREND_UP aspect) · in-band

Claim: sustained advance, methodology +8% to +25% (annualized 4-6mo)

Measured median: 0.1376

within claimed range

Realized volatility (VOL_EXPANSION aspect) · off-spec

Claim: ≥ 1.5× GOLD baseline (threshold ~0.20); methodology gating

Measured median: 0.1449

Measured median (0.1449) is below the claimed lower bound (0.1950). The simulator does not produce regime characteristics within the methodology operational band for this profile-asset combination — calibration limit. Resolution path: re-tune the profile (preferred) or revise the regime definition (only after anchor validation).

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.0281	0.0549	0.1376	0.2196	0.3741
realized_vol_annualized	0.1253	0.1354	0.1449	0.1575	0.1752
max_drawdown	-0.1421	-0.0935	-0.0726	-0.0568	-0.0365
autocorrelation_lag1	-0.1761	-0.0852	-0.0294	0.0307	0.1078
kurtosis	-0.5930	-0.3644	-0.0940	0.2122	0.8336
skewness	-0.3112	-0.1356	0.0143	0.1949	0.3683
tail_p1	-0.0250	-0.0215	-0.0191	-0.0168	-0.0140
tail_p99	0.0174	0.0195	0.0210	0.0230	0.0254
sign_change_frequency	0.4194	0.4758	0.5202	0.5403	0.5976
vol_of_vol	0.0136	0.0172	0.0203	0.0245	0.0283
avg_run_length	1.6647	1.8382	1.9085	2.0833	2.3585
rolling_30d_max_dd	-0.1111	-0.0830	-0.0670	-0.0566	-0.0365
crash_window_vol	0.1157	0.1300	0.1410	0.1669	0.1885
retracement_from_trough	0.0109	0.3680	1.0731	2.3618	4.3028
sign_changes_5pct_count	1.0000	2.0000	2.5000	3.0000	5.0000

Source file: hyperinflation_boost_gold.json

Pre-period regime: range

Sample seeds (first 5): 6092, 6093, 6094, 6095, 6096

Liquidity Stress Setup Synthetic · BTC

all claims in-band · 50 replicas · 126d stress + 250d pre-period

Claim validation

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.4457	-0.2135	0.0765	0.2747	0.6941
realized_vol_annualized	0.3463	0.3699	0.4183	0.4590	0.5187
max_drawdown	-0.5649	-0.3470	-0.2403	-0.1785	-0.1267
autocorrelation_lag1	-0.1435	-0.0663	-0.0056	0.0537	0.1960
kurtosis	-0.6070	-0.2616	0.0175	0.2046	0.8475
skewness	-0.3902	-0.1329	0.0023	0.1455	0.4234
tail_p1	-0.0758	-0.0620	-0.0574	-0.0517	-0.0443
tail_p99	0.0457	0.0509	0.0583	0.0655	0.0746
sign_change_frequency	0.4194	0.4597	0.4960	0.5242	0.5815
vol_of_vol	0.0377	0.0465	0.0557	0.0660	0.0827
avg_run_length	1.7103	1.8939	2.0001	2.1552	2.3585
rolling_30d_max_dd	-0.3826	-0.2596	-0.2131	-0.1752	-0.1267
crash_window_vol	0.3187	0.3628	0.3997	0.4660	0.5181
retracement_from_trough	0.0000	0.0718	0.4598	0.8969	2.7221
sign_changes_5pct_count	8.4500	11.0000	13.0000	15.0000	19.1000

Source file: liquidity_stress_setup_synthetic_btc.json

Pre-period regime: high_vol_sideways

Sample seeds (first 5): 20800, 20801, 20802, 20803, 20804

Liquidity Stress Setup Synthetic · SPY

all claims in-band · 50 replicas · 126d stress + 250d pre-period

Claim validation

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.4537	-0.2169	-0.0170	0.1149	0.7288
realized_vol_annualized	0.3393	0.3785	0.4216	0.4529	0.4975
max_drawdown	-0.5101	-0.3765	-0.2697	-0.1981	-0.1468
autocorrelation_lag1	-0.1324	-0.0690	-0.0188	0.0392	0.1063
kurtosis	-0.4738	-0.3443	-0.0224	0.1995	0.5106
skewness	-0.3453	-0.1723	-0.0901	0.0740	0.1730
tail_p1	-0.0726	-0.0652	-0.0588	-0.0524	-0.0461
tail_p99	0.0409	0.0493	0.0561	0.0629	0.0704
sign_change_frequency	0.4355	0.4778	0.5040	0.5323	0.5770
vol_of_vol	0.0362	0.0463	0.0568	0.0644	0.0815
avg_run_length	1.7230	1.8657	1.9686	2.0748	2.2727
rolling_30d_max_dd	-0.3223	-0.2847	-0.2367	-0.1851	-0.1465
crash_window_vol	0.3195	0.3540	0.4088	0.4612	0.5454
retracement_from_trough	0.0006	0.0901	0.3839	0.7481	2.7293
sign_changes_5pct_count	7.0000	12.0000	13.0000	15.0000	18.5500

Source file: liquidity_stress_setup_synthetic_spy.json

Pre-period regime: bull_trend

Sample seeds (first 5): 20700, 20701, 20702, 20703, 20704

Liquidity Stress Setup Synthetic · WTI

all claims in-band · 50 replicas · 126d stress + 250d pre-period

Claim validation

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.4856	-0.3045	-0.1496	0.1841	0.6619
realized_vol_annualized	0.4481	0.5011	0.5418	0.5936	0.6440
max_drawdown	-0.5764	-0.4572	-0.3729	-0.2658	-0.1819
autocorrelation_lag1	-0.1613	-0.1107	-0.0287	0.0477	0.1291
kurtosis	-0.5205	-0.2694	-0.1025	0.2244	0.8505
skewness	-0.4631	-0.1165	-0.0045	0.1169	0.3580
tail_p1	-0.0979	-0.0861	-0.0760	-0.0679	-0.0600
tail_p99	0.0563	0.0655	0.0743	0.0867	0.0988
sign_change_frequency	0.4274	0.4758	0.5081	0.5403	0.6048
vol_of_vol	0.0509	0.0607	0.0746	0.0897	0.1112
avg_run_length	1.6447	1.8382	1.9536	2.0833	2.3148
rolling_30d_max_dd	-0.4249	-0.3471	-0.2941	-0.2413	-0.1819
crash_window_vol	0.3735	0.4743	0.5355	0.5849	0.6485
retracement_from_trough	0.0000	0.0866	0.2392	0.7800	1.4794
sign_changes_5pct_count	10.9000	16.0000	19.0000	21.0000	25.0000

Source file: liquidity_stress_setup_synthetic_wti.json

Pre-period regime: range_high_vol

Sample seeds (first 5): 22200, 22201, 22202, 22203, 22204

Low Vol Grind · GOLD · VOL_COMPRESSION

all claims in-band · 50 replicas · 252d stress + 250d pre-period

Sustained suppression of realized volatility over multiple months.

Claim validation

Realized volatility (annualized) · in-band

Claim: 0.4–0.7× GOLD baseline (~0.05–0.09)

Measured median: 0.0709

within claimed range

Maximum drawdown (median) · in-band

Claim: shallow drawdowns < 8% (non-VIX) per methodology

Measured median: -0.0508

within claimed range

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.0504	0.0137	0.0963	0.1298	0.1969
realized_vol_annualized	0.0583	0.0647	0.0709	0.0763	0.0846
max_drawdown	-0.1040	-0.0659	-0.0508	-0.0408	-0.0300
autocorrelation_lag1	-0.0898	-0.0404	0.0088	0.0534	0.0976
kurtosis	-0.3890	-0.2088	-0.0824	0.0638	0.3299
skewness	-0.2809	-0.1231	-0.0003	0.0813	0.1564
tail_p1	-0.0123	-0.0107	-0.0100	-0.0089	-0.0080
tail_p99	0.0083	0.0094	0.0102	0.0111	0.0127
sign_change_frequency	0.4378	0.4650	0.4920	0.5150	0.5400
vol_of_vol	0.0073	0.0089	0.0105	0.0121	0.0140
avg_run_length	1.8456	1.9345	2.0242	2.1408	2.2726
rolling_30d_max_dd	-0.0684	-0.0491	-0.0429	-0.0359	-0.0284
crash_window_vol	0.0510	0.0625	0.0712	0.0795	0.0870
retracement_from_trough	0.0759	0.2629	1.0817	2.0360	3.2660
sign_changes_5pct_count	0.0000	1.0000	1.0000	2.0000	3.0000

Source file: low_vol_grind_gold.json

Pre-period regime: range

Sample seeds (first 5): 1884, 1885, 1886, 1887, 1888

Low Vol Grind · QQQ · VOL_COMPRESSION

all claims in-band · 50 replicas · 252d stress + 250d pre-period

Sustained suppression of realized volatility over multiple months.

Claim validation

Realized volatility (annualized) · in-band

Claim: 0.4–0.7× QQQ baseline (~0.08–0.15)

Measured median: 0.0954

within claimed range

Maximum drawdown (median) · in-band

Claim: shallow drawdowns < 8% (non-VIX) per methodology

Measured median: -0.0731

within claimed range

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.1133	-0.0053	0.0957	0.2044	0.2793
realized_vol_annualized	0.0802	0.0866	0.0954	0.1028	0.1085
max_drawdown	-0.1488	-0.1037	-0.0731	-0.0488	-0.0379
autocorrelation_lag1	-0.1286	-0.0463	-0.0128	0.0135	0.1088
kurtosis	-0.4538	-0.2102	-0.0254	0.1077	0.5295
skewness	-0.2475	-0.1046	0.0043	0.1078	0.2091
tail_p1	-0.0157	-0.0149	-0.0129	-0.0120	-0.0103
tail_p99	0.0113	0.0128	0.0138	0.0151	0.0172
sign_change_frequency	0.4560	0.4760	0.5000	0.5230	0.5502
vol_of_vol	0.0091	0.0124	0.0138	0.0159	0.0182
avg_run_length	1.8116	1.9051	1.9921	2.0917	2.1826
rolling_30d_max_dd	-0.0901	-0.0766	-0.0575	-0.0469	-0.0357
crash_window_vol	0.0661	0.0827	0.0958	0.1053	0.1163
retracement_from_trough	0.0319	0.1755	0.7892	2.0897	4.6784
sign_changes_5pct_count	1.0000	1.0000	2.0000	3.0000	5.0000

Source file: low_vol_grind_qqq.json

Pre-period regime: bull_trend

Sample seeds (first 5): 21700, 21701, 21702, 21703, 21704

Low Vol Grind · SPY · VOL_COMPRESSION

all claims in-band · 50 replicas · 252d stress + 250d pre-period

Sustained suppression of realized volatility over multiple months.

Claim validation

Realized volatility (annualized) · in-band

Claim: 0.4–0.7× SPY baseline (~0.06–0.11)

Measured median: 0.0724

within claimed range

Maximum drawdown (median) · in-band

Claim: shallow drawdowns < 8% (non-VIX) per methodology

Measured median: -0.0476

within claimed range

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.0551	0.0601	0.1489	0.2159	0.3480
realized_vol_annualized	0.0589	0.0667	0.0724	0.0798	0.0851
max_drawdown	-0.1120	-0.0635	-0.0476	-0.0360	-0.0229
autocorrelation_lag1	-0.1053	-0.0474	-0.0169	0.0161	0.0807
kurtosis	-0.4547	-0.2024	-0.0385	0.1492	0.7953
skewness	-0.2613	-0.1237	-0.0252	0.1023	0.2461
tail_p1	-0.0119	-0.0109	-0.0097	-0.0086	-0.0078
tail_p99	0.0080	0.0097	0.0112	0.0121	0.0130
sign_change_frequency	0.4698	0.4770	0.5040	0.5280	0.5524
vol_of_vol	0.0081	0.0098	0.0106	0.0119	0.0145
avg_run_length	1.8046	1.8872	1.9764	2.0873	2.1191
rolling_30d_max_dd	-0.0701	-0.0547	-0.0421	-0.0329	-0.0229
crash_window_vol	0.0527	0.0624	0.0748	0.0827	0.0938
retracement_from_trough	0.0253	0.2950	1.5435	3.6094	11.1908
sign_changes_5pct_count	1.0000	1.0000	1.0000	2.0000	3.0000

Source file: low_vol_grind_spy.json

Pre-period regime: bull_trend

Sample seeds (first 5): 1805, 1806, 1807, 1808, 1809

Low Vol Grind · WTI · VOL_COMPRESSION

1 of 2 claims off-spec · 50 replicas · 252d stress + 250d pre-period

Sustained suppression of realized volatility over multiple months.

Claim validation

Realized volatility (annualized) · in-band

Claim: 0.4–0.7× WTI baseline (~0.12–0.21)

Measured median: 0.1209

within claimed range

Maximum drawdown (median) · off-spec

Claim: shallow drawdowns < 8% (non-VIX) per methodology

Measured median: -0.0932

Measured median (-0.0932) is below the claimed lower bound (-0.0800). The simulator does not produce regime characteristics within the methodology operational band for this profile-asset combination — calibration limit. Resolution path: re-tune the profile (preferred) or revise the regime definition (only after anchor validation).

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.1567	0.0316	0.1442	0.2262	0.4273
realized_vol_annualized	0.1015	0.1090	0.1209	0.1331	0.1457
max_drawdown	-0.2070	-0.1256	-0.0932	-0.0753	-0.0499
autocorrelation_lag1	-0.0906	-0.0437	0.0022	0.0474	0.0985
kurtosis	-0.3226	-0.1610	-0.0539	0.2060	0.5123
skewness	-0.2224	-0.0860	0.0138	0.1050	0.2223
tail_p1	-0.0208	-0.0192	-0.0168	-0.0153	-0.0140
tail_p99	0.0137	0.0156	0.0179	0.0198	0.0226
sign_change_frequency	0.4538	0.4800	0.5000	0.5200	0.5480
vol_of_vol	0.0128	0.0158	0.0180	0.0203	0.0242
avg_run_length	1.8188	1.9160	1.9921	2.0744	2.1931
rolling_30d_max_dd	-0.1053	-0.0883	-0.0798	-0.0665	-0.0471
crash_window_vol	0.0851	0.1027	0.1185	0.1419	0.1529
retracement_from_trough	0.0273	0.2638	0.9685	2.0036	4.8065
sign_changes_5pct_count	1.0000	3.0000	5.0000	5.0000	6.0000

Source file: low_vol_grind_wti.json

Pre-period regime: demand_weakness

Sample seeds (first 5): 21800, 21801, 21802, 21803, 21804

Sharp Crash Setup Synthetic · BTC · SHARP_CRASH (conditional)

1 of 2 claims off-spec · 50 replicas · 126d stress + 250d pre-period

Imposed conditions for an institutional risk-off shock. The agent-based simulator produces a spectrum of outcomes; per methodology expectations, 20–50% of replicas should conform to the SHARP_CRASH gating (≥20% rolling-30d DD AND ≥1.5× crash-window vol).

Claim validation

Realized volatility over case (descriptive) · off-spec

Claim: elevated case-wide vol; aggregate proxy for crash-window vol-spike

Measured median: 0.2949

Measured median (0.2949) is below the claimed lower bound (0.8000). The simulator does not produce regime characteristics within the methodology operational band for this profile-asset combination — calibration limit. Resolution path: re-tune the profile (preferred) or revise the regime definition (only after anchor validation). (crypto-asset; relative-to-baseline criterion)

Excess kurtosis (tail-modality descriptive) · in-band

Claim: fat-tailed return distribution typical of SHARP_CRASH events; ≥ 1.5 indicates tail-intensified sub-classification

Measured median: 5.9362

within claimed range (crypto-asset; relative-to-baseline criterion)

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.6619	-0.5041	-0.0109	0.1627	0.4066
realized_vol_annualized	0.2248	0.2600	0.2949	0.3216	0.3859
max_drawdown	-0.6877	-0.5595	-0.2641	-0.1426	-0.0921
autocorrelation_lag1	-0.2024	-0.0936	-0.0152	0.1273	0.1994
kurtosis	2.5854	4.4152	5.9362	7.9041	10.3334
skewness	-1.3412	-0.8534	-0.2275	0.9028	1.6424
tail_p1	-0.0906	-0.0699	-0.0578	-0.0488	-0.0381
tail_p99	0.0340	0.0394	0.0556	0.0676	0.0885
sign_change_frequency	0.3181	0.3810	0.4435	0.5000	0.5403
vol_of_vol	0.1177	0.1455	0.1788	0.2019	0.2643
avg_run_length	1.8382	1.9841	2.2321	2.5909	3.0907
rolling_30d_max_dd	-0.4272	-0.3282	-0.2458	-0.1426	-0.0921
crash_window_vol	0.1753	0.3462	0.4513	0.5458	0.6103
retracement_from_trough	0.0000	0.0000	0.2454	0.7919	3.2848
sign_changes_5pct_count	2.0000	4.0000	5.0000	7.0000	8.5500

Source file: sharp_crash_setup_synthetic_btc.json

Pre-period regime: parabolic_bull

Sample seeds (first 5): 20200, 20201, 20202, 20203, 20204

Sharp Crash Setup Synthetic · QQQ · SHARP_CRASH (conditional)

all claims in-band · 50 replicas · 126d stress + 250d pre-period

Claim validation

Realized volatility over case (descriptive) · in-band

Claim: elevated case-wide vol; aggregate proxy for crash-window vol-spike

Measured median: 0.3133

within claimed range

Excess kurtosis (tail-modality descriptive) · in-band

Claim: fat-tailed return distribution typical of SHARP_CRASH events; ≥ 1.5 indicates tail-intensified sub-classification

Measured median: 6.2548

within claimed range

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.6899	-0.4724	-0.1776	0.2033	0.5351
realized_vol_annualized	0.2418	0.2800	0.3133	0.3492	0.3967
max_drawdown	-0.7075	-0.4974	-0.2775	-0.1414	-0.0766
autocorrelation_lag1	-0.2526	-0.0697	-0.0119	0.0976	0.2633
kurtosis	3.3019	4.9795	6.2548	7.2740	10.2887
skewness	-1.8674	-1.0467	-0.2210	0.4757	1.4983
tail_p1	-0.0914	-0.0750	-0.0647	-0.0544	-0.0318
tail_p99	0.0264	0.0468	0.0543	0.0667	0.0825
sign_change_frequency	0.3262	0.3810	0.4516	0.5081	0.5484
vol_of_vol	0.1375	0.1674	0.1883	0.2193	0.2594
avg_run_length	1.8116	1.9531	2.1930	2.5909	3.0161
rolling_30d_max_dd	-0.4632	-0.3198	-0.2396	-0.1386	-0.0727
crash_window_vol	0.1251	0.3261	0.4760	0.5637	0.6755
retracement_from_trough	0.0000	0.0000	0.1723	0.7110	2.5357
sign_changes_5pct_count	3.0000	5.0000	6.0000	8.7500	11.0000

Source file: sharp_crash_setup_synthetic_qqq.json

Pre-period regime: sideways_low_vol

Sample seeds (first 5): 20100, 20101, 20102, 20103, 20104

Sharp Crash Setup Synthetic · SPY · SHARP_CRASH (conditional)

all claims in-band · 50 replicas · 126d stress + 250d pre-period

Claim validation

Realized volatility over case (descriptive) · in-band

Claim: elevated case-wide vol; aggregate proxy for crash-window vol-spike

Measured median: 0.2878

within claimed range

Excess kurtosis (tail-modality descriptive) · in-band

Claim: fat-tailed return distribution typical of SHARP_CRASH events; ≥ 1.5 indicates tail-intensified sub-classification

Measured median: 5.7450

within claimed range

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.5786	-0.3213	0.0487	0.2260	0.4576
realized_vol_annualized	0.2151	0.2641	0.2878	0.3195	0.3709
max_drawdown	-0.5994	-0.4070	-0.1733	-0.1287	-0.0758
autocorrelation_lag1	-0.2870	-0.0941	0.0470	0.1566	0.2519
kurtosis	2.9215	4.2095	5.7450	7.2636	11.9259
skewness	-1.4953	-0.5312	-0.0115	0.6529	1.7288
tail_p1	-0.0810	-0.0643	-0.0532	-0.0455	-0.0320
tail_p99	0.0354	0.0462	0.0563	0.0610	0.0738
sign_change_frequency	0.3145	0.4294	0.4758	0.5141	0.5565
vol_of_vol	0.1199	0.1444	0.1794	0.2022	0.2409
avg_run_length	1.7857	1.9306	2.0833	2.3043	3.1250
rolling_30d_max_dd	-0.4193	-0.2842	-0.1716	-0.1278	-0.0758
crash_window_vol	0.1374	0.2977	0.4082	0.5053	0.6430
retracement_from_trough	0.0000	0.0182	0.4204	1.1417	3.5336
sign_changes_5pct_count	3.0000	5.0000	5.0000	7.0000	9.0000

Source file: sharp_crash_setup_synthetic_spy.json

Pre-period regime: bull_trend

Sample seeds (first 5): 20000, 20001, 20002, 20003, 20004

Slow Crash No Recovery Synthetic · GOLD · SLOW_BEAR + TREND_DOWN (adversarial)

2 of 2 claims off-spec · 50 replicas · 126d stress + 250d pre-period

Adversarial probe: amplified bear drift plus sustained dip-buyer suppression vetoes the rebounds present in real historical slow bears. Tests strategies under conditions that are rare in the strict historical record. Expected SLOW_BEAR conformance: 40–80%.

Claim validation

Total return over case · off-spec

Claim: persistent decline with no rebound, methodology −25% to −60%

Measured median: -0.2158

Measured median (-0.2158) is above the claimed upper bound (-0.2500). The simulator does not produce regime characteristics within the methodology operational band for this profile-asset combination — calibration limit. Resolution path: re-tune the profile (preferred) or revise the regime definition (only after anchor validation).

Maximum drawdown (median) · off-spec

Claim: drawdowns deepen monotonically; methodology −25% to −70%

Measured median: -0.2231

Measured median (-0.2231) is above the claimed upper bound (-0.2500). The simulator does not produce regime characteristics within the methodology operational band for this profile-asset combination — calibration limit. Resolution path: re-tune the profile (preferred) or revise the regime definition (only after anchor validation).

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.4264	-0.3362	-0.2158	-0.1213	0.0026
realized_vol_annualized	0.1385	0.1490	0.1573	0.1626	0.1815
max_drawdown	-0.4384	-0.3493	-0.2231	-0.1803	-0.1002
autocorrelation_lag1	-0.1772	-0.0737	-0.0232	0.0538	0.1158
kurtosis	-0.6404	-0.2935	-0.0781	0.0934	0.4456
skewness	-0.2781	-0.1115	0.0108	0.1426	0.2619
tail_p1	-0.0287	-0.0252	-0.0236	-0.0220	-0.0198
tail_p99	0.0155	0.0176	0.0198	0.0215	0.0240
sign_change_frequency	0.3988	0.4677	0.4919	0.5202	0.5565
vol_of_vol	0.0126	0.0191	0.0225	0.0266	0.0306
avg_run_length	1.7857	1.9087	2.0161	2.1186	2.4779
rolling_30d_max_dd	-0.2112	-0.1752	-0.1384	-0.1131	-0.0779
crash_window_vol	0.1184	0.1396	0.1527	0.1647	0.1920
retracement_from_trough	0.0000	0.0069	0.0218	0.0913	0.3853
sign_changes_5pct_count	1.0000	1.0000	2.5000	3.0000	5.0000
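
Each row of the table above summarises one metric across the 50 replicas at five percentiles. A minimal sketch of that aggregation, assuming NumPy's default linear interpolation between order statistics: fractional values for count metrics elsewhere in this snapshot (for example a p95 of 4.5500) are consistent with that choice, but the actual tooling is not stated. The function name and the input counts are hypothetical.

```python
import numpy as np


def aggregate(values):
    """Summarise one metric across replicas at the five reported percentiles."""
    p5, p25, p50, p75, p95 = np.percentile(
        np.asarray(values, dtype=float), [5, 25, 50, 75, 95]
    )
    return {"p5": p5, "p25": p25, "median": p50, "p75": p75, "p95": p95}


# Hypothetical per-replica counts for 50 replicas (not taken from the dataset).
counts = [1] * 20 + [2] * 5 + [3] * 15 + [4] * 7 + [5] * 3
summary = aggregate(counts)
print(summary)
```

With n=50 the p5 and p95 positions fall between the 3rd/4th and 47th/48th order statistics, so each tail percentile rests on only a handful of replicas.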

Source file: slow_crash_no_recovery_synthetic_gold.json

Pre-period regime: inflationary_upcycle

Sample seeds (first 5): 21600, 21601, 21602, 21603, 21604

Slow Crash No Recovery Synthetic·QQQ·SLOW_BEAR + TREND_DOWN (adversarial)

all claims in-band·50 replicas·126d stress + 250d pre-period

Claim validation

Total return over case·in-band

Claim: persistent decline with no rebound, methodology −25% to −60%

Measured median: -0.2469

within claimed range

Maximum drawdown (median)·in-band

Claim: drawdowns deepen monotonically; methodology −25% to −70%

Measured median: -0.2634

within claimed range

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.4081	-0.3323	-0.2469	-0.1530	-0.0052
realized_vol_annualized	0.1338	0.1488	0.1589	0.1675	0.1789
max_drawdown	-0.4236	-0.3393	-0.2634	-0.2067	-0.1214
autocorrelation_lag1	-0.1658	-0.0670	0.0092	0.0636	0.1257
kurtosis	-0.6537	-0.3536	-0.0936	0.1508	0.6021
skewness	-0.3088	-0.0832	0.0494	0.2217	0.3738
tail_p1	-0.0280	-0.0257	-0.0237	-0.0210	-0.0194
tail_p99	0.0136	0.0178	0.0194	0.0231	0.0255
sign_change_frequency	0.3871	0.4516	0.4839	0.5060	0.5565
vol_of_vol	0.0126	0.0176	0.0204	0.0264	0.0294
avg_run_length	1.7857	1.9609	2.0492	2.1930	2.5510
rolling_30d_max_dd	-0.2044	-0.1699	-0.1389	-0.1209	-0.0933
crash_window_vol	0.1177	0.1433	0.1591	0.1699	0.1838
retracement_from_trough	0.0000	0.0035	0.0447	0.0923	0.7401
sign_changes_5pct_count	1.0000	1.0000	3.0000	3.0000	5.0000

Source file: slow_crash_no_recovery_synthetic_qqq.json

Pre-period regime: sideways_low_vol

Sample seeds (first 5): 21500, 21501, 21502, 21503, 21504

Slow Crash No Recovery Synthetic·SPY·SLOW_BEAR + TREND_DOWN (adversarial)

1 of 2 claims off-spec·50 replicas·126d stress + 250d pre-period

Claim validation

Total return over case·off-spec

Claim: persistent decline with no rebound, methodology −25% to −60%

Measured median: -0.2113

Measured median (-0.2113) is above the claimed upper bound (-0.2500). The simulator does not produce regime characteristics within the methodology operational band for this profile-asset combination — calibration limit. Resolution path: re-tune the profile (preferred) or revise the regime definition (only after anchor validation).

Maximum drawdown (median)·in-band

Claim: drawdowns deepen monotonically; methodology −25% to −70%

Measured median: -0.2518

within claimed range

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.4951	-0.3244	-0.2113	-0.1576	-0.0381
realized_vol_annualized	0.1379	0.1464	0.1588	0.1714	0.1826
max_drawdown	-0.5008	-0.3601	-0.2518	-0.1928	-0.0934
autocorrelation_lag1	-0.2217	-0.1024	-0.0068	0.0542	0.1254
kurtosis	-0.5522	-0.3238	-0.1602	0.0745	0.5375
skewness	-0.3151	-0.1329	-0.0050	0.1519	0.4396
tail_p1	-0.0298	-0.0251	-0.0234	-0.0221	-0.0196
tail_p99	0.0160	0.0172	0.0192	0.0210	0.0243
sign_change_frequency	0.3871	0.4315	0.4758	0.5302	0.5726
vol_of_vol	0.0130	0.0187	0.0204	0.0266	0.0323
avg_run_length	1.7361	1.8727	2.0833	2.2942	2.5510
rolling_30d_max_dd	-0.2215	-0.1814	-0.1411	-0.1198	-0.0771
crash_window_vol	0.1267	0.1429	0.1562	0.1737	0.1976
retracement_from_trough	0.0000	0.0000	0.0265	0.1043	0.4142
sign_changes_5pct_count	1.0000	1.0000	3.0000	3.0000	5.0000

Source file: slow_crash_no_recovery_synthetic_spy.json

Pre-period regime: bull_trend

Sample seeds (first 5): 21400, 21401, 21402, 21403, 21404

Slow Decline With Partial Recovery·GOLD·SIDEWAYS-with-bearish-bias (mixed FM ex-post)

all claims in-band·50 replicas·126d stress + 250d pre-period

Imposed conditions for a moderate bear regime in which mid-period rebounds are permitted. Per-replica ex-post FM distribution: primarily SIDEWAYS (42-64%) and WHIPSAW (22-38%), with scattered SLOW_BEAR/TREND_DOWN. The mild bearish drift produces sideways-with-bear-bias rather than full SLOW_BEAR; this is an empirical characterization, not a profile failure.

Claim validation

Total return over case·in-band

Claim: moderate decline with mid-period rebounds permitted, typically −5% to −20%

Measured median: -0.1107

within claimed range

Maximum drawdown (median)·in-band

Claim: moderate drawdowns, typically −10% to −25%

Measured median: -0.1447

within claimed range

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.2390	-0.1663	-0.1107	-0.0546	0.0623
realized_vol_annualized	0.1177	0.1252	0.1394	0.1494	0.1648
max_drawdown	-0.2564	-0.1861	-0.1447	-0.1060	-0.0631
autocorrelation_lag1	-0.1353	-0.0590	-0.0207	0.0267	0.1264
kurtosis	-0.5161	-0.3360	-0.0948	0.2411	0.7237
skewness	-0.3890	-0.1696	0.0037	0.1697	0.2720
tail_p1	-0.0247	-0.0216	-0.0205	-0.0182	-0.0159
tail_p99	0.0134	0.0171	0.0183	0.0198	0.0239
sign_change_frequency	0.4068	0.4597	0.4919	0.5161	0.5609
vol_of_vol	0.0122	0.0155	0.0198	0.0241	0.0304
avg_run_length	1.7719	1.9231	2.0166	2.1552	2.4298
rolling_30d_max_dd	-0.1502	-0.1132	-0.0946	-0.0820	-0.0567
crash_window_vol	0.0959	0.1126	0.1333	0.1548	0.1824
retracement_from_trough	0.0000	0.0000	0.0733	0.2125	1.3921
sign_changes_5pct_count	1.0000	1.0000	3.0000	3.0000	4.5500

Source file: slow_decline_with_partial_recovery_gold.json

Pre-period regime: demand_weakness

Sample seeds (first 5): 2526, 2527, 2528, 2529, 2530

Slow Decline With Partial Recovery·QQQ·SIDEWAYS-with-bearish-bias (mixed FM ex-post)

all claims in-band·50 replicas·126d stress + 250d pre-period

Claim validation

Total return over case·in-band

Claim: moderate decline with mid-period rebounds permitted, typically −5% to −20%

Measured median: -0.0790

within claimed range

Maximum drawdown (median)·in-band

Claim: moderate drawdowns, typically −10% to −25%

Measured median: -0.1326

within claimed range

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.2068	-0.1428	-0.0790	0.0055	0.0961
realized_vol_annualized	0.1192	0.1268	0.1381	0.1448	0.1560
max_drawdown	-0.2267	-0.1862	-0.1326	-0.0825	-0.0651
autocorrelation_lag1	-0.1394	-0.0526	-0.0015	0.0400	0.1528
kurtosis	-0.5297	-0.2771	-0.0300	0.3043	0.7323
skewness	-0.2455	-0.0917	-0.0016	0.1519	0.2956
tail_p1	-0.0238	-0.0205	-0.0193	-0.0176	-0.0157
tail_p99	0.0150	0.0168	0.0181	0.0196	0.0239
sign_change_frequency	0.4391	0.4677	0.5000	0.5323	0.5887
vol_of_vol	0.0111	0.0150	0.0190	0.0224	0.0260
avg_run_length	1.6892	1.8657	1.9841	2.1186	2.2545
rolling_30d_max_dd	-0.1420	-0.1128	-0.0954	-0.0727	-0.0617
crash_window_vol	0.1109	0.1224	0.1381	0.1524	0.1701
retracement_from_trough	0.0000	0.0436	0.1548	0.4053	1.3773
sign_changes_5pct_count	1.0000	2.0000	2.0000	3.0000	5.0000

Source file: slow_decline_with_partial_recovery_qqq.json

Pre-period regime: sideways_low_vol

Sample seeds (first 5): 3099, 3100, 3101, 3102, 3103

Slow Decline With Partial Recovery·SPY·SIDEWAYS-with-bearish-bias (mixed FM ex-post)

all claims in-band·50 replicas·126d stress + 250d pre-period

Claim validation

Total return over case·in-band

Claim: moderate decline with mid-period rebounds permitted, typically −5% to −20%

Measured median: -0.0732

within claimed range

Maximum drawdown (median)·in-band

Claim: moderate drawdowns, typically −10% to −25%

Measured median: -0.1266

within claimed range

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.2000	-0.1273	-0.0732	-0.0197	0.0582
realized_vol_annualized	0.1236	0.1313	0.1392	0.1549	0.1633
max_drawdown	-0.2268	-0.1644	-0.1266	-0.0976	-0.0695
autocorrelation_lag1	-0.1567	-0.0916	-0.0048	0.0348	0.1067
kurtosis	-0.6124	-0.2530	-0.0595	0.1335	0.5513
skewness	-0.3220	-0.1267	0.0869	0.2282	0.3996
tail_p1	-0.0254	-0.0215	-0.0197	-0.0179	-0.0163
tail_p99	0.0147	0.0175	0.0196	0.0207	0.0228
sign_change_frequency	0.4274	0.4758	0.5000	0.5464	0.5806
vol_of_vol	0.0126	0.0149	0.0181	0.0209	0.0281
avg_run_length	1.7123	1.8182	1.9841	2.0833	2.3148
rolling_30d_max_dd	-0.1320	-0.1154	-0.0967	-0.0841	-0.0565
crash_window_vol	0.1059	0.1240	0.1369	0.1485	0.1776
retracement_from_trough	0.0000	0.0767	0.2311	0.4131	0.9960
sign_changes_5pct_count	1.0000	2.0000	3.0000	4.0000	5.5500

Source file: slow_decline_with_partial_recovery_spy.json

Pre-period regime: weak_bear

Sample seeds (first 5): 2447, 2448, 2449, 2450, 2451

Slow Stagflation·QQQ·SLOW_BEAR + VOL_EXPANSION

1 of 2 claims off-spec·50 replicas·126d stress + 250d pre-period

Imposed conditions for persistent decline combined with elevated realized volatility.

Claim validation

Total return over case (SLOW_BEAR aspect)·off-spec

Claim: moderate-to-deep decline, methodology −25% to −60%

Measured median: -0.1522

Measured median (-0.1522) is above the claimed upper bound (-0.2000). The simulator does not produce regime characteristics within the methodology operational band for this profile-asset combination — calibration limit. Resolution path: re-tune the profile (preferred) or revise the regime definition (only after anchor validation).

Realized volatility (VOL_EXPANSION aspect)·in-band

Claim: ≥ 1.5× QQQ baseline (threshold ~0.32); methodology gating

Measured median: 0.3671

within claimed range

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.4128	-0.2403	-0.1522	0.0418	0.2273
realized_vol_annualized	0.2940	0.3375	0.3671	0.3914	0.4126
max_drawdown	-0.4523	-0.3794	-0.2951	-0.2113	-0.1541
autocorrelation_lag1	-0.1473	-0.0544	0.0069	0.0719	0.1179
kurtosis	-0.5811	-0.2699	-0.0284	0.1653	0.6724
skewness	-0.3003	-0.1353	0.0971	0.2955	0.4440
tail_p1	-0.0627	-0.0570	-0.0502	-0.0456	-0.0366
tail_p99	0.0372	0.0442	0.0482	0.0558	0.0643
sign_change_frequency	0.4302	0.4597	0.4839	0.5242	0.5734
vol_of_vol	0.0318	0.0378	0.0513	0.0615	0.0773
avg_run_length	1.7340	1.8939	2.0492	2.1552	2.3016
rolling_30d_max_dd	-0.3388	-0.2564	-0.2298	-0.1895	-0.1485
crash_window_vol	0.2674	0.3097	0.3560	0.4004	0.4372
retracement_from_trough	0.0000	0.0513	0.1544	0.4308	1.8010
sign_changes_5pct_count	6.0000	9.2500	11.5000	13.0000	15.5500

Source file: slow_stagflation_qqq.json

Pre-period regime: sideways_low_vol

Sample seeds (first 5): 9334, 9335, 9336, 9337, 9338

Slow Stagflation·SPY·SLOW_BEAR + VOL_EXPANSION

1 of 2 claims off-spec·50 replicas·126d stress + 250d pre-period

Imposed conditions for persistent decline combined with elevated realized volatility.

Claim validation

Total return over case (SLOW_BEAR aspect)·off-spec

Claim: moderate-to-deep decline, methodology −25% to −60%

Measured median: -0.0588

Measured median (-0.0588) is above the claimed upper bound (-0.2000). The simulator does not produce regime characteristics within the methodology operational band for this profile-asset combination — calibration limit. Resolution path: re-tune the profile (preferred) or revise the regime definition (only after anchor validation).

Realized volatility (VOL_EXPANSION aspect)·in-band

Claim: ≥ 1.5× SPY baseline (threshold ~0.24); methodology gating

Measured median: 0.2776

within claimed range

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.2968	-0.2082	-0.0588	0.1226	0.2725
realized_vol_annualized	0.2081	0.2545	0.2776	0.2977	0.3206
max_drawdown	-0.3656	-0.2832	-0.2119	-0.1472	-0.1004
autocorrelation_lag1	-0.1156	-0.0555	-0.0126	0.0482	0.1211
kurtosis	-0.6120	-0.4353	-0.1054	0.1322	0.4153
skewness	-0.3063	-0.1060	0.0194	0.1270	0.2362
tail_p1	-0.0477	-0.0425	-0.0378	-0.0343	-0.0265
tail_p99	0.0274	0.0336	0.0364	0.0415	0.0459
sign_change_frequency	0.4230	0.4597	0.5000	0.5323	0.5689
vol_of_vol	0.0205	0.0291	0.0359	0.0402	0.0538
avg_run_length	1.7471	1.8657	1.9841	2.1552	2.3388
rolling_30d_max_dd	-0.2700	-0.2143	-0.1713	-0.1279	-0.1004
crash_window_vol	0.1947	0.2405	0.2657	0.3004	0.3437
retracement_from_trough	0.0000	0.1131	0.2107	0.7435	1.9871
sign_changes_5pct_count	4.0000	6.0000	7.0000	10.0000	11.0000

Source file: slow_stagflation_spy.json

Pre-period regime: sideways_low_vol

Sample seeds (first 5): 22000, 22001, 22002, 22003, 22004

V Recovery Setup Synthetic·BTC

all claims in-band·50 replicas·126d stress + 250d pre-period

Claim validation

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.5557	-0.3711	0.1845	0.3790	0.5878
realized_vol_annualized	0.1699	0.1888	0.2068	0.2293	0.2600
max_drawdown	-0.5596	-0.4274	-0.1458	-0.0961	-0.0533
autocorrelation_lag1	-0.2268	-0.1045	-0.0049	0.0856	0.2376
kurtosis	0.7805	2.2580	3.2584	4.8691	6.7696
skewness	-1.3674	-0.7778	-0.2114	0.1544	1.0544
tail_p1	-0.0587	-0.0458	-0.0399	-0.0323	-0.0224
tail_p99	0.0176	0.0285	0.0338	0.0421	0.0584
sign_change_frequency	0.3548	0.3952	0.4435	0.4839	0.5484
vol_of_vol	0.0575	0.0808	0.1021	0.1267	0.1521
avg_run_length	1.8116	2.0492	2.2321	2.5000	2.7778
rolling_30d_max_dd	-0.3192	-0.2373	-0.1440	-0.0950	-0.0533
crash_window_vol	0.1217	0.2451	0.2950	0.3642	0.4370
retracement_from_trough	0.0000	0.0030	1.3836	3.3646	7.8972
sign_changes_5pct_count	1.0000	2.0000	3.0000	5.0000	6.5500

Source file: v_recovery_setup_synthetic_btc.json

Pre-period regime: parabolic_bull

Sample seeds (first 5): 21200, 21201, 21202, 21203, 21204

V Recovery Setup Synthetic·GOLD

all claims in-band·50 replicas·126d stress + 250d pre-period

Claim validation

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.5619	-0.1288	0.1283	0.2794	0.6204
realized_vol_annualized	0.1634	0.2004	0.2213	0.2443	0.2766
max_drawdown	-0.5789	-0.2796	-0.1550	-0.1006	-0.0605
autocorrelation_lag1	-0.2069	-0.0541	0.0190	0.1410	0.2051
kurtosis	1.6301	2.8955	3.7089	5.4811	8.5521
skewness	-1.6644	-0.8615	-0.3409	0.2422	0.8821
tail_p1	-0.0635	-0.0485	-0.0408	-0.0331	-0.0265
tail_p99	0.0190	0.0292	0.0352	0.0454	0.0542
sign_change_frequency	0.3504	0.4113	0.4677	0.5000	0.5250
vol_of_vol	0.0633	0.0888	0.1115	0.1341	0.1616
avg_run_length	1.8915	1.9841	2.1186	2.4038	2.8125
rolling_30d_max_dd	-0.3516	-0.2645	-0.1429	-0.1006	-0.0605
crash_window_vol	0.1496	0.2635	0.3020	0.3671	0.4475
retracement_from_trough	0.0000	0.2578	1.6257	2.4624	6.6188
sign_changes_5pct_count	1.0000	3.0000	4.0000	5.0000	7.5500

Source file: v_recovery_setup_synthetic_gold.json

Pre-period regime: inflationary_upcycle

Sample seeds (first 5): 21300, 21301, 21302, 21303, 21304

V Recovery Setup Synthetic·QQQ

all claims in-band·50 replicas·126d stress + 250d pre-period

Claim validation

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.5071	-0.4292	0.0917	0.2410	0.6022
realized_vol_annualized	0.1634	0.1883	0.2156	0.2416	0.2673
max_drawdown	-0.5310	-0.4513	-0.1638	-0.1048	-0.0640
autocorrelation_lag1	-0.1680	-0.0384	0.0153	0.0993	0.2214
kurtosis	1.5096	2.7270	3.9481	5.9140	9.0464
skewness	-1.5279	-0.6057	-0.1882	0.3581	1.1029
tail_p1	-0.0573	-0.0454	-0.0376	-0.0338	-0.0247
tail_p99	0.0201	0.0274	0.0377	0.0424	0.0499
sign_change_frequency	0.3335	0.3952	0.4516	0.5081	0.5448
vol_of_vol	0.0653	0.0864	0.1087	0.1297	0.1442
avg_run_length	1.8236	1.9531	2.1930	2.5000	2.9552
rolling_30d_max_dd	-0.2969	-0.2642	-0.1586	-0.0993	-0.0617
crash_window_vol	0.1163	0.2293	0.2989	0.3516	0.4481
retracement_from_trough	0.0000	0.0041	0.9632	2.3685	4.8157
sign_changes_5pct_count	1.0000	3.0000	3.0000	5.0000	6.0000

Source file: v_recovery_setup_synthetic_qqq.json

Pre-period regime: bull_trend

Sample seeds (first 5): 21100, 21101, 21102, 21103, 21104

V Recovery Setup Synthetic·SPY

all claims in-band·50 replicas·126d stress + 250d pre-period

Claim validation

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.5756	-0.4696	0.0409	0.2060	0.4113
realized_vol_annualized	0.1751	0.2071	0.2310	0.2482	0.2891
max_drawdown	-0.5997	-0.4849	-0.1776	-0.1094	-0.0586
autocorrelation_lag1	-0.2204	-0.0658	0.0229	0.0974	0.2472
kurtosis	1.3745	2.8337	3.7582	4.7667	7.3036
skewness	-1.4454	-0.7465	-0.2714	0.1480	0.7863
tail_p1	-0.0630	-0.0507	-0.0442	-0.0328	-0.0238
tail_p99	0.0222	0.0301	0.0362	0.0428	0.0548
sign_change_frequency	0.3460	0.4113	0.4597	0.4919	0.5565
vol_of_vol	0.0737	0.0994	0.1163	0.1307	0.1691
avg_run_length	1.7857	2.0161	2.1552	2.4038	2.8488
rolling_30d_max_dd	-0.3622	-0.2534	-0.1709	-0.0935	-0.0586
crash_window_vol	0.1153	0.2487	0.3126	0.3864	0.4715
retracement_from_trough	0.0000	0.0000	0.5560	2.0622	5.1363
sign_changes_5pct_count	1.4500	3.0000	4.0000	5.7500	7.0000

Source file: v_recovery_setup_synthetic_spy.json

Pre-period regime: weak_bear

Sample seeds (first 5): 21000, 21001, 21002, 21003, 21004

Vol Expansion Setup Synthetic·BTC·VOL_EXPANSION (conditional)

1 of 2 claims off-spec·50 replicas·252d stress + 250d pre-period

Imposed conditions for sustained volatility elevation via persistent HFT withdrawal and vol-trader amplification. Per-replica conformance to the VOL_EXPANSION gating (median ≥1.5× baseline AND ≥2 distinct windows ≥1.5×) is expected by the methodology to fall in the 60-90% band.
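
The gating rule quoted above can be sketched per replica as follows. This is an illustration under stated assumptions, not the methodology's implementation: the 21-day window length, the use of non-overlapping windows as the meaning of "distinct", and the function name `vol_expansion_gate` are all hypothetical. The only figures taken from the text are the 1.5× multiple and the BTC baseline of 0.80 annualised.

```python
import numpy as np

TRADING_DAYS = 252


def vol_expansion_gate(log_returns, baseline_vol, window=21, multiple=1.5):
    """Per-replica VOL_EXPANSION check on non-overlapping windows.

    baseline_vol is the asset's annualised baseline volatility. The window
    length and the non-overlapping layout are assumptions of this sketch.
    """
    r = np.asarray(log_returns, dtype=float)
    n_windows = len(r) // window
    if n_windows == 0:
        return False
    blocks = r[: n_windows * window].reshape(n_windows, window)
    window_vol = blocks.std(axis=1, ddof=1) * np.sqrt(TRADING_DAYS)
    threshold = multiple * baseline_vol
    median_ok = np.median(window_vol) >= threshold                   # median >= 1.5x baseline
    enough_windows = np.count_nonzero(window_vol >= threshold) >= 2  # >= 2 distinct windows
    return bool(median_ok and enough_windows)


# Threshold arithmetic from the text: BTC baseline 0.80 gives 1.5 * 0.80 = 1.20.
btc_threshold = 1.5 * 0.80

# Hypothetical Gaussian replicas, for illustration only.
rng = np.random.default_rng(0)
calm = rng.normal(0.0, 0.16 / np.sqrt(TRADING_DAYS), size=252)
stressed = rng.normal(0.0, 0.40 / np.sqrt(TRADING_DAYS), size=252)

print(vol_expansion_gate(calm, baseline_vol=0.16))
print(vol_expansion_gate(stressed, baseline_vol=0.16))
```

Because the threshold scales with the baseline, an asset with a high baseline such as BTC needs annualised volatility of about 1.20 to pass, which is the identifiability limit discussed earlier in this document.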

Claim validation

Realized volatility (VOL_EXPANSION median gating)·off-spec

Claim: ≥ 1.5× BTC baseline (threshold ~1.20); methodology gating

Measured median: 0.3176

Measured median (0.3176) is below the claimed lower bound (1.2000). The simulator does not produce regime characteristics within the methodology operational band for this profile-asset combination — calibration limit. Resolution path: re-tune the profile (preferred) or revise the regime definition (only after anchor validation). (crypto-asset; relative-to-baseline criterion)

Total return over case (descriptive)·in-band

Claim: direction-neutral by design; ±20% is acceptable spread

Measured median: 0.1198

within claimed range (crypto-asset; relative-to-baseline criterion)

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.3613	-0.0384	0.1198	0.2892	0.6930
realized_vol_annualized	0.2131	0.2690	0.3176	0.3809	0.4310
max_drawdown	-0.4499	-0.3387	-0.2644	-0.2150	-0.1525
autocorrelation_lag1	-0.0920	-0.0474	0.0008	0.0313	0.1016
kurtosis	-0.3665	-0.2411	-0.0959	0.1080	0.4589
skewness	-0.1605	-0.0622	0.0261	0.1211	0.2114
tail_p1	-0.0610	-0.0535	-0.0437	-0.0372	-0.0291
tail_p99	0.0305	0.0377	0.0465	0.0544	0.0630
sign_change_frequency	0.4578	0.4720	0.4960	0.5150	0.5360
vol_of_vol	0.0309	0.0374	0.0448	0.0552	0.0677
avg_run_length	1.8593	1.9345	2.0080	2.1092	2.1741
rolling_30d_max_dd	-0.3185	-0.2447	-0.1979	-0.1726	-0.1369
crash_window_vol	0.2080	0.2578	0.3084	0.3776	0.4289
retracement_from_trough	0.0618	0.2658	0.5251	0.9212	2.4241
sign_changes_5pct_count	8.4500	13.2500	18.0000	24.0000	29.1000

Source file: vol_expansion_setup_synthetic_btc.json

Pre-period regime: unstable_decay

Sample seeds (first 5): 20500, 20501, 20502, 20503, 20504

Vol Expansion Setup Synthetic·GOLD·VOL_EXPANSION (conditional)

all claims in-band·50 replicas·252d stress + 250d pre-period

Claim validation

Realized volatility (VOL_EXPANSION median gating)·in-band

Claim: ≥ 1.5× GOLD baseline (threshold ~0.20); methodology gating

Measured median: 0.2450

within claimed range

Total return over case (descriptive)·in-band

Claim: direction-neutral by design; ±20% is acceptable spread

Measured median: -0.0019

within claimed range

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.3317	-0.1405	-0.0019	0.1565	0.4506
realized_vol_annualized	0.1538	0.2002	0.2450	0.2921	0.3321
max_drawdown	-0.4378	-0.3006	-0.2395	-0.1854	-0.1170
autocorrelation_lag1	-0.1028	-0.0550	-0.0119	0.0310	0.0995
kurtosis	-0.4785	-0.2409	-0.0826	0.1432	0.4571
skewness	-0.2258	-0.1074	-0.0038	0.1114	0.1947
tail_p1	-0.0527	-0.0419	-0.0335	-0.0283	-0.0212
tail_p99	0.0212	0.0279	0.0349	0.0423	0.0462
sign_change_frequency	0.4316	0.4840	0.5000	0.5240	0.5600
vol_of_vol	0.0211	0.0279	0.0331	0.0439	0.0580
avg_run_length	1.7801	1.9015	1.9921	2.0574	2.3051
rolling_30d_max_dd	-0.2578	-0.2040	-0.1598	-0.1384	-0.0917
crash_window_vol	0.1412	0.1927	0.2193	0.2725	0.3356
retracement_from_trough	0.0000	0.1581	0.3538	0.6966	2.0628
sign_changes_5pct_count	4.4500	8.0000	13.0000	17.0000	20.5500

Source file: vol_expansion_setup_synthetic_gold.json

Pre-period regime: range

Sample seeds (first 5): 22100, 22101, 22102, 22103, 22104

Vol Expansion Setup Synthetic·QQQ·VOL_EXPANSION (conditional)

all claims in-band·50 replicas·252d stress + 250d pre-period

Claim validation

Realized volatility (VOL_EXPANSION median gating)·in-band

Claim: ≥ 1.5× QQQ baseline (threshold ~0.32); methodology gating

Measured median: 0.3486

within claimed range

Total return over case (descriptive)·in-band

Claim: direction-neutral by design; ±20% is acceptable spread

Measured median: 0.0390

within claimed range

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.3287	-0.1364	0.0390	0.2435	1.0154
realized_vol_annualized	0.2309	0.2751	0.3486	0.3793	0.4186
max_drawdown	-0.5432	-0.3749	-0.2590	-0.2144	-0.1526
autocorrelation_lag1	-0.1135	-0.0475	-0.0073	0.0596	0.1043
kurtosis	-0.4245	-0.2219	0.0210	0.1449	0.4823
skewness	-0.2105	-0.1162	0.0506	0.1186	0.2045
tail_p1	-0.0612	-0.0530	-0.0476	-0.0394	-0.0325
tail_p99	0.0322	0.0409	0.0485	0.0550	0.0642
sign_change_frequency	0.4436	0.4770	0.5040	0.5200	0.5462
vol_of_vol	0.0344	0.0403	0.0495	0.0559	0.0637
avg_run_length	1.8248	1.9160	1.9764	2.0873	2.2433
rolling_30d_max_dd	-0.3320	-0.2592	-0.2036	-0.1749	-0.1297
crash_window_vol	0.2099	0.2746	0.3151	0.3667	0.4042
retracement_from_trough	0.0193	0.1215	0.3224	1.0688	3.2534
sign_changes_5pct_count	9.4500	15.0000	19.0000	24.0000	29.5500

Source file: vol_expansion_setup_synthetic_qqq.json

Pre-period regime: weak_bear

Sample seeds (first 5): 20400, 20401, 20402, 20403, 20404

Vol Expansion Setup Synthetic·SPY·VOL_EXPANSION (conditional)

all claims in-band·50 replicas·252d stress + 250d pre-period

Claim validation

Realized volatility (VOL_EXPANSION median gating)·in-band

Claim: ≥ 1.5× SPY baseline (threshold ~0.24); methodology gating

Measured median: 0.3166

within claimed range

Total return over case (descriptive)·in-band

Claim: direction-neutral by design; ±20% is acceptable spread

Measured median: 0.1018

within claimed range

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.4036	-0.1318	0.1018	0.3436	0.7392
realized_vol_annualized	0.2200	0.2899	0.3166	0.3546	0.4131
max_drawdown	-0.5270	-0.3277	-0.2431	-0.1888	-0.1407
autocorrelation_lag1	-0.0915	-0.0580	-0.0111	0.0275	0.0822
kurtosis	-0.3909	-0.1229	0.0074	0.2307	0.5636
skewness	-0.2965	-0.1360	0.0208	0.1048	0.2847
tail_p1	-0.0619	-0.0536	-0.0449	-0.0405	-0.0284
tail_p99	0.0328	0.0407	0.0450	0.0518	0.0605
sign_change_frequency	0.4618	0.4800	0.5020	0.5190	0.5560
vol_of_vol	0.0321	0.0395	0.0478	0.0563	0.0654
avg_run_length	1.7929	1.9197	1.9842	2.0744	2.1555
rolling_30d_max_dd	-0.2914	-0.2391	-0.1966	-0.1627	-0.1134
crash_window_vol	0.2157	0.2617	0.3100	0.3682	0.4411
retracement_from_trough	0.0015	0.2157	0.4930	0.9854	3.7619
sign_changes_5pct_count	8.4500	13.5000	19.0000	22.0000	27.5500

Source file: vol_expansion_setup_synthetic_spy.json

Pre-period regime: bull_trend

Sample seeds (first 5): 20300, 20301, 20302, 20303, 20304

Whipsaw Synthetic·BTC·WHIPSAW + SIDEWAYS

all claims in-band·50 replicas·126d stress + 250d pre-period

Repeated directional reversals with bounded range and no decisive resolution.

Claim validation

Sign-change frequency (descriptive proxy)·in-band

Claim: elevated frequency of return-direction reversals (≥ 40% of bars). This is a descriptive proxy; the true criterion (≥3-5 sign changes of >5% magnitude) is a per-replica check (Phase 17).

Measured median: 0.5081

within claimed range (crypto-asset; relative-to-baseline criterion)

Total return over case·in-band

Claim: near-zero net direction; methodology ±10%

Measured median: 0.0183

within claimed range (crypto-asset; relative-to-baseline criterion)

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.1714	-0.0730	0.0183	0.0791	0.2134
realized_vol_annualized	0.1262	0.1355	0.1472	0.1675	0.1884
max_drawdown	-0.2062	-0.1442	-0.0993	-0.0751	-0.0553
autocorrelation_lag1	-0.1304	-0.0847	0.0000	0.0521	0.0993
kurtosis	-0.5816	-0.3570	-0.1155	0.1197	0.5667
skewness	-0.3690	-0.1387	-0.0449	0.0286	0.3156
tail_p1	-0.0296	-0.0236	-0.0204	-0.0182	-0.0164
tail_p99	0.0154	0.0181	0.0205	0.0235	0.0265
sign_change_frequency	0.4310	0.4839	0.5081	0.5242	0.5726
vol_of_vol	0.0124	0.0162	0.0204	0.0237	0.0309
avg_run_length	1.7361	1.8939	1.9531	2.0492	2.2959
rolling_30d_max_dd	-0.1318	-0.1027	-0.0852	-0.0687	-0.0547
crash_window_vol	0.1163	0.1355	0.1527	0.1734	0.1910
retracement_from_trough	0.0000	0.1167	0.3954	0.9190	2.8287
sign_changes_5pct_count	1.0000	2.0000	3.0000	4.0000	6.0000

Source file: whipsaw_synthetic_btc.json

Pre-period regime: unstable_decay

Sample seeds (first 5): 9442, 9443, 9444, 9445, 9446

Whipsaw Synthetic·ETH·WHIPSAW + SIDEWAYS

all claims in-band·50 replicas·126d stress + 250d pre-period

Repeated directional reversals with bounded range and no decisive resolution.

Claim validation

Sign-change frequency (descriptive proxy)·in-band

Claim: elevated frequency of return-direction reversals (≥ 40% of bars). This is a descriptive proxy; the true criterion (≥3-5 sign changes of >5% magnitude) is a per-replica check (Phase 17).

Measured median: 0.5000

within claimed range (crypto-asset; relative-to-baseline criterion)

Total return over case·in-band

Claim: near-zero net direction; methodology ±10%

Measured median: 0.0680

within claimed range (crypto-asset; relative-to-baseline criterion)

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.0734	-0.0381	0.0680	0.1430	0.2737
realized_vol_annualized	0.1249	0.1379	0.1544	0.1641	0.1786
max_drawdown	-0.1611	-0.1361	-0.0969	-0.0700	-0.0446
autocorrelation_lag1	-0.1642	-0.0820	-0.0086	0.0702	0.1486
kurtosis	-0.6546	-0.2653	-0.1235	0.1219	0.4866
skewness	-0.3891	-0.1559	-0.0224	0.1522	0.2944
tail_p1	-0.0259	-0.0228	-0.0208	-0.0189	-0.0171
tail_p99	0.0165	0.0192	0.0211	0.0242	0.0281
sign_change_frequency	0.4149	0.4758	0.5000	0.5242	0.5609
vol_of_vol	0.0129	0.0174	0.0202	0.0240	0.0310
avg_run_length	1.7719	1.8939	1.9841	2.0833	2.3834
rolling_30d_max_dd	-0.1376	-0.1081	-0.0811	-0.0683	-0.0423
crash_window_vol	0.1091	0.1332	0.1412	0.1677	0.2056
retracement_from_trough	0.0072	0.1825	0.4768	1.2396	4.3610
sign_changes_5pct_count	1.0000	2.0000	3.0000	4.0000	5.5500

Source file: whipsaw_synthetic_eth.json

Pre-period regime: unstable_decay

Sample seeds (first 5): 7169, 7170, 7171, 7172, 7173

Whipsaw Synthetic·QQQ·WHIPSAW + SIDEWAYS

all claims in-band·50 replicas·126d stress + 250d pre-period

Repeated directional reversals with bounded range and no decisive resolution.

Claim validation

Sign-change frequency (descriptive proxy)·in-band

Claim: elevated frequency of return-direction reversals (≥ 40% of bars). This is a descriptive proxy; the true criterion (≥3-5 sign changes of >5% magnitude) is a per-replica check (Phase 17).

Measured median: 0.5081

within claimed range

Total return over case·in-band

Claim: near-zero net direction; methodology ±10%

Measured median: -0.0008

within claimed range

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.1468	-0.0533	-0.0008	0.0608	0.2332
realized_vol_annualized	0.1282	0.1397	0.1482	0.1652	0.1768
max_drawdown	-0.1926	-0.1309	-0.1009	-0.0830	-0.0576
autocorrelation_lag1	-0.1464	-0.0586	0.0065	0.0481	0.1453
kurtosis	-0.5444	-0.2918	-0.1782	0.1159	0.9083
skewness	-0.2575	-0.0837	0.0059	0.1191	0.2150
tail_p1	-0.0257	-0.0230	-0.0208	-0.0192	-0.0168
tail_p99	0.0169	0.0190	0.0208	0.0227	0.0265
sign_change_frequency	0.4383	0.4778	0.5081	0.5403	0.5895
vol_of_vol	0.0129	0.0163	0.0210	0.0258	0.0299
avg_run_length	1.6872	1.8382	1.9531	2.0748	2.2600
rolling_30d_max_dd	-0.1381	-0.1088	-0.0873	-0.0716	-0.0576
crash_window_vol	0.1146	0.1291	0.1460	0.1639	0.1832
retracement_from_trough	0.0000	0.1792	0.4158	0.7625	2.6864
sign_changes_5pct_count	1.4500	3.0000	4.0000	4.0000	5.5500

Source file: whipsaw_synthetic_qqq.json

Pre-period regime: sideways_low_vol

Sample seeds (first 5): 7721, 7722, 7723, 7724, 7725

Whipsaw Synthetic·SPY·WHIPSAW + SIDEWAYS

all claims in-band·50 replicas·126d stress + 250d pre-period

Repeated directional reversals with bounded range and no decisive resolution.

Claim validation

Sign-change frequency (descriptive proxy)·in-band

Claim: elevated frequency of return-direction reversals (≥ 40% of bars). This is a descriptive proxy; the true criterion (≥3-5 sign changes of >5% magnitude) is a per-replica check (Phase 17).

Measured median: 0.4919

within claimed range

Total return over case·in-band

Claim: near-zero net direction; methodology ±10%

Measured median: 0.0254

within claimed range

Aggregated metrics across 50 replicas

Metric	p5	p25	median	p75	p95
total_return	-0.1676	-0.0499	0.0254	0.0960	0.2642
realized_vol_annualized	0.1323	0.1414	0.1530	0.1679	0.1764
max_drawdown	-0.1975	-0.1406	-0.0998	-0.0776	-0.0567
autocorrelation_lag1	-0.1175	-0.0412	-0.0001	0.0822	0.1491
kurtosis	-0.5544	-0.3363	-0.1051	0.2382	0.7455
skewness	-0.5647	-0.2063	-0.0261	0.0704	0.2056
tail_p1	-0.0265	-0.0234	-0.0211	-0.0199	-0.0170
tail_p99	0.0161	0.0185	0.0199	0.0240	0.0257
sign_change_frequency	0.4149	0.4597	0.4919	0.5222	0.5492
vol_of_vol	0.0132	0.0178	0.0216	0.0252	0.0302
avg_run_length	1.8094	1.9012	2.0161	2.1552	2.3834
rolling_30d_max_dd	-0.1289	-0.1130	-0.0886	-0.0756	-0.0567
crash_window_vol	0.1329	0.1434	0.1552	0.1755	0.1895
retracement_from_trough	0.0548	0.2299	0.6135	1.2322	3.5027
sign_changes_5pct_count	1.0000	2.2500	3.0000	4.0000	6.0000

Source file: whipsaw_synthetic_spy.json

Pre-period regime: bull_trend

Sample seeds (first 5): 8941, 8942, 8943, 8944, 8945

Metric definitions

total_return: Cumulative log-return over the stress-period window (warmup excluded).
realized_vol_annualized: Standard deviation of daily log-returns × √252.
max_drawdown: Peak-to-trough drawdown of the cumulative-equity curve (negative value).
autocorrelation_lag1: Pearson correlation of log-returns at lag 1.
kurtosis: Excess kurtosis of log-returns (zero = Gaussian).
skewness: Pearson skewness of log-returns.
tail_p1: 1st-percentile single-bar log-return (left-tail proxy).
tail_p99: 99th-percentile single-bar log-return (right-tail proxy).
sign_change_frequency: Fraction of bars where the return changes sign vs. the previous bar.
vol_of_vol: Standard deviation of rolling 20-day realized volatility.
avg_run_length: Average length of consecutive same-sign return runs (regime-persistence proxy).
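
The definitions above can be sketched for a single replica as follows. This is a minimal illustration, not the snapshot's implementation: the function name `replica_metrics`, the sample (ddof=1) standard deviation for volatility, the annualisation of the rolling volatility, and the treatment of a zero return as its own sign are all assumptions.

```python
import numpy as np

TRADING_DAYS = 252


def replica_metrics(log_returns, vol_window=20):
    """Compute the listed per-replica metrics from daily log-returns.

    Requires len(log_returns) >= vol_window.
    """
    r = np.asarray(log_returns, dtype=float)
    centred = r - r.mean()
    sd = r.std()  # population standard deviation, used for the moment ratios

    # Equity curve starting at 1.0 so that a first-bar loss counts as drawdown.
    equity = np.concatenate(([1.0], np.exp(np.cumsum(r))))
    drawdown = equity / np.maximum.accumulate(equity) - 1.0

    # Rolling realized volatility, annualised (annualisation is an assumption).
    windows = np.lib.stride_tricks.sliding_window_view(r, vol_window)
    rolling_vol = windows.std(axis=1, ddof=1) * np.sqrt(TRADING_DAYS)

    # A sign change is counted when a bar's sign differs from the previous bar's.
    changes = np.sign(r[1:]) != np.sign(r[:-1])
    n_runs = int(changes.sum()) + 1

    return {
        "total_return": r.sum(),
        "realized_vol_annualized": r.std(ddof=1) * np.sqrt(TRADING_DAYS),
        "max_drawdown": drawdown.min(),
        "autocorrelation_lag1": np.corrcoef(r[:-1], r[1:])[0, 1],
        "kurtosis": np.mean(centred**4) / sd**4 - 3.0,
        "skewness": np.mean(centred**3) / sd**3,
        "tail_p1": np.percentile(r, 1),
        "tail_p99": np.percentile(r, 99),
        "sign_change_frequency": changes.mean(),
        "vol_of_vol": rolling_vol.std(ddof=1),
        "avg_run_length": len(r) / n_runs,
    }


# Hypothetical Gaussian replica, for illustration only.
rng = np.random.default_rng(1)
example = replica_metrics(rng.normal(0.0, 0.01, size=125))
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Under these conventions the number of runs is always one more than the number of sign changes, so `avg_run_length` and `sign_change_frequency` carry the same information for a given replica.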