Technical reference

Simulator & Validation Reference

The deep companion to the methodology: how the simulator is built, the gate every scenario must pass, the historical bands we validate against, the character of each validated regime, and how assets move together in a crisis.

1 · What the simulator is

A regime-switching stochastic-volatility model — not a replay of history and not random noise. A market moves through four regimes — BULL · SIDEWAYS · BEAR · CRISIS — and each imposes a characteristic drift and volatility level. The path is not locked to one regime: it transitions between them along a matrix estimated from real index history (1999–2025), which reproduces the historical regime distribution (roughly 40 / 39 / 11 / 10 %), so the alternation of calm and stressed stretches matches how real markets behave.
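As a minimal sketch of the regime layer described above, the following Python simulates a four-state Markov chain and computes its long-run regime shares. The transition matrix `P` is hypothetical (the matrix estimated from 1999–2025 index history is not reproduced here), and `stationary` and `simulate_regimes` are illustrative names, not the engine's API.

```python
import random

REGIMES = ["BULL", "SIDEWAYS", "BEAR", "CRISIS"]

# Hypothetical transition matrix, one row per current regime; rows sum to 1.
# These numbers are placeholders, NOT the estimated matrix.
P = [
    [0.97, 0.02, 0.005, 0.005],
    [0.02, 0.97, 0.005, 0.005],
    [0.02, 0.02, 0.94, 0.02],
    [0.02, 0.02, 0.03, 0.93],
]

def stationary(P, iters=5000):
    """Long-run regime shares, found by repeatedly applying pi <- pi * P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def simulate_regimes(P, n_days, start=0, seed=0):
    """One regime path: each day's regime is drawn from the current row of P."""
    rng = random.Random(seed)
    state, path = start, []
    for _ in range(n_days):
        path.append(state)
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return path

shares = dict(zip(REGIMES, stationary(P)))
path = simulate_regimes(P, n_days=252)
```

With an estimated matrix in place of `P`, comparing `shares` with the historical regime distribution is one way to check that the chain reproduces it.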

Volatility clustering is produced by a persistent volatility process layered on that regime oscillation: a turbulent day raises the volatility state for the days that follow (an AR(1)-like memory, roughly a four-day half-life), so large moves arrive in clusters rather than spread evenly. This is the single most universal property of real returns, and it is the model’s load-bearing feature — the admissibility gate in §3 enforces it.
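To make the "four-day half-life" concrete: in an AR(1) process a shock decays by a factor phi per day, so a half-life of h days corresponds to phi = 0.5 ** (1 / h). The sketch below uses that relation to drive a persistent log-volatility state. It is a generic stochastic-volatility stand-in with independent volatility shocks, not the engine's actual process; `base_vol`, `vol_of_vol` and the function names are assumptions chosen for illustration.

```python
import math
import random

def phi_from_half_life(half_life_days):
    """AR(1) coefficient whose shocks decay to one half after half_life_days."""
    return 0.5 ** (1.0 / half_life_days)

def simulate_returns(n_days, base_vol=0.01, half_life_days=4.0,
                     vol_of_vol=0.25, seed=0):
    """Daily returns scaled by a persistent log-volatility state h.

    All parameter values are hypothetical defaults.
    """
    rng = random.Random(seed)
    phi = phi_from_half_life(half_life_days)
    h = 0.0  # log-volatility deviation from the base level
    out = []
    for _ in range(n_days):
        # Today's volatility state remembers a fraction phi of yesterday's.
        h = phi * h + vol_of_vol * rng.gauss(0.0, 1.0)
        out.append(base_vol * math.exp(h) * rng.gauss(0.0, 1.0))
    return out

phi = phi_from_half_life(4.0)  # about 0.84
```

A four-day half-life therefore implies a daily persistence of roughly 0.84: high enough for turbulent days to cluster, low enough that the state relaxes within a couple of weeks.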

Per-asset character comes from calibrating each asset to its own price history — its volatility level, tail thickness, drawdown profile and regime mix — not from hand-set knobs. The per-asset targets are the reference bands in §4.

2 · Per-asset calibration & coverage

Every asset is calibrated to its own historical statistics, and a simulated path is admissible only when its measured properties fall inside that asset’s bands (§4) — not a generic default. Clustering, volatility and tail thickness differ markedly across assets, so a "low" clustering figure on gold or crypto can be asset-appropriate rather than a defect.

Coverage. The example strategy reports span six assets: SPY, QQQ, GOLD, WTI, BTC and ETH (a seventh, VIX, is held back; see §9). The live portfolio-stress substrate covers four (SPY, TLT, GOLD, BTC), the set for which the cross-asset coupling (§6) is calibrated.

3 · The admissibility gate: volatility clustering

A synthetic regime is only a valid market test if it reproduces the most fundamental stylized fact: positive volatility clustering (the lag-5 autocorrelation of absolute returns, "acf5", per Cont, 2001). A scenario with zero or negative clustering is an anti-market — a strategy "optimised" against it is optimising against a simulation artifact. So acf5 is a hard gate: every deployable regime must clear it, against the asset’s own historical band (clustering is asset-specific — see §4).

This gate is load-bearing: clearing it is the difference between a plausible market and noise. Every regime in the catalog below passes it, measured against its own asset’s band.

4 · Per-asset reference bands

The historical targets we validate against — medians over rolling 252-day windows. Clustering is structurally lower on GOLD / BTC than on the equity indices, so a "low" acf5 there can be asset-appropriate, not a bug.

asset	acf5	vol (ann.)	kurtosis	skew
SPY	+0.106	0.164	1.51	−0.27
QQQ	+0.111	0.200	1.51	−0.24
GOLD	+0.057	0.159	1.77	−0.23
WTI	+0.093	0.320	1.12	−0.16
BTC	+0.062	0.548	2.91	−0.04
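The band construction described above (medians over rolling 252-day windows) can be sketched as follows. Several details are assumptions, since the page does not state them: population moments, annualisation by sqrt(252), kurtosis reported as excess kurtosis, and a 21-day step between windows. All function names are illustrative.

```python
import math
from statistics import median

def ann_vol(r, periods=252):
    """Annualised volatility from daily returns (population variance)."""
    m = sum(r) / len(r)
    var = sum((x - m) ** 2 for x in r) / len(r)
    return math.sqrt(var * periods)

def _standardised_moment(r, k):
    """k-th central moment divided by the k-th power of the standard deviation."""
    m = sum(r) / len(r)
    m2 = sum((x - m) ** 2 for x in r) / len(r)
    mk = sum((x - m) ** k for x in r) / len(r)
    return mk / m2 ** (k / 2)

def skew(r):
    return _standardised_moment(r, 3)

def excess_kurtosis(r):
    return _standardised_moment(r, 4) - 3.0  # 0 for a normal distribution

def rolling_median(returns, stat, window=252, step=21):
    """Median of `stat` over windows of `window` days, moved `step` days at a time."""
    values = [stat(returns[i:i + window])
              for i in range(0, len(returns) - window + 1, step)]
    if not values:
        raise ValueError("need at least one full window")
    return median(values)
```

Calling `rolling_median(history, ann_vol)` on an asset's daily return history would give the kind of figure shown in the vol column; the same call with `skew` or `excess_kurtosis` covers the other moment columns.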

5 · The validated synthetic regime catalog

Thirty-two deployable regimes, validated at n = 50 replicas. The figures are the character of the regime itself — the market the strategy is tested in — not a strategy outcome or a forecast. ret is the median return over the stress window, acf5 the median clustering, vol annualised. All clear the gate (acf5 > 0).

s = sustained (the regime evolves over the full window) · f = sharp-event (the crash window is held, which is why acf5 is high by design). Harsh magnitudes (slow-crash, liquidity-stress) are intentionally adversarial / event-appropriate.

regime	asset	ret	acf5	vol	kind
slow_crash_no_recovery	SPY	−47%	+0.115	0.21	s
slow_crash_no_recovery	QQQ	−50%	+0.095	0.22	s
slow_crash_no_recovery	GOLD	−53%	+0.086	0.21	s
slow_stagflation	SPY	−27%	+0.043	0.26	s
slow_stagflation	QQQ	−20%	+0.060	0.27	s
slow_decline	SPY	−15%	+0.076	0.18	s
slow_decline	QQQ	−15%	+0.065	0.18	s
slow_decline	GOLD	−24%	+0.091	0.18	s
demand_destruction	WTI	−13%	+0.093	0.26	s
liquidity_stress	SPY	−26%	+0.051	0.35	s
liquidity_stress	WTI	−27%	+0.043	0.46	s
liquidity_stress	BTC	−15%	+0.055	0.39	s
vol_expansion	SPY	+10%	+0.163	0.24	s
vol_expansion	QQQ	+6%	+0.162	0.27	s
vol_expansion	GOLD	−1%	+0.159	0.27	s
vol_expansion	BTC	+25%	+0.109	0.25	s
whipsaw	SPY	+2%	+0.050	0.17	s
whipsaw	QQQ	+5%	+0.035	0.15	s
whipsaw	BTC	+7%	+0.043	0.16	s
whipsaw	ETH	+11%	+0.049	0.17	s
low_vol_grind	SPY	+11%	+0.077	0.10	s
low_vol_grind	QQQ	+14%	+0.068	0.11	s
low_vol_grind	GOLD	+5%	+0.015	0.10	s
low_vol_grind	WTI	−0%	+0.074	0.11	s
hyperinflation_boost	GOLD	+29%	+0.042	0.17	s
sharp_crash	SPY	−4%	+0.333	0.29	f
sharp_crash	QQQ	−12%	+0.321	0.29	f
sharp_crash	BTC	−5%	+0.306	0.29	f
v_recovery	SPY	+7%	+0.233	0.23	f
v_recovery	QQQ	+11%	+0.255	0.23	f
v_recovery	GOLD	+11%	+0.208	0.22	f
v_recovery	BTC	−3%	+0.207	0.22	f

6 · Cross-asset coupling & the stock-bond hedge

For portfolio stress, several assets are simulated jointly and coupled by a three-part factor channel: a regime-gated equity-risk factor that strengthens correlations as markets fall, a persistent duration factor shared by Treasuries and gold, and an episodic rate-shock factor. The result reproduces the property a portfolio stress test exists to surface: a stock-bond hedge that holds in calm markets can break when both legs fall together. The table compares each pair’s calm-to-crisis correlation shift, real history versus the coupled model.

pair	calm (real / model)	crisis (real / model)
SPY–TLT	−0.16 / −0.04	−0.42 / −0.42
SPY–BTC	+0.08 / +0.07	+0.45 / +0.30
TLT–GOLD	+0.19 / +0.32	+0.13 / +0.15
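To illustrate how a regime-gated factor can move a pair's correlation between calm and crisis, here is a deliberately reduced sketch with a single shared equity-risk factor. Each asset's return is modelled as loading times factor plus independent noise, and the loading depends on the regime. The loadings and idiosyncratic volatilities are hypothetical, and the duration and rate-shock factors are omitted, so the output is not meant to match the table.

```python
import math

# Hypothetical loadings on one unit-variance equity-risk factor, by regime.
LOADINGS = {
    "calm":   {"SPY": 1.0, "BTC": 0.1},
    "crisis": {"SPY": 1.0, "BTC": 0.8},
}
# Hypothetical idiosyncratic volatilities, in the same units as the factor.
IDIO = {"SPY": 0.5, "BTC": 2.0}

def implied_corr(a, b, regime):
    """Correlation of r_i = beta_i(regime) * f + idio_i * eps_i for assets a, b.

    f and the eps_i are independent with unit variance, so the covariance
    is beta_a * beta_b and each variance is beta_i**2 + idio_i**2.
    """
    ba, bb = LOADINGS[regime][a], LOADINGS[regime][b]
    var_a = ba ** 2 + IDIO[a] ** 2
    var_b = bb ** 2 + IDIO[b] ** 2
    return ba * bb / math.sqrt(var_a * var_b)

calm = implied_corr("SPY", "BTC", "calm")
crisis = implied_corr("SPY", "BTC", "crisis")
```

Raising only the crisis loading is enough to lift the pair's correlation in the stressed regime while leaving the calm correlation near zero, which is the qualitative pattern in the SPY–BTC row.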

The 2022 hedge break is modelled the way it actually happened: not as a spike in day-to-day correlation but as a shared drawdown. On a rate shock, equities and long Treasuries lose ground together (a 60/40 portfolio takes roughly a −14 % worst-episode drawdown in that scenario, versus the −10 % a drawdown-stop would have held it to). That is the "60/40 loses both legs" failure a stress tool must show. Honest boundary: the model adds value in the regime-conditional correlation and the joint-crash tail dependence; single-asset daily marginals are at parity with cheap baselines, and the SPY–BTC crisis magnitude is undershot (+0.30 modelled versus +0.45 real). The full numeric comparison against a Gaussian-copula baseline is on the verifiability page.
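The shared-drawdown measure can be sketched as a worst peak-to-trough loss on a blended path. The helper names are illustrative, and daily rebalancing to fixed 60/40 weights is an assumption made to keep the example short; the engine's own rebalancing rule is not stated here.

```python
def blend(equity_returns, bond_returns, w_equity=0.6):
    """Daily portfolio returns, rebalanced to fixed weights every day (assumption)."""
    return [w_equity * e + (1.0 - w_equity) * b
            for e, b in zip(equity_returns, bond_returns)]

def max_drawdown(returns):
    """Worst peak-to-trough loss of the compounded path, as a negative fraction."""
    value, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        value *= 1.0 + r
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst

# Illustrative inputs only: both legs fall, so the bond leg offers no offset.
both_fall = max_drawdown(blend([-0.02] * 5, [-0.01] * 5))
# Bonds rally while equities fall: the hedge absorbs the loss.
hedged = max_drawdown(blend([-0.02] * 5, [0.03] * 5))
```

The contrast between `both_fall` and `hedged` is the point of measuring drawdown rather than correlation: the failure shows up in the path of the portfolio even when day-to-day correlation looks benign.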

7 · Backtest correctness

OHLC bars hide the intra-bar order of events: when price touches both a stop-loss and a take-profit within the same bar, the execution order is ambiguous. Many engines resolve that optimistically. This engine applies the methodology of Löw, Maier-Paape & Platen (2015), defaulting to worst-case execution (the unfavourable order is assumed). Best-case and ignore modes are available as explicit bounds. The conservative default yields lower-bound performance estimates rather than optimistic ones.
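The ambiguity and its three resolutions can be sketched for a long position as below. This is a simplified illustration rather than the engine's code: it ignores gap opens, and the reading of "ignore" as "leave the position open on an ambiguous bar" is our assumption. `Bar` and `resolve_long_exit` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float

def resolve_long_exit(bar, stop, target, mode="worst"):
    """Exit price for a long position within this bar, or None if it stays open."""
    hit_stop = bar.low <= stop
    hit_target = bar.high >= target
    if hit_stop and hit_target:
        # Both levels were touched; the bar does not say which came first.
        if mode == "worst":
            return stop      # assume the unfavourable order
        if mode == "best":
            return target    # assume the favourable order
        if mode == "ignore":
            return None      # assumption: skip the ambiguous bar
        raise ValueError(f"unknown mode: {mode}")
    if hit_stop:
        return stop
    if hit_target:
        return target
    return None

ambiguous = Bar(open=100.0, high=106.0, low=94.0, close=101.0)
worst = resolve_long_exit(ambiguous, stop=95.0, target=105.0)
best = resolve_long_exit(ambiguous, stop=95.0, target=105.0, mode="best")
```

Running a backtest once in worst-case mode and once in best-case mode brackets the result: the true outcome under any intra-bar ordering lies between the two.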

8 · How we validate

Stylized-fact validation. The model’s daily stylized facts are measured over 200 random 252-day windows of the simulated paths and compared, fact by fact, to bands derived identically from real history. SPY and TLT match all eighteen; GOLD and BTC match seventeen of eighteen (kurtosis below band — see §9).
Value vs a cheap baseline. The cross-asset model is compared numerically to a Gaussian copula: it adds regime-conditional correlation that strengthens in a crisis, and joint-crash tail dependence that the copula cannot capture, while single-asset marginals are at parity with cheap baselines (no unique edge claimed there). Stated, then checked, on /verifiability.
Anchored bands. Every claimed sign and magnitude is checked against the historical reference band at the point of use, so an out-of-band figure is caught before it propagates.
Pre-registration. Hypotheses and falsification thresholds are registered before a run, to prevent reading success into noise.
Statistical power. A single passing KS-test at small n is not "indistinguishable from historical". Magnitude is validated at the deploy sample size, not a cheap screening size — small samples are optimistically biased for high-variance regimes.
Claim–evidence alignment. "Solved" has to hold across the full shape, not one favourable descriptor.

9 · Documented limits

What the model does not yet do well:

Short-horizon shape

Daily lag-1 autocorrelation runs slightly higher than the historical reference (over-clustered at the shortest lag). A cross-scale refinement closes most of the gap; it is not yet in the shipped engine.

Tail thickness on GOLD / BTC

The synthetic daily tails for gold and crypto are thinner than the historical reference — kurtosis comes out below the asset band. Disclosed as a residual; it is the one stylized fact (of eighteen) those two assets miss.

Below-band clustering

A few regimes (liquidity-stress at extreme volatility, the calm grind profiles) clear the gate with a positive acf5 that nonetheless sits below their asset’s typical clustering band. Disclosed; inherent to extreme-vol or calm-bull paths, not a defect.

Intraday structure

The engine is built for daily bars; intraday volatility is uniform (no open-spike seasonality). Irrelevant at daily resolution; multi-resolution would need a separate time-of-day lever.

Multi-episode crashes

A crash with a genuine relief rally followed by a second decline is not yet generated reliably, and is not part of the published scenario set.

VIX

VIX is not currently offered. The simulator cannot yet generate reliable synthetic volatility-index dynamics — its tail behaviour comes out too thin — so VIX is held back until a future iteration handles it.

10 · References

Hamilton, J. D. (1989) "A New Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle" Econometrica 57(2), 357–384. The Markov regime-switching framework the simulator’s regime layer is built on.
Cont, R. (2001) "Empirical Properties of Asset Returns: Stylized Facts and Statistical Issues" Quantitative Finance 1(2), 223–236. The empirical regularities (vol-clustering, fat tails, regime persistence) the synthetic paths are validated against.
Löw, R. W., Maier-Paape, S. & Platen, A. (2015) "Correctness of Backtest Engines" arXiv:1509.08248. Handling ambiguous intra-bar events; the engine implements their worst-case execution model.


Operated under German jurisdiction (BaFin / WpHG framework). Model-based scenario simulation — descriptive, not advisory; scenario-based, not a forecast. Not investment advice.