How it works
Methodology
CrashTestYourStrategy runs a trading strategy — or a whole portfolio — through a wide range of market scenarios and reports how its behaviour holds up: where it stays robust and where it degrades.
It is a model-based scenario simulation, and one boundary defines everything below: it is descriptive, not predictive. It does not forecast returns or tell you whether a strategy is "good". It shows how a strategy behaves under defined conditions — historical and hypothetical — so you can see its failure modes before a real market does. This page explains the concepts; the full validated detail lives in the technical reference.
How we generate markets — a regime-switching simulator
Most stress tests either replay history or shuffle random noise. Our synthetic scenarios do neither: they come from a regime-switching stochastic-volatility simulator. A market moves through a small set of regimes (a calm uptrend, a directionless range, a grinding bear, an acute crisis), and each regime imposes a characteristic drift and volatility level. The path is not locked to one regime: it moves between them according to a transition matrix estimated from decades of real index history, so the sequence of calm and stressed stretches matches how real markets alternate.
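As a rough illustration of the regime-switching idea, the sketch below draws a daily price path from a small Markov chain of regimes. It is a minimal sketch, not the platform's simulator: the regime names follow the text, but every drift, volatility and transition probability is a made-up placeholder, and volatility is held constant within a regime for brevity.

```python
import math
import random

# Hypothetical regimes: (daily drift, daily volatility).
# Placeholder values for illustration, not calibrated parameters.
REGIMES = {
    "calm_uptrend": (0.0005, 0.007),
    "range": (0.0000, 0.010),
    "grinding_bear": (-0.0004, 0.013),
    "acute_crisis": (-0.0020, 0.035),
}
NAMES = list(REGIMES)

# Illustrative transition matrix: row = current regime, column = next regime.
# Each row sums to 1. A real matrix would be estimated from index history.
TRANSITIONS = [
    [0.97, 0.02, 0.008, 0.002],
    [0.03, 0.94, 0.025, 0.005],
    [0.01, 0.03, 0.95, 0.01],
    [0.02, 0.05, 0.08, 0.85],
]


def simulate_path(n_days, seed=0, start="calm_uptrend", price0=100.0):
    """Return (prices, regime_labels) for one simulated path."""
    rng = random.Random(seed)
    state = NAMES.index(start)
    price = price0
    prices, labels = [price], []
    for _ in range(n_days):
        drift, vol = REGIMES[NAMES[state]]
        log_return = drift + vol * rng.gauss(0.0, 1.0)
        price *= math.exp(log_return)
        prices.append(price)
        labels.append(NAMES[state])
        # Draw the next regime from the current regime's row.
        state = rng.choices(range(len(NAMES)), weights=TRANSITIONS[state])[0]
    return prices, labels


if __name__ == "__main__":
    prices, labels = simulate_path(750, seed=42)
    print(f"final price: {prices[-1]:.2f}")
    print({name: labels.count(name) for name in NAMES})
```

The large diagonal entries make regimes persistent, which is what produces long calm stretches interrupted by shorter stressed ones.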
Crucially, volatility carries its own memory: a turbulent day raises the odds that the next day is turbulent too. That is why large moves arrive in clusters rather than spread evenly, and this clustering is the single most universal feature of real returns. Each asset (equities, Treasuries, gold, crypto) is calibrated to its own historical volatility, tail thickness and regime mix, so a simulated path carries the character of that instrument rather than a generic default.
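One common way to give volatility a memory is to let log-variance follow an AR(1) process. The sketch below assumes that form purely for illustration; the function names, the `persistence` parameter and all numeric defaults are hypothetical and say nothing about how the simulator is actually specified.

```python
import math
import random


def simulate_sv_returns(n, persistence, vol_of_vol=0.25, base_vol=0.01, seed=0):
    """Returns whose log-variance follows an AR(1) process.

    persistence near 1 gives long volatility memory; 0 gives none.
    All defaults are illustrative placeholders.
    """
    rng = random.Random(seed)
    mean_log_var = 2.0 * math.log(base_vol)
    log_var = mean_log_var
    returns = []
    for _ in range(n):
        # Today's log-variance is pulled toward yesterday's level.
        log_var = (mean_log_var
                   + persistence * (log_var - mean_log_var)
                   + vol_of_vol * rng.gauss(0.0, 1.0))
        returns.append(math.exp(log_var / 2.0) * rng.gauss(0.0, 1.0))
    return returns


def lag1_abs_autocorr(returns):
    """Lag-1 autocorrelation of absolute returns (a simple clustering measure)."""
    x = [abs(r) for r in returns]
    m = sum(x) / len(x)
    num = sum((a - m) * (b - m) for a, b in zip(x, x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den


if __name__ == "__main__":
    persistent = simulate_sv_returns(5000, persistence=0.97, seed=1)
    memoryless = simulate_sv_returns(5000, persistence=0.0, seed=1)
    print("with memory:   ", round(lag1_abs_autocorr(persistent), 3))
    print("without memory:", round(lag1_abs_autocorr(memoryless), 3))
```

With high persistence the absolute returns are visibly autocorrelated; with persistence set to zero the autocorrelation sits near zero, which is the "spread evenly" case the text contrasts against.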
When several assets are simulated together for a portfolio, a shared stress factor couples them. In ordinary conditions their correlations stay loose; in a crisis they strengthen the way real ones do — so a stock-bond hedge that holds in calm markets can break when both legs fall together, exactly the failure a portfolio stress test exists to surface.
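The coupling described above can be pictured as a one-factor model in which each asset's loading on a shared stress factor depends on the regime. This is a minimal sketch under that assumption: the `LOADINGS` table and its values are invented for illustration, and shocks are standardised to unit variance so that the implied correlation is simply the product of the two loadings.

```python
import math
import random

# Hypothetical loadings on the shared stress factor, per regime.
LOADINGS = {
    "calm": {"stocks": 0.30, "bonds": -0.20},    # implied correlation -0.06
    "crisis": {"stocks": 0.90, "bonds": 0.60},   # implied correlation +0.54
}


def simulate_pair(regime, n, seed=0):
    """Return standardised (stock, bond) shocks for one regime."""
    rng = random.Random(seed)
    b_s = LOADINGS[regime]["stocks"]
    b_b = LOADINGS[regime]["bonds"]
    stocks, bonds = [], []
    for _ in range(n):
        stress = rng.gauss(0.0, 1.0)  # shared factor
        # The idiosyncratic part is scaled so each shock keeps unit variance.
        stocks.append(b_s * stress + math.sqrt(1 - b_s ** 2) * rng.gauss(0.0, 1.0))
        bonds.append(b_b * stress + math.sqrt(1 - b_b ** 2) * rng.gauss(0.0, 1.0))
    return stocks, bonds


def correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    for regime in ("calm", "crisis"):
        s, b = simulate_pair(regime, 5000, seed=7)
        print(regime, round(correlation(s, b), 3))
```

In the calm regime the two legs are close to uncorrelated; in the crisis regime both load heavily on the same factor and move together, which is the hedge breakdown the paragraph describes.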
Matching real market behaviour — stylized facts
Real markets share statistical regularities that hold across assets and decades, independent of any single crisis: volatility clusters; large drops are more common than equally large jumps; extreme moves happen far more often than a bell curve predicts; and volatility tends to rise as prices fall (Cont, 2001). These regularities arise from market structure — they would apply to a future crisis we haven't seen as much as to past ones. A stress scenario is only meaningful if it reproduces them. Ours are built to, as a property of the model's structure, and then measured against each asset's real history rather than asserted — that is what separates a plausible synthetic market from arbitrary noise. How closely they match, fact by fact, is published on the verifiability page.
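Three of the regularities listed above can be measured with simple sample statistics. The sketch below shows one plausible set of estimators (skewness for drop/jump asymmetry, excess kurtosis for fat tails, and a return-to-next-day-magnitude correlation for the leverage effect); the function names are illustrative, and the platform's published measurements may use different estimators.

```python
import math


def _central_moment(xs, k):
    m = sum(xs) / len(xs)
    return sum((x - m) ** k for x in xs) / len(xs)


def skewness(returns):
    """Negative when large drops outweigh equally large jumps."""
    return _central_moment(returns, 3) / _central_moment(returns, 2) ** 1.5


def excess_kurtosis(returns):
    """Zero for a bell curve; positive when extreme moves are over-represented."""
    return _central_moment(returns, 4) / _central_moment(returns, 2) ** 2 - 3.0


def leverage_correlation(returns):
    """Correlation between today's return and tomorrow's absolute return.

    Negative when volatility tends to rise after prices fall.
    """
    x = returns[:-1]
    y = [abs(r) for r in returns[1:]]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

Running the same three functions on a simulated path and on the asset's historical returns gives a like-for-like comparison, which is the spirit of measuring rather than asserting.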
What makes a scenario a valid stress test
Among these regularities, one is decisive and is enforced as a hard requirement: volatility clustering — the tendency of large moves to follow large moves, and of calm to follow calm. It is the most robust and universal property of financial returns (Cont, 2001), and its presence is what separates a market from random noise. A scenario whose volatility does not cluster — or clusters inversely — is not a demanding market but an artifact; a strategy evaluated against it would be tuned to exploit behaviour no real market exhibits. Every synthetic scenario published here must therefore reproduce positive volatility clustering, measured against the asset's own historical level. Scenarios that do not meet this criterion are not published — it is a gate, not a preference.
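A gate of this kind could be sketched as follows. This is an assumption-laden illustration, not the published criterion: it measures clustering as the average autocorrelation of absolute returns over a few short lags, and the `min_ratio` threshold, the lag set and the function names are all hypothetical.

```python
def abs_autocorr(returns, lag):
    """Autocorrelation of absolute returns at the given lag."""
    x = [abs(r) for r in returns]
    n = len(x)
    m = sum(x) / n
    den = sum((v - m) ** 2 for v in x)
    if den == 0.0:
        return 0.0  # constant magnitude: no clustering signal at all
    num = sum((x[t] - m) * (x[t + lag] - m) for t in range(n - lag))
    return num / den


def clustering_level(returns, lags=(1, 2, 3, 4, 5)):
    """Average absolute-return autocorrelation over the given lags."""
    return sum(abs_autocorr(returns, k) for k in lags) / len(lags)


def passes_clustering_gate(returns, historical_level, min_ratio=0.5):
    """Hypothetical gate: clustering must be positive and reach at least
    min_ratio of the asset's historical clustering level."""
    level = clustering_level(returns)
    return level > 0.0 and level >= min_ratio * historical_level


if __name__ == "__main__":
    # 50 turbulent days followed by 50 quiet days, repeated: clustered.
    clustered = ([0.02, -0.02] * 25 + [0.002, -0.002] * 25) * 10
    # Large and small moves strictly alternating: inverse clustering.
    inverse = [0.02, -0.002] * 500
    print(passes_clustering_gate(clustered, historical_level=0.2))
    print(passes_clustering_gate(inverse, historical_level=0.2))
```

The second series is the "clusters inversely" case: its magnitudes alternate, so the averaged autocorrelation is negative and the scenario would be rejected.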
The remaining properties — and how strongly each scenario expresses the failure mode it is named for — are reported as descriptive character, shown per scenario in the technical reference, rather than enforced as gates. Where reproducing realistic clustering and maximising a particular failure-mode intensity pull in different directions, clustering takes precedence: a price path a strategy genuinely responds to is worth more than one that fits a label more neatly.
How the backtest works
Your strategy is run over each scenario exactly as it would trade live: signals fire, positions open and close, costs apply. Where a single price bar is ambiguous — its high and low could be reached in either order within the bar — the engine resolves it with a documented, conservative rule (following Löw, Maier-Paape & Platen, 2015) rather than assuming the favourable order. The output is a behaviour record per scenario, not a single optimistic number.
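To make the ambiguity concrete, consider a long position with both a stop-loss and a profit target inside a single bar's range. The sketch below applies one common pessimistic convention (assume the stop was reached first); it is not a reproduction of the rule from Löw, Maier-Paape & Platen, and the `Bar` type, function name and reason labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float


def resolve_long_exit(bar, stop, target):
    """Return (exit_price, reason) for a long position, or None if it stays open."""
    # The open is the first price of the bar, so a gap through either level
    # is unambiguous and fills at the open.
    if bar.open <= stop:
        return (bar.open, "stop_gap")
    if bar.open >= target:
        return (bar.open, "target_gap")

    hit_stop = bar.low <= stop
    hit_target = bar.high >= target
    if hit_stop and hit_target:
        # Ambiguous: the bar does not say whether the high or the low came first.
        # Pessimistic assumption: the stop was reached first.
        return (stop, "stop_assumed_first")
    if hit_stop:
        return (stop, "stop")
    if hit_target:
        return (target, "target")
    return None


if __name__ == "__main__":
    ambiguous = Bar(open=100.0, high=105.0, low=95.0, close=101.0)
    print(resolve_long_exit(ambiguous, stop=96.0, target=104.0))
```

Assuming the favourable order instead would book the profit target on every ambiguous bar, which systematically flatters the result.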
Two kinds of scenario: empirical and synthetic
The catalog combines real historical episodes — empirical anchors like Lehman 2008, the Dotcom unwind, COVID March 2020, the 2022 crypto deleveraging — with simulator-generated synthetic stress probes for stress types that are rare or absent in a given asset's record (sustained slow declines, volatility expansions, whipsaw regimes, liquidity stress). History tells you how a strategy did in the crises that occurred; synthetic probes let you ask what would happen under plausible conditions that haven't occurred yet — which backtesting alone cannot give you.
What the robustness score means — and what it doesn't
The score summarises how consistently a strategy holds up across the scenario catalog: smaller drawdowns, fewer regime-dependent breakdowns, and more stable behaviour score higher. What it is not: a forecast of future returns, a buy/sell signal, a ranking against other strategies, or advice. A high score means a strategy degraded less across these defined scenarios — it does not mean it will be profitable, suitable for you, or safe in the next real crisis. The score is a lens on fragility, nothing more.
The failure-mode taxonomy
Every scenario is tagged with one or more failure-mode axes — the controlled vocabulary the platform uses to describe how a strategy degrades (V-Recovery is treated as a diagnostic path-pattern, leaving nine score buckets). Each axis has its own citation-ready definition.
Plausibility of the synthetic market data
Each synthetic scenario family carries an explicit status:
- validated: statistically consistent with that asset's historical behaviour on the tested properties.
- in development: mechanism identified, not yet fully validated.
- residual: a known, disclosed limit of the method.
The full validation results, per-scenario character, reference bands, and documented limits live in the technical reference and the verifiability page — including where the method stops being reliable.
Go deeper
Technical reference
Simulator architecture, the clustering gate, per-asset bands, the validated regime catalog, documented limits.
Verifiability
How the claims are checked — stylized facts vs reference bands, value vs a cheap baseline.
Capability declaration
Machine-readable schemas, controlled vocabularies, and the response envelope.
Operated under German jurisdiction (BaFin / WpHG framework). Model-based scenario simulation — descriptive, not advisory; scenario-based, not a forecast. Not investment advice.