Question

How are trading strategies stress-tested on CrashTestYourStrategy?

Methodology gateway: case catalog, failure-mode taxonomy, V2 scoring with a-priori augmentation.

Short answer

Strategies are evaluated against a curated case set: 31 empirical anchors (Lehman 2008, Dotcom 2000, COVID 2020, Luna 2022, and others) and 32 synthetic stress probes. Each probe is run as 50 Monte-Carlo replicas generated by a Hybrid Field-SME agent-based simulator with per-asset CoT calibration. Every replica is classified ex post into a failure-mode (FM) bucket using operational gating definitions; sparse buckets are augmented from a-priori tags (Phase 18d). The composite robustness score (0-100) is the equal-weight mean, with shrinkage, over the populated FM buckets. The catalog currently holds 13 pre-computed reference strategies.
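The composite described above can be illustrated with a minimal Python sketch. It is not the code in robustness_calculator.py. The bucket names, the 0-100 per-replica scores, the prior mean of 50, and the prior strength of 10 are all assumptions made for illustration, and the shrinkage form (a simple pull toward a prior mean, stronger for buckets with fewer replicas) is one common choice, not necessarily the one V2 uses.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass
class BucketResult:
    name: str      # failure-mode bucket label (hypothetical names below)
    scores: list   # per-replica scores, assumed to be on a 0-100 scale


def shrunk_bucket_mean(scores, prior_mean=50.0, prior_strength=10.0):
    """Pull a bucket's sample mean toward prior_mean.

    prior_strength acts like a count of pseudo-observations, so buckets
    with few replicas are shrunk more than well-populated ones.
    Both parameters are illustrative assumptions.
    """
    n = len(scores)
    return (sum(scores) + prior_strength * prior_mean) / (n + prior_strength)


def composite_score(buckets, prior_mean=50.0, prior_strength=10.0):
    """Equal-weight mean of shrunk bucket means over populated buckets only."""
    populated = [b for b in buckets if b.scores]
    if not populated:
        raise ValueError("no populated failure-mode buckets")
    per_bucket = [
        shrunk_bucket_mean(b.scores, prior_mean, prior_strength)
        for b in populated
    ]
    return fmean(per_bucket)


# Hypothetical example: one sparse bucket, one dense bucket, one empty bucket.
example = [
    BucketResult("liquidity_crunch", [80.0, 70.0]),
    BucketResult("trend_reversal", [60.0] * 50),
    BucketResult("gap_risk", []),  # unpopulated, so it is skipped
]
score = composite_score(example)
```

Under these assumptions the sparse bucket (raw mean 75) is pulled much closer to 50 than the dense bucket (raw mean 60), and each populated bucket contributes equally to the final number regardless of how many replicas it holds.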

Pre-computed catalog strategies

Strategy | Asset | Score
RSI with SMA 200 Trend Filter | SPY | 76 / 100
MACD Signal Line Crossover | SPY | 60 / 100
Supertrend Strategy | SPY | 60 / 100
Bollinger Band Bounce | SPY | 60 / 100
Buy and Hold S&P 500 | SPY | 59 / 100
EMA 12/26 Crossover (BTC) | BTC | 55 / 100
EMA 12/26 Crossover | SPY | 54 / 100
Buy and Hold Bitcoin | BTC | 53 / 100
SMA 200 Trend Following (GOLD) | GOLD | 51 / 100
RSI 30/70 Mean Reversion | SPY | 51 / 100
RSI 30/70 Mean Reversion (BTC) | BTC | 51 / 100
SMA 200 Crossover | SPY | 51 / 100
MACD + RSI Confirmation (BTC) | BTC | 50 / 100

For the full taxonomy, see /ontology. For the machine-readable schema, see /interop. The V2 score path is calculate_robustness_score_v2 in robustness_calculator.py.

Data comes from the curated case set: 31 empirical anchors and 32 synthetic stress probes across 9 score buckets, scored with V2 per-FM-bucket scoring and a-priori augmentation. See /methodology.

Programmatic access: POST /api/v1/agent/analyze · GET /api/v1/catalog/query · see /interop.
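As a rough illustration of the two endpoints above, the following Python sketch builds the requests without sending them. Only the paths and HTTP methods come from this page. The base URL is a placeholder, and the query parameter and JSON field names (asset, strategy) are hypothetical; the actual request schema is the one documented at /interop.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder host: substitute the real site origin.
BASE_URL = "https://example.invalid"


def build_catalog_query(params):
    """Build a GET request for /api/v1/catalog/query.

    The parameter names passed in are hypothetical; check /interop
    for the fields the service actually accepts.
    """
    query = urlencode(params)
    url = f"{BASE_URL}/api/v1/catalog/query"
    if query:
        url = f"{url}?{query}"
    return Request(url, method="GET")


def build_analyze_request(payload):
    """Build a POST request for /api/v1/agent/analyze with a JSON body.

    The payload shape is an assumption made for illustration only.
    """
    body = json.dumps(payload).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/v1/agent/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical usage: neither request is sent here.
catalog_req = build_catalog_query({"asset": "SPY"})
analyze_req = build_analyze_request(
    {"strategy": "EMA 12/26 Crossover", "asset": "SPY"}
)
```

To send either request, pass it to urllib.request.urlopen once BASE_URL points at the real host and the fields match the published schema.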