AI Leaderboard — which engine picks best, per asset class
Per-AI pick attribution: every swarm pick is fanned out to the engine(s) that
produced it (models_consulted[].underlying_model), joined to
its realized outcome, and ranked. Click an engine row to drill into its
per-asset-class and per-horizon breakdown.
Built by tools/ai_attribution/build_ai_leaderboard.py;
design: reports/design_per_ai_pick_attribution_2026-05-15.md.
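
The fan-out, join, and rank steps described above can be sketched roughly as follows. This is a minimal illustration, not the code in build_ai_leaderboard.py: only models_consulted[].underlying_model comes from this page, while pick_id, asset_class, the outcomes mapping, and ranking by Profit Factor are assumptions made for the example.

```python
from collections import defaultdict


def build_leaderboard(picks, outcomes):
    """Fan each pick out to its engines, join to its outcome, rank per asset class.

    picks:    list of dicts with hypothetical keys pick_id, asset_class,
              and models_consulted (list of {"underlying_model": str}).
    outcomes: hypothetical dict mapping pick_id -> realized return (float).
    """
    stats = defaultdict(lambda: {"n": 0, "wins": 0, "gross_win": 0.0, "gross_loss": 0.0})

    for pick in picks:
        ret = outcomes.get(pick["pick_id"])
        if ret is None:
            continue  # unresolved pick: nothing to attribute yet

        # A set, so an engine consulted twice on one pick is credited once.
        engines = {m["underlying_model"] for m in pick["models_consulted"]}
        for engine in engines:
            s = stats[(engine, pick["asset_class"])]
            s["n"] += 1
            if ret > 0:
                s["wins"] += 1
                s["gross_win"] += ret
            else:
                s["gross_loss"] += -ret

    rows = []
    for (engine, asset_class), s in stats.items():
        # Profit Factor = gross wins / gross losses (infinite if no losses).
        pf = s["gross_win"] / s["gross_loss"] if s["gross_loss"] > 0 else float("inf")
        rows.append({
            "engine": engine,
            "asset_class": asset_class,
            "n": s["n"],
            "win_rate": s["wins"] / s["n"],
            "profit_factor": pf,
        })

    # Best Profit Factor first within each asset class.
    rows.sort(key=lambda r: (r["asset_class"], -r["profit_factor"]))
    return rows
```

A per-horizon breakdown would follow the same pattern with the horizon added to the grouping key.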
⚠ Monte-Carlo verdict (2026-05-29): the high Profit Factors below reflect the return shape of trend-following, not an edge. An independent MC test (tools/backtest_ma_trend_montecarlo.py, yfinance, 6 years of data, no lookahead) found that the 200-day MA underperforms buy-and-hold on a risk-adjusted basis in all 7 asset classes (lower Sharpe in every one) and fails the timing test: being long on above-MA days is no better than being long on the same number of randomly chosen days (random-day-selection null p = 0.28–0.67, all well above 0.05). The only genuine benefit is a lower max drawdown. Treat MA-trend as a defensive / regime overlay, NOT an alpha source; the PF column is the natural high-PF/low-WR signature of trend-following (few big winners), not proof of an edge. Full report: reports/ma_trend_montecarlo_verdict_2026-05-29.md. NFA.
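
For readers unfamiliar with the random-day-selection null, the idea can be sketched as below. This is an illustrative sketch under stated assumptions, not the code in backtest_ma_trend_montecarlo.py: the function names, the number of draws, and the use of mean daily return as the test statistic are all hypothetical choices.

```python
import random


def ma_signal(prices, window=200):
    """Return a long/flat flag per day with no lookahead.

    signal[t] is True when the previous close, prices[t-1], is above the
    average of the `window` closes before day t, so day t's own price is
    never used to decide day t's position.
    """
    signal = [False] * len(prices)
    for t in range(window, len(prices)):
        moving_avg = sum(prices[t - window:t]) / window
        signal[t] = prices[t - 1] > moving_avg
    return signal


def random_day_null_pvalue(returns, in_market, n_draws=2000, seed=0):
    """One-sided p-value for 'the long days were better than random days'.

    Compares the mean return on the days the signal was long against the
    mean return on the same number of days drawn at random.
    """
    long_days = [r for r, flag in zip(returns, in_market) if flag]
    k = len(long_days)
    if k == 0:
        raise ValueError("signal was never long; nothing to test")
    observed = sum(long_days) / k

    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        sample = rng.sample(returns, k)  # k distinct days, chosen at random
        if sum(sample) / k >= observed:
            hits += 1

    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_draws + 1)
```

A large p-value, such as the 0.28–0.67 range reported above, means random days did as well as the signal's days often enough that the timing cannot be distinguished from chance.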
⛔ SYNTHETIC DATA NOTE (2026-06-06): 1,636 of ~7,100 tournament picks are SYNTHETIC_SEED_ENRICHED (machine-generated by populate_picks.py, not real AI predictions). cursor_agent: resolved cohort is 100% synthetic. llama4_scout: 43% synthetic. Models with 0% synthetic: grok3 (WR=67%, n=52 real) and together_qwen_3 (WR=50%, n=24, below the n≥30 threshold). Treat WRs for heavily synthetic models as artifacts, not forward-test results. Only grok3 currently qualifies as a real-money model candidate (0% synthetic, n≥30 resolved).
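
The eligibility rule in this note (0% synthetic and n≥30 resolved picks) could be checked with something like the sketch below. The SYNTHETIC_SEED_ENRICHED tag is taken from this page; the record keys model and provenance, and the function name, are assumptions made for illustration.

```python
from collections import Counter

SYNTHETIC_TAG = "SYNTHETIC_SEED_ENRICHED"


def real_money_candidates(resolved_picks, min_n=30):
    """Summarize synthetic share per model and flag eligible models.

    resolved_picks: list of dicts with hypothetical keys
                    model (str) and provenance (str).
    A model is eligible only with zero synthetic picks and n >= min_n.
    """
    total = Counter()
    synthetic = Counter()
    for pick in resolved_picks:
        total[pick["model"]] += 1
        if pick["provenance"] == SYNTHETIC_TAG:
            synthetic[pick["model"]] += 1

    report = {}
    for model, n in total.items():
        share = synthetic[model] / n
        report[model] = {
            "n": n,
            "synthetic_share": share,
            "eligible": synthetic[model] == 0 and n >= min_n,
        }
    return report
```

Under this rule a model with no synthetic picks but fewer than 30 resolved picks is still excluded, which matches the note's conclusion about together_qwen_3.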
Disclaimer: This is NOT financial advice. All trading signals, picks, scores, and analysis are for educational and research purposes only. Past performance does not guarantee future results. Trading cryptocurrencies involves substantial risk of loss. Always do your own research (DYOR) before making any investment decisions.