Adaptive Walk-Forward Epoch Selection (AWFES)

Machine-readable reference for adaptive epoch selection within Walk-Forward Optimization (WFO). Optimizes training epochs per-fold using Walk-Forward Efficiency (WFE) as the objective.

When to Use This Skill

Use this skill when:

  • Selecting optimal training epochs for ML models in WFO
  • Avoiding overfitting via Walk-Forward Efficiency metrics
  • Implementing per-fold adaptive epoch selection
  • Computing efficient frontiers for epoch-performance trade-offs
  • Carrying epoch priors across WFO folds

Quick Start

from adaptive_wfo_epoch import AWFESConfig, compute_efficient_frontier

# Generate epoch candidates from search bounds and granularity
config = AWFESConfig.from_search_space(
    min_epoch=100,
    max_epoch=2000,
    granularity=5,  # Number of frontier points
)
# config.epoch_configs → [100, 211, 447, 945, 2000] (log-spaced)

# Per-fold epoch sweep (wfo_folds and train_and_evaluate are supplied by the caller)
for fold in wfo_folds:
    epoch_metrics = []
    for epoch in config.epoch_configs:
        is_sharpe, oos_sharpe = train_and_evaluate(fold, epochs=epoch)
        wfe = config.compute_wfe(is_sharpe, oos_sharpe, n_samples=len(fold.train))
        if wfe is None:
            continue  # IS_Sharpe below the noise floor: WFE is undefined for this epoch
        epoch_metrics.append({"epoch": epoch, "wfe": wfe, "is_sharpe": is_sharpe})

    # Select from efficient frontier
    selected_epoch = compute_efficient_frontier(epoch_metrics)

    # Carry forward to next fold as prior
    prior_epoch = selected_epoch

Methodology Overview

What This Is

Per-fold adaptive epoch selection where:

  1. Train models across a range of epochs (e.g., 400, 800, 1000, 2000)
  2. Compute WFE = OOS_Sharpe / IS_Sharpe for each epoch count
  3. Find the "efficient frontier" - epochs maximizing WFE vs training cost
  4. Select optimal epoch from frontier for OOS evaluation
  5. Carry forward as prior for next fold

What This Is NOT

  • NOT early stopping: Early stopping monitors validation loss continuously; this evaluates discrete candidates post-hoc
  • NOT Bayesian optimization: No surrogate model; direct evaluation of all candidates
  • NOT nested cross-validation: Uses temporal WFO, not shuffled splits

Academic Foundations

| Concept | Citation | Key Insight |
| --- | --- | --- |
| Walk-Forward Efficiency | Pardo (1992, 2008) | WFE = OOS_Return / IS_Return as robustness metric |
| Deflated Sharpe Ratio | Bailey & López de Prado (2014) | Adjusts for multiple testing |
| Pareto-Optimal HP Selection | Bischl et al. (2023) | Multi-objective hyperparameter optimization |
| Warm-Starting | Nomura & Ono (2021) | Transfer knowledge between optimization runs |

See references/academic-foundations.md for full literature review.

Core Formula: Walk-Forward Efficiency

def compute_wfe(
    is_sharpe: float,
    oos_sharpe: float,
    n_samples: int | None = None,
) -> float | None:
    """Walk-Forward Efficiency - measures performance transfer.

    WFE = OOS_Sharpe / IS_Sharpe

    Interpretation (guidelines, not hard thresholds):
    - WFE ≥ 0.70: Excellent transfer (low overfitting)
    - WFE 0.50-0.70: Good transfer
    - WFE 0.30-0.50: Moderate transfer (investigate)
    - WFE < 0.30: Severe overfitting (likely reject)

    The IS_Sharpe minimum is derived from signal-to-noise ratio,
    not a fixed magic number. See compute_is_sharpe_threshold().

    Reference: Pardo (2008) "The Evaluation and Optimization of Trading Strategies"
    """
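    # Data-driven threshold: IS_Sharpe must exceed the 2σ noise floor.
    # Without n_samples, fall back to a fixed floor of 0.1.
    min_is_sharpe = compute_is_sharpe_threshold(n_samples) if n_samples else 0.1

    # Near-zero IS_Sharpe makes the ratio unstable, so WFE is reported as undefined.
    # Note: abs() also admits a strongly negative IS_Sharpe, which flips the sign of WFE.
    if abs(is_sharpe) < min_is_sharpe:
        return None
    return oos_sharpe / is_sharpe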
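The helper that compute_wfe relies on is not shown in this file. The following is a minimal sketch, assuming the 2/sqrt(n) form stated under guardrail G2; the real implementation in references/configuration-framework.md may differ. classify_wfe is a hypothetical helper (not part of the skill) that maps a WFE value onto the guideline bands from the docstring.

```python
from __future__ import annotations

import math


def compute_is_sharpe_threshold(n_samples: int) -> float:
    """Noise-floor sketch: 2 / sqrt(n), the form stated in guardrail G2 (assumed)."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return 2.0 / math.sqrt(n_samples)


def classify_wfe(wfe: float | None) -> str:
    """Hypothetical helper: map a WFE value to the docstring's guideline bands."""
    if wfe is None:
        return "invalid"
    if wfe >= 0.70:
        return "excellent"
    if wfe >= 0.50:
        return "good"
    if wfe >= 0.30:
        return "moderate"
    return "severe overfitting"


# 400 training bars give a floor of 2 / 20 = 0.1
print(compute_is_sharpe_threshold(400))
# IS Sharpe 1.5 and OOS Sharpe 0.9 give WFE = 0.6
print(classify_wfe(0.9 / 1.5))
```

Because the floor shrinks as 1/sqrt(n), short folds need a visibly larger in-sample Sharpe before their WFE is treated as meaningful.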

Principled Configuration Framework

All parameters are derived from first principles or data characteristics. AWFESConfig provides unified configuration with log-spaced epoch generation, Bayesian variance derivation from search space, and market-specific annualization factors.

See references/configuration-framework.md for the full AWFESConfig class and compute_is_sharpe_threshold() implementation.
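As an illustration of the log-spaced epoch generation mentioned above, here is a minimal sketch. log_spaced_epochs is a hypothetical stand-in for what AWFESConfig.from_search_space() does internally; truncating interior points to integers is an assumption, chosen because it reproduces the candidate list shown in the Quick Start.

```python
def log_spaced_epochs(min_epoch: int, max_epoch: int, granularity: int) -> list[int]:
    """Return `granularity` epoch counts spaced evenly in log space (sketch)."""
    if granularity < 2:
        raise ValueError("granularity must be at least 2")
    if not 0 < min_epoch < max_epoch:
        raise ValueError("require 0 < min_epoch < max_epoch")
    ratio = max_epoch / min_epoch
    steps = granularity - 1
    # Interior points are truncated to integers (assumed convention);
    # the endpoints are pinned exactly to avoid floating-point drift.
    interior = [int(min_epoch * ratio ** (i / steps)) for i in range(1, steps)]
    return [min_epoch, *interior, max_epoch]


print(log_spaced_epochs(100, 2000, 5))  # [100, 211, 447, 945, 2000]
```

Log spacing puts more candidates at the cheap end of the range, where an extra epoch changes the model most.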

Guardrails (Principled Guidelines)

  • G1: WFE Thresholds - 0.30 (reject), 0.50 (warning), 0.70 (target) based on practitioner consensus
  • G2: IS_Sharpe Minimum - Data-driven threshold: 2/sqrt(n) adapts to sample size
  • G3: Stability Penalty - Adaptive threshold derived from WFE variance prevents epoch churn
  • G4: DSR Adjustment - Deflated Sharpe corrects for epoch selection multiplicity via Gumbel distribution

See references/guardrails.md for full implementations of all guardrails.
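To make G4 concrete, the sketch below follows the published Deflated Sharpe Ratio formulas of Bailey & López de Prado (2014): the expected maximum Sharpe among N trials, then the Probabilistic Sharpe Ratio evaluated against that benchmark. The function names and the choice to pass the cross-trial Sharpe variance explicitly are assumptions; the skill's own version is in references/guardrails.md. Sharpe values here are per-period, not annualized.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329
_STD_NORMAL = NormalDist()


def expected_max_sharpe(sharpe_variance: float, n_trials: int) -> float:
    """Expected maximum Sharpe across n_trials candidates with no true skill."""
    if n_trials < 2:
        raise ValueError("n_trials must be at least 2")
    q1 = _STD_NORMAL.inv_cdf(1.0 - 1.0 / n_trials)
    q2 = _STD_NORMAL.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    return math.sqrt(sharpe_variance) * ((1.0 - EULER_GAMMA) * q1 + EULER_GAMMA * q2)


def deflated_sharpe(
    sharpe: float,
    n_obs: int,
    n_trials: int,
    sharpe_variance: float,
    skew: float = 0.0,
    kurtosis: float = 3.0,
) -> float:
    """Probability that the true Sharpe exceeds the selection-bias benchmark."""
    benchmark = expected_max_sharpe(sharpe_variance, n_trials)
    denom = math.sqrt(1.0 - skew * sharpe + (kurtosis - 1.0) / 4.0 * sharpe**2)
    z = (sharpe - benchmark) * math.sqrt(n_obs - 1) / denom
    return _STD_NORMAL.cdf(z)


# The same observed Sharpe is less convincing when more epochs were tried.
few = deflated_sharpe(0.1, n_obs=1000, n_trials=5, sharpe_variance=0.001)
many = deflated_sharpe(0.1, n_obs=1000, n_trials=50, sharpe_variance=0.001)
print(few > many)
```

In the AWFES setting, n_trials would be the number of epoch candidates swept per fold, which is one reason to keep the candidate list short.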

WFE Aggregation Methods

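Under the null hypothesis of no predictive skill, IS and OOS Sharpe are both approximately zero-mean normal, so their ratio (WFE) is approximately Cauchy-distributed and has no defined mean. Avoid the unweighted arithmetic mean of per-fold WFE and prefer one of the following: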

  • Pooled WFE: Precision-weighted by sample size (best for variable fold sizes)
  • Median WFE: Robust to outliers (best for suspected regime changes)
  • Weighted Mean: Inverse-variance weighting (best for homogeneous folds)

See references/wfe-aggregation.md for implementations and selection guide.
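The effect of a single unstable fold can be illustrated with a small sketch. The pooled formula here (ratio of sample-size-weighted OOS and IS Sharpe sums) is one plausible reading of "precision-weighted by sample size" and is an assumption; the fold values are made up for illustration.

```python
from __future__ import annotations

from statistics import mean, median


def pooled_wfe(folds: list[tuple[float, float, int]]) -> float | None:
    """Pooled WFE sketch. Each fold is (is_sharpe, oos_sharpe, n_bars).

    Assumed form: sum(n * OOS) / sum(n * IS), so a fold with a tiny
    IS Sharpe cannot blow up the aggregate the way a per-fold ratio can.
    """
    numerator = sum(n * oos for _, oos, n in folds)
    denominator = sum(n * is_ for is_, _, n in folds)
    if denominator == 0:
        return None
    return numerator / denominator


# Illustrative folds: the third has a small IS Sharpe, so its WFE is 6.0
folds = [(1.2, 0.8, 500), (1.0, 0.5, 400), (0.15, 0.9, 300)]
per_fold = [oos / is_ for is_, oos, _ in folds]

print("mean  :", round(mean(per_fold), 3))    # dragged upward by the outlier fold
print("median:", round(median(per_fold), 3))  # stays near the typical fold
print("pooled:", round(pooled_wfe(folds), 3))
```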

Efficient Frontier Algorithm

Pareto-optimal epoch selection: an epoch is on the frontier if no other epoch dominates it (better WFE AND lower training time). The AdaptiveEpochSelector class maintains state across folds with adaptive stability penalties.

See references/efficient-frontier.md for the full algorithm and carry-forward mechanism.
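A minimal sketch of the frontier filter, assuming epoch count is used as the proxy for training time. It uses the standard Pareto dominance test (at least as good on both objectives and strictly better on one), which is slightly broader than the wording above. pareto_frontier is a hypothetical name; how a single epoch is then chosen from the frontier is left to references/efficient-frontier.md.

```python
def pareto_frontier(candidates: list[dict]) -> list[dict]:
    """Keep candidates that no other candidate dominates.

    A candidate is dominated if another has WFE at least as high and
    epoch count at least as low, with a strict improvement in one of them.
    """
    frontier = []
    for c in candidates:
        dominated = any(
            o["wfe"] >= c["wfe"]
            and o["epoch"] <= c["epoch"]
            and (o["wfe"] > c["wfe"] or o["epoch"] < c["epoch"])
            for o in candidates
        )
        if not dominated:
            frontier.append(c)
    return sorted(frontier, key=lambda c: c["epoch"])


# Illustrative sweep results (values are made up)
metrics = [
    {"epoch": 100, "wfe": 0.35},
    {"epoch": 211, "wfe": 0.55},
    {"epoch": 447, "wfe": 0.62},
    {"epoch": 945, "wfe": 0.60},   # dominated by 447: lower WFE, more epochs
    {"epoch": 2000, "wfe": 0.58},  # dominated by 447 as well
]
print([c["epoch"] for c in pareto_frontier(metrics)])  # [100, 211, 447]
```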

Anti-Patterns

| Anti-Pattern | Symptom | Fix | Severity |
| --- | --- | --- | --- |
| Expanding window (range bars) | Train size grows per fold | Use fixed sliding window | CRITICAL |
| Peak picking | Best epoch always at sweep boundary | Expand range, check for plateau | HIGH |
| Insufficient folds | effective_n < 30 | Increase folds or data span | HIGH |
| Ignoring temporal autocorr | Folds correlated | Use purged CV, gap between folds | HIGH |
| Overfitting to IS | IS >> OOS Sharpe | Reduce epochs, add regularization | HIGH |
| sqrt(252) for crypto | Inflated Sharpe | Use sqrt(365) or sqrt(7) weekly | MEDIUM |
| Single epoch selection | No uncertainty quantification | Report confidence interval | MEDIUM |
| Meta-overfitting | Epoch selection itself overfits | Limit to 3-4 candidates max | HIGH |

CRITICAL: Never use expanding window for range bar ML training. See references/anti-patterns.md for the full analysis (Section 7).
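For the CRITICAL row, a fixed sliding window can be sketched as below. sliding_window_folds and its parameters are hypothetical names, and the gap between train and test is an assumption standing in for the purging that the table recommends. An expanding window would instead keep every train range starting at index 0.

```python
from collections.abc import Iterator


def sliding_window_folds(
    n_bars: int, train_size: int, test_size: int, gap: int = 0
) -> Iterator[tuple[range, range]]:
    """Yield (train, test) index ranges with a constant train length."""
    start = 0
    while start + train_size + gap + test_size <= n_bars:
        train = range(start, start + train_size)
        test_start = train.stop + gap
        test = range(test_start, test_start + test_size)
        yield train, test
        start += test_size  # slide forward by one test block


folds = list(sliding_window_folds(n_bars=1000, train_size=400, test_size=100, gap=20))
for train, test in folds:
    print(f"train {train.start}-{train.stop}  test {test.start}-{test.stop}")
```

Every fold trains on exactly 400 bars, so WFE values stay comparable across folds instead of drifting with a growing sample.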

Decision Tree

See references/epoch-selection-decision-tree.md for the full practitioner decision tree.

Start
  ├─ IS_Sharpe > compute_is_sharpe_threshold(n)? ──NO──> Mark WFE invalid, use fallback
  │         │                                            (threshold = 2/√n, adapts to sample size)
  │        YES
  │         │
  ├─ Compute WFE for each epoch
  │         │
  ├─ Any WFE > 0.30? ──NO──> REJECT all epochs (severe overfit)
  │         │                (guideline, not hard threshold)
  │        YES
  │         │
  ├─ Compute efficient frontier
  │         │
  ├─ Apply AdaptiveStabilityPenalty
  │         │ (threshold derived from WFE variance)
  └─> Return selected epoch
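The tree can be read as a short function. This is a simplified sketch under several assumptions: the fallback is the prior epoch, the frontier step is replaced by a plain maximum-WFE pick, and the stability penalty is taken to be one population standard deviation of the candidate WFEs. None of these choices is confirmed by the source; select_epoch is a hypothetical name.

```python
from __future__ import annotations

import math
from statistics import pstdev

REJECT_WFE = 0.30  # guideline threshold from G1


def select_epoch(
    results: list[tuple[int, float, float]],
    n_samples: int,
    prior_epoch: int | None = None,
) -> int | None:
    """results holds (epoch, is_sharpe, oos_sharpe). Returns an epoch or None."""
    threshold = 2.0 / math.sqrt(n_samples)
    # Gate 1: WFE is only defined where IS Sharpe clears the noise floor.
    wfe = {e: oos / is_ for e, is_, oos in results if is_ > threshold}
    if not wfe:
        return prior_epoch  # assumed fallback
    # Gate 2: reject everything if no candidate clears the guideline.
    if max(wfe.values()) <= REJECT_WFE:
        return None
    best = max(wfe, key=wfe.get)  # stand-in for the efficient-frontier step
    # Stability: keep the prior epoch unless the gain exceeds the penalty.
    if prior_epoch in wfe:
        penalty = pstdev(wfe.values())  # assumed variance-derived penalty
        if wfe[best] - wfe[prior_epoch] <= penalty:
            return prior_epoch
    return best


sweep = [(100, 1.0, 0.2), (400, 1.2, 0.72), (800, 1.5, 0.75)]
print(select_epoch(sweep, n_samples=400))                   # no prior: best WFE wins
print(select_epoch(sweep, n_samples=400, prior_epoch=800))  # small gain: prior is kept
```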

Integration with rangebar-eval-metrics

This skill extends rangebar-eval-metrics:

| Metric Source | Used For | Reference |
| --- | --- | --- |
| sharpe_tw | WFE numerator (OOS) and denominator (IS) | range-bar-metrics.md |
| n_bars | Sample size for aggregation weights | metrics-schema.md |
| psr, dsr | Final acceptance criteria | sharpe-formulas.md |
| prediction_autocorr | Validate model isn't collapsed | ml-prediction-quality.md |
| is_collapsed | Model health check | ml-prediction-quality.md |
| Extended risk metrics | Deep risk analysis (optional) | risk-metrics.md |

Recommended Workflow

  1. Compute base metrics using rangebar-eval-metrics:compute_metrics.py
  2. Feed to AWFES for epoch selection with sharpe_tw as primary signal
  3. Validate with psr > 0.85 and dsr > 0.50 before deployment
  4. Monitor is_collapsed and prediction_autocorr for model health

OOS Application Phase

AWFES uses Nested WFO with three data splits per fold (Train 60% / Val 20% / Test 20%) and a 6% embargo gap at each split boundary. The per-fold workflow: epoch sweep on train, WFE computation on validation, final model training on train+val, evaluation on test, and then the Bayesian update (after test evaluation, per the look-ahead rule below).

See references/oos-workflow.md for the complete workflow with diagrams, BayesianEpochSelector class, and apply_awfes_to_test() implementation. Also see references/oos-application.md for the extended reference.
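One way to lay out the three splits is sketched below. The source does not say how the 60/20/20 split and the 6% embargo combine, so this assumes the embargo is 6% of the fold's bars at each of the two boundaries and that the percentages apply to the bars that remain. nested_split is a hypothetical name.

```python
def nested_split(
    n_bars: int,
    embargo_frac: float = 0.06,
    fracs: tuple[float, float, float] = (0.6, 0.2, 0.2),
) -> tuple[range, range, range]:
    """Return (train, val, test) index ranges separated by embargo gaps."""
    gap = round(n_bars * embargo_frac)
    usable = n_bars - 2 * gap          # bars left after both embargo gaps
    n_train = round(usable * fracs[0])
    n_val = round(usable * fracs[1])
    train = range(0, n_train)
    val = range(train.stop + gap, train.stop + gap + n_val)
    test = range(val.stop + gap, n_bars)  # test takes whatever remains
    return train, val, test


train, val, test = nested_split(1000)
print(train, val, test)
```

The bars inside each gap are discarded, so no validation or test bar is adjacent to a bar the model was fitted on.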

Epoch Smoothing Methods

Bayesian updating (recommended) provides principled, uncertainty-aware smoothing. Alternatives include EMA and SMA. Initialization via AWFESConfig.from_search_space() derives variances from the epoch range automatically.

See references/epoch-smoothing-methods.md for all methods, formulas, and initialization strategies. See references/epoch-smoothing.md for extended mathematical analysis.
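For intuition, the two simplest update rules can be sketched as follows. The Bayesian version assumes a Gaussian prior over the epoch and Gaussian observation noise (the conjugate normal-normal update); whether BayesianEpochSelector uses exactly this form is not stated here. The function names and the numbers in the example are illustrative only.

```python
def bayesian_update(
    prior_mean: float, prior_var: float, observed_epoch: float, obs_var: float
) -> tuple[float, float]:
    """Normal-normal conjugate update: returns (posterior_mean, posterior_var)."""
    precision = 1.0 / prior_var + 1.0 / obs_var
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_var + observed_epoch / obs_var)
    return post_mean, post_var


def ema_update(prior_epoch: float, observed_epoch: float, alpha: float = 0.3) -> float:
    """Exponential moving average: alpha weights the newest observation."""
    return alpha * observed_epoch + (1.0 - alpha) * prior_epoch


# Equal prior and observation variance: the posterior lands halfway
# between the two and the uncertainty is halved.
print(bayesian_update(500.0, 250_000.0, 900.0, 250_000.0))
print(ema_update(500.0, 900.0))
```

Unlike EMA, the Bayesian update shrinks its variance with each fold, so later observations move the estimate less unless the observation variance is small.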

OOS Metrics Specification

Three-tier metric hierarchy for test evaluation:

  • Tier 1 (Primary): sharpe_tw, hit_rate, cumulative_pnl, positive_sharpe_folds, wfe_test
  • Tier 2 (Risk): max_drawdown, calmar_ratio, profit_factor, cvar_10pct
  • Tier 3 (Statistical): psr, dsr, binomial_pvalue, hac_ttest_pvalue

See references/oos-metrics-implementation.md for full metric tables, compute_oos_metrics(), and fold aggregation code. See references/oos-metrics.md for threshold justifications.
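A few of the Tier 1 and Tier 2 metrics can be sketched from a list of per-bar PnL values. The definitions below are common conventions and are assumptions here (drawdown in absolute PnL units, CVaR as the mean of the worst 10% of bars); compute_oos_metrics() in the reference file is the authoritative version.

```python
import math


def hit_rate(pnl: list[float]) -> float:
    """Fraction of bars with strictly positive PnL."""
    return sum(p > 0 for p in pnl) / len(pnl)


def max_drawdown(pnl: list[float]) -> float:
    """Largest peak-to-trough drop of the cumulative PnL curve."""
    equity = peak = worst = 0.0
    for p in pnl:
        equity += p
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def profit_factor(pnl: list[float]) -> float:
    """Gross gains divided by gross losses (infinite if there are no losses)."""
    gains = sum(p for p in pnl if p > 0)
    losses = -sum(p for p in pnl if p < 0)
    return math.inf if losses == 0 else gains / losses


def cvar(pnl: list[float], alpha: float = 0.10) -> float:
    """Mean PnL of the worst alpha fraction of bars (at least one bar)."""
    k = max(1, int(alpha * len(pnl)))
    tail = sorted(pnl)[:k]
    return sum(tail) / k


pnl = [1.0, -0.5, 2.0, -1.5, 0.5, 1.0, -2.0, 1.5, 0.5, -0.5]
print(hit_rate(pnl), max_drawdown(pnl), round(profit_factor(pnl), 3), cvar(pnl))
```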

Look-Ahead Bias Prevention

CRITICAL (v3 fix): TEST must use prior_bayesian_epoch (from prior folds only), NOT val_optimal_epoch. The Bayesian update happens AFTER test evaluation, ensuring information flows only from past to present.

See references/look-ahead-bias-v3.md for the v3 fix details, embargo requirements, validation checklist, and anti-patterns. See references/look-ahead-bias.md for detailed examples.
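The ordering rule can be shown with a toy loop. This sketch uses an EMA as a stand-in for the Bayesian update and a hypothetical run_folds function; the only point is the order of operations, where the epoch used on TEST is fixed before the current fold's validation result is folded into the prior.

```python
def run_folds(
    val_optimal_epochs: list[int], initial_epoch: int, alpha: float = 0.5
) -> list[int]:
    """Return the epoch used for TEST in each fold."""
    prior = float(initial_epoch)
    used_for_test = []
    for val_optimal_epoch in val_optimal_epochs:
        # 1. TEST uses the prior, which depends on earlier folds only.
        used_for_test.append(round(prior))
        # 2. Only after test evaluation is the prior updated (EMA stand-in).
        prior = alpha * val_optimal_epoch + (1.0 - alpha) * prior
    return used_for_test


print(run_folds([400, 800, 600], initial_epoch=500))  # [500, 450, 625]
```

A quick check for leakage follows from this: changing the validation-optimal epoch of fold k must leave the epoch used on fold k's test split unchanged.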


References

| Topic | Reference File |
| --- | --- |
| Academic Literature | academic-foundations.md |
| Mathematical Formulation | mathematical-formulation.md |
| Configuration Framework | configuration-framework.md |
| Guardrails | guardrails.md |
| WFE Aggregation | wfe-aggregation.md |
| Efficient Frontier | efficient-frontier.md |
| Decision Tree | epoch-selection-decision-tree.md |
| Anti-Patterns | anti-patterns.md |
| OOS Workflow | oos-workflow.md |
| OOS Application | oos-application.md |
| Epoch Smoothing Methods | epoch-smoothing-methods.md |
| Epoch Smoothing Analysis | epoch-smoothing.md |
| OOS Metrics Impl | oos-metrics-implementation.md |
| OOS Metrics Thresholds | oos-metrics.md |
| Look-Ahead Bias (v3) | look-ahead-bias-v3.md |
| Look-Ahead Bias Examples | look-ahead-bias.md |
| Feature Sets | feature-sets.md |
| xLSTM Implementation | xlstm-implementation.md |
| Range Bar Metrics | range-bar-metrics.md |
| Troubleshooting | troubleshooting.md |

Full Citations

  • Bailey, D. H., & López de Prado, M. (2014). The deflated Sharpe ratio: Correcting for selection bias, backtest overfitting and non-normality. The Journal of Portfolio Management, 40(5), 94-107.
  • Bischl, B., et al. (2023). Multi-Objective Hyperparameter Optimization in Machine Learning. ACM Transactions on Evolutionary Learning and Optimization.
  • López de Prado, M. (2018). Advances in Financial Machine Learning. Wiley. Chapter 7.
  • Nomura, M., & Ono, I. (2021). Warm Starting CMA-ES for Hyperparameter Optimization. AAAI Conference on Artificial Intelligence.
  • Pardo, R. E. (2008). The Evaluation and Optimization of Trading Strategies, 2nd Edition. John Wiley & Sons.