VectorBT Backtesting Expert Skill

Environment

Python with vectorbt, pandas, numpy, plotly
Data sources: OpenAlgo (Indian markets), DuckDB (direct database), yfinance (US/Global), CCXT (Crypto), custom providers
DuckDB support: both custom DuckDB schemas and the OpenAlgo Historify format
API keys loaded from single root .env via python-dotenv + find_dotenv() — never hardcode keys
Technical indicators: TA-Lib (ALWAYS - never use VectorBT built-in indicators)
Specialty indicators: openalgo.ta for Supertrend, Donchian, Ichimoku, HMA, KAMA, ALMA, ZLEMA, VWMA
Signal cleaning: openalgo.ta for exrem, crossover, crossunder, flip
Fee model: Indian market standard (STT + statutory charges + Rs 20/order)
Benchmark: NIFTY 50 via OpenAlgo (NSE_INDEX) by default
Charts: Plotly with template="plotly_dark"
.env discovery: find_dotenv() walks up from the script directory, so scripts in nested strategy folders still resolve the single .env at the project root
Scripts go in backtesting/{strategy_name}/ directories (created on-demand, not pre-created)
Never use icons/emojis in code or logger output
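
The fee model above combines a percentage charge on order value with a flat per-order charge. As a rough illustration of that arithmetic, the sketch below uses the delivery-equity values from the standard template (0.00111 and Rs 20). The helper name order_cost and the order values are hypothetical, and actual charges vary by segment.

```python
FEES = 0.00111     # percentage charges as a fraction of order value
FIXED_FEES = 20    # flat brokerage per order, in rupees


def order_cost(order_value, fees=FEES, fixed_fees=FIXED_FEES):
    """Estimated cost of one order: proportional charges plus flat brokerage."""
    return order_value * fees + fixed_fees


buy_cost = order_cost(750_000)    # hypothetical entry worth Rs 7.5 lakh
sell_cost = order_cost(780_000)   # hypothetical exit worth Rs 7.8 lakh
round_trip = buy_cost + sell_cost

print(f"Buy: Rs {buy_cost:,.2f}  Sell: Rs {sell_cost:,.2f}  "
      f"Round trip: Rs {round_trip:,.2f}")
```

For orders of this size the percentage component dominates. For small orders the flat Rs 20 becomes the larger share of the cost, which is why both components are modeled.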

Critical Rules

ALWAYS use TA-Lib for ALL technical indicators (EMA, SMA, RSI, MACD, BBANDS, ATR, ADX, STDDEV, MOM). NEVER use vbt.MA.run(), vbt.RSI.run(), or any VectorBT built-in indicator.
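Use OpenAlgo ta for the specialty indicators: Supertrend, Donchian, Ichimoku, HMA, KAMA, ALMA, ZLEMA, VWMA. Most of these are not available in TA-Lib; TA-Lib does ship KAMA, but this skill sources it from openalgo.ta.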
Use OpenAlgo ta for signal utilities: ta.exrem(), ta.crossover(), ta.crossunder(), ta.flip(). If openalgo.ta is not importable (standalone DuckDB), use inline exrem() fallback. See duckdb-data.
Always clean signals with ta.exrem() after generating raw buy/sell signals. Always .fillna(False) before exrem.
Market-specific fees: India (indian-market-costs), US (us-market-costs), Crypto (crypto-market-costs). Auto-select based on user's market.
Default benchmarks: India=NIFTY via OpenAlgo, US=S&P 500 (^GSPC), Crypto=Bitcoin (BTC-USD). See data-fetching Market Selection Guide.
Always produce a Strategy vs Benchmark comparison table after every backtest.
Always explain the backtest report in plain language so that non-expert traders can understand the strategy's risks and strengths.
Plotly candlestick charts must use xaxis type="category" to avoid weekend gaps.
Whole shares: Always set min_size=1, size_granularity=1 for equities.
DuckDB data loading: When user provides a DuckDB path, load data directly using duckdb.connect() with read_only=True. Auto-detect format: OpenAlgo Historify (table market_data, epoch timestamps) vs custom (table ohlcv, date+time columns). See duckdb-data.
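
The auto-detect rule above can be reduced to a check on table names. The following is a minimal sketch of that check, not the skill's actual implementation: the function name detect_duckdb_format and the extra table name in the example are hypothetical, and only the two table names (market_data and ohlcv) come from the rule itself.

```python
def detect_duckdb_format(table_names):
    """Classify a DuckDB file by the tables it contains.

    Returns "historify" when a market_data table is present,
    "custom" when an ohlcv table is present, and raises otherwise.
    """
    names = {str(name).lower() for name in table_names}
    if "market_data" in names:
        return "historify"
    if "ohlcv" in names:
        return "custom"
    raise ValueError(f"Unrecognized DuckDB layout; tables found: {sorted(names)}")


# With a live read-only connection, the names could be collected with, e.g.:
#   tables = [row[0] for row in con.execute("SHOW TABLES").fetchall()]
print(detect_duckdb_format(["market_data", "watchlist"]))  # hypothetical tables
print(detect_duckdb_format(["ohlcv"]))
```

Raising on an unknown layout, rather than guessing, keeps a mismatched database from silently producing an empty or misaligned price series.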

Modular Rule Files

Detailed reference for each topic is in rules/:

Rule File	Topic
data-fetching	OpenAlgo (India), yfinance (US), CCXT (Crypto), custom providers, .env setup
simulation-modes	from_signals, from_orders, from_holding, direction types
position-sizing	Amount/Value/Percent/TargetPercent sizing
indicators-signals	TA-Lib indicator reference, signal generation
openalgo-ta-helpers	OpenAlgo ta: exrem, crossover, Supertrend, Donchian, Ichimoku, MAs
stop-loss-take-profit	Fixed SL, TP, trailing stop
parameter-optimization	Broadcasting and loop-based optimization
performance-analysis	Stats, metrics, benchmark comparison, CAGR
plotting	Candlestick (category x-axis), VectorBT plots, custom Plotly
indian-market-costs	Indian market fee model by segment
us-market-costs	US market fee model (stocks, options, futures)
crypto-market-costs	Crypto fee model (spot, USDT-M, COIN-M futures)
futures-backtesting	Lot sizes (SEBI revised Dec 2025), value sizing
long-short-trading	Simultaneous long/short, direction comparison
duckdb-data	DuckDB direct loading, Historify format, auto-detect, resampling, multi-symbol
csv-data-resampling	Loading CSV, resampling with Indian market alignment
walk-forward	Walk-forward analysis, WFE ratio
robustness-testing	Monte Carlo, noise test, parameter sensitivity, delay test
pitfalls	Common mistakes and checklist before going live
strategy-catalog	Strategy reference with code snippets
quantstats-tearsheet	QuantStats HTML reports, metrics, plots, Monte Carlo

Strategy Templates (in rules/assets/)

Production-ready scripts with realistic fees, NIFTY benchmark, comparison table, and plain-language report:

Template	Path	Description
EMA Crossover	`assets/ema_crossover/backtest.py`	EMA 10/20 crossover
RSI	`assets/rsi/backtest.py`	RSI(14) oversold/overbought
Donchian	`assets/donchian/backtest.py`	Donchian channel breakout
Supertrend	`assets/supertrend/backtest.py`	Supertrend with intraday sessions
MACD	`assets/macd/backtest.py`	MACD signal-candle breakout
SDA2	`assets/sda2/backtest.py`	SDA2 trend following
Momentum	`assets/momentum/backtest.py`	Double momentum (MOM + MOM-of-MOM)
Dual Momentum	`assets/dual_momentum/backtest.py`	Quarterly ETF rotation
Buy & Hold	`assets/buy_hold/backtest.py`	Static multi-asset allocation
RSI Accumulation	`assets/rsi_accumulation/backtest.py`	Weekly RSI slab-wise accumulation
Walk-Forward	`assets/walk_forward/template.py`	Walk-forward analysis template
Realistic Costs	`assets/realistic_costs/template.py`	Transaction cost impact comparison

Quick Template: Standard Backtest Script

import os
from datetime import datetime, timedelta
from pathlib import Path

import numpy as np
import pandas as pd
import talib as tl
import vectorbt as vbt
from dotenv import find_dotenv, load_dotenv
from openalgo import api, ta

# --- Config ---
script_dir = Path(__file__).resolve().parent
load_dotenv(find_dotenv(), override=False)

SYMBOL = "SBIN"
EXCHANGE = "NSE"
INTERVAL = "D"
INIT_CASH = 1_000_000
FEES = 0.00111              # Indian delivery equity (STT + statutory)
FIXED_FEES = 20             # Rs 20 per order
ALLOCATION = 0.75
BENCHMARK_SYMBOL = "NIFTY"
BENCHMARK_EXCHANGE = "NSE_INDEX"

# --- Fetch Data ---
client = api(
    api_key=os.getenv("OPENALGO_API_KEY"),
    host=os.getenv("OPENALGO_HOST", "http://127.0.0.1:5000"),
)

end_date = datetime.now().date()
start_date = end_date - timedelta(days=365 * 3)

df = client.history(
    symbol=SYMBOL, exchange=EXCHANGE, interval=INTERVAL,
    start_date=start_date.strftime("%Y-%m-%d"),
    end_date=end_date.strftime("%Y-%m-%d"),
)
if "timestamp" in df.columns:
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    df = df.set_index("timestamp")
else:
    df.index = pd.to_datetime(df.index)
df = df.sort_index()
if df.index.tz is not None:
    df.index = df.index.tz_convert(None)

close = df["close"]

# --- Strategy: EMA Crossover (TA-Lib) ---
ema_fast = pd.Series(tl.EMA(close.values, timeperiod=10), index=close.index)
ema_slow = pd.Series(tl.EMA(close.values, timeperiod=20), index=close.index)

buy_raw = (ema_fast > ema_slow) & (ema_fast.shift(1) <= ema_slow.shift(1))
sell_raw = (ema_fast < ema_slow) & (ema_fast.shift(1) >= ema_slow.shift(1))

entries = ta.exrem(buy_raw.fillna(False), sell_raw.fillna(False))
exits = ta.exrem(sell_raw.fillna(False), buy_raw.fillna(False))

# --- Backtest ---
pf = vbt.Portfolio.from_signals(
    close, entries, exits,
    init_cash=INIT_CASH, size=ALLOCATION, size_type="percent",
    fees=FEES, fixed_fees=FIXED_FEES, direction="longonly",
    min_size=1, size_granularity=1, freq="1D",
)

# --- Benchmark ---
df_bench = client.history(
    symbol=BENCHMARK_SYMBOL, exchange=BENCHMARK_EXCHANGE, interval=INTERVAL,
    start_date=start_date.strftime("%Y-%m-%d"),
    end_date=end_date.strftime("%Y-%m-%d"),
)
if "timestamp" in df_bench.columns:
    df_bench["timestamp"] = pd.to_datetime(df_bench["timestamp"])
    df_bench = df_bench.set_index("timestamp")
else:
    df_bench.index = pd.to_datetime(df_bench.index)
df_bench = df_bench.sort_index()
if df_bench.index.tz is not None:
    df_bench.index = df_bench.index.tz_convert(None)
bench_close = df_bench["close"].reindex(close.index).ffill().bfill()
pf_bench = vbt.Portfolio.from_holding(bench_close, init_cash=INIT_CASH, fees=FEES, freq="1D")

# --- Results ---
print(pf.stats())

# --- Strategy vs Benchmark ---
comparison = pd.DataFrame({
    "Strategy": [
        f"{pf.total_return() * 100:.2f}%", f"{pf.sharpe_ratio():.2f}",
        f"{pf.sortino_ratio():.2f}", f"{pf.max_drawdown() * 100:.2f}%",
        f"{pf.trades.win_rate() * 100:.1f}%", f"{pf.trades.count()}",
        f"{pf.trades.profit_factor():.2f}",
    ],
    f"Benchmark ({BENCHMARK_SYMBOL})": [
        f"{pf_bench.total_return() * 100:.2f}%", f"{pf_bench.sharpe_ratio():.2f}",
        f"{pf_bench.sortino_ratio():.2f}", f"{pf_bench.max_drawdown() * 100:.2f}%",
        "-", "-", "-",
    ],
}, index=["Total Return", "Sharpe Ratio", "Sortino Ratio", "Max Drawdown",
          "Win Rate", "Total Trades", "Profit Factor"])
print(comparison.to_string())

# --- Explain ---
print(f"* Total Return: {pf.total_return() * 100:.2f}% vs {BENCHMARK_SYMBOL} {pf_bench.total_return() * 100:.2f}%")
print(f"* Max Drawdown: {pf.max_drawdown() * 100:.2f}%")
print(f"  -> On Rs {INIT_CASH:,}, worst temporary loss = Rs {abs(pf.max_drawdown()) * INIT_CASH:,.0f}")

# --- Plot ---
fig = pf.plot(subplots=['value', 'underwater', 'cum_returns'], template="plotly_dark")
fig.show()

# --- Export ---
pf.positions.records_readable.to_csv(script_dir / f"{SYMBOL}_trades.csv", index=False)
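
To see what the signal-cleaning step in the template does, the following sketch runs a loop-based cleaner with the same logic as the inline fallback in the DuckDB template on a short hand-made series. The name exrem_fallback and the sample values are illustrative only, and ta.exrem() itself may be implemented differently. The NaN in the first slot shows why .fillna(False) comes first: NaN is truthy in Python, so an unfilled NaN would be read as a signal.

```python
import pandas as pd


def exrem_fallback(signal1, signal2):
    """Keep the first True in signal1, then suppress repeats until signal2 fires."""
    result = signal1.copy()
    active = False
    for i in range(len(signal1)):
        if active:
            result.iloc[i] = False
        if signal1.iloc[i] and not active:
            active = True
        if signal2.iloc[i]:
            active = False
    return result


nan = float("nan")
buy_raw = pd.Series([nan, True, True, False, False, False, True, False], dtype=object)
sell_raw = pd.Series([nan, False, False, True, True, False, False, False], dtype=object)

# Fill the warm-up NaN and force a boolean dtype before cleaning.
buy = buy_raw.fillna(False).astype(bool)
sell = sell_raw.fillna(False).astype(bool)

entries = exrem_fallback(buy, sell)
exits = exrem_fallback(sell, buy)

print(pd.DataFrame({"buy_raw": buy, "entries": entries,
                    "sell_raw": sell, "exits": exits}))
```

The repeated buy on the third bar and the repeated sell on the fifth bar are dropped, so each entry is followed by exactly one exit before the next entry.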

Quick Template: DuckDB Backtest Script

import datetime as dt
from pathlib import Path

import duckdb
import numpy as np
import pandas as pd
import talib as tl
import vectorbt as vbt

try:
    from openalgo import ta
    exrem = ta.exrem
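except ImportError:
    def exrem(signal1, signal2):
        """Inline fallback: keep the first signal1, drop repeats until signal2 occurs."""
        result = signal1.copy()
        active = False  # True while a kept signal1 is waiting for signal2
        for i in range(len(signal1)):
            if active:
                result.iloc[i] = False  # suppress repeated signals
            if signal1.iloc[i] and not active:
                active = True
            if signal2.iloc[i]:
                active = False
        return result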

# --- Config ---
SYMBOL = "SBIN"
DB_PATH = r"path/to/market_data.duckdb"
INIT_CASH = 1_000_000
FEES = 0.000225              # Intraday equity
FIXED_FEES = 20

# --- Load from DuckDB ---
con = duckdb.connect(DB_PATH, read_only=True)
df = con.execute("""
    SELECT date, time, open, high, low, close, volume
    FROM ohlcv WHERE symbol = ? ORDER BY date, time
""", [SYMBOL]).fetchdf()
con.close()

df["datetime"] = pd.to_datetime(df["date"].astype(str) + " " + df["time"].astype(str))
df = df.set_index("datetime").sort_index()
df = df.drop(columns=["date", "time"])

# --- Resample to 5min ---
df_5m = df.resample("5min", origin="start_day", offset="9h15min",
                     label="right", closed="right").agg({
    "open": "first", "high": "max", "low": "min", "close": "last", "volume": "sum"
}).dropna()
close = df_5m["close"]

# --- Strategy + Backtest (same as OpenAlgo template, using close from df_5m; pass freq="5min" instead of "1D") ---
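
The resampling step can be checked on synthetic data before running it against a real database. The sketch below builds ten 1-minute bars and applies the same resample arguments as the template, with the 09:15 offset written as a Timedelta. The bar values are made up, and it assumes bars are stamped at their close time (09:16 for the first minute of the session), which may not match your data.

```python
import pandas as pd

# Ten synthetic 1-minute bars stamped 09:16 ... 09:25 (assumed close-time stamps).
index = pd.date_range("2024-01-01 09:16", periods=10, freq="1min")
opens = list(range(100, 110))
df = pd.DataFrame({
    "open": opens,
    "high": [o + 2 for o in opens],
    "low": [o - 2 for o in opens],
    "close": [o + 1 for o in opens],
    "volume": [10] * 10,
}, index=index)

# Bins are anchored at 09:15, closed on the right, and labeled by their end time.
df_5m = df.resample(
    "5min", origin="start_day", offset=pd.Timedelta(hours=9, minutes=15),
    label="right", closed="right",
).agg({
    "open": "first", "high": "max", "low": "min", "close": "last", "volume": "sum"
}).dropna()

print(df_5m)
```

Under that assumption each 5-minute bar collects the five minutes ending at its label, so the first bar is labeled 09:20. If your source stamps bars at their open time instead, the first bin would hold a single 09:15 bar, and closed/label should be reconsidered.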
