vectorbt-expert

SKILL.md

VectorBT Backtesting Expert Skill

Environment

  • Python with vectorbt, pandas, numpy, plotly
  • Data sources: OpenAlgo (Indian markets), DuckDB (direct database), yfinance (US/Global), CCXT (Crypto), custom providers
  • DuckDB support: supports both custom DuckDB and OpenAlgo Historify format
  • API keys loaded from single root .env via python-dotenv + find_dotenv() — never hardcode keys
  • Technical indicators: TA-Lib (ALWAYS - never use VectorBT built-in indicators)
  • Specialty indicators: openalgo.ta for Supertrend, Donchian, Ichimoku, HMA, KAMA, ALMA, ZLEMA, VWMA
  • Signal cleaning: openalgo.ta for exrem, crossover, crossunder, flip
  • Fee model: Indian market standard (STT + statutory charges + Rs 20/order)
  • Benchmark: NIFTY 50 via OpenAlgo (NSE_INDEX) by default
  • Charts: Plotly with template="plotly_dark"
  • Environment variables loaded from single .env at project root via find_dotenv() (walks up from script dir)
  • Scripts go in backtesting/{strategy_name}/ directories (created on-demand, not pre-created)
  • Never use icons/emojis in code or logger output

Critical Rules

  1. ALWAYS use TA-Lib for ALL technical indicators (EMA, SMA, RSI, MACD, BBANDS, ATR, ADX, STDDEV, MOM). NEVER use vbt.MA.run(), vbt.RSI.run(), or any VectorBT built-in indicator.
  2. Use OpenAlgo ta for indicators NOT in TA-Lib: Supertrend, Donchian, Ichimoku, HMA, KAMA, ALMA, ZLEMA, VWMA.
  3. Use OpenAlgo ta for signal utilities: ta.exrem(), ta.crossover(), ta.crossunder(), ta.flip(). If openalgo.ta is not importable (standalone DuckDB), use inline exrem() fallback. See duckdb-data.
  4. Always clean signals with ta.exrem() after generating raw buy/sell signals. Always .fillna(False) before exrem.
  5. Market-specific fees: India (indian-market-costs), US (us-market-costs), Crypto (crypto-market-costs). Auto-select based on user's market.
  6. Default benchmarks: India=NIFTY via OpenAlgo, US=S&P 500 (^GSPC), Crypto=Bitcoin (BTC-USD). See data-fetching Market Selection Guide.
  7. Always produce a Strategy vs Benchmark comparison table after every backtest.
  8. Always explain the backtest report in plain language so even normal traders understand risk and strength.
  9. Plotly candlestick charts must use xaxis type="category" to avoid weekend gaps.
  10. Whole shares: Always set min_size=1, size_granularity=1 for equities.
  11. DuckDB data loading: When user provides a DuckDB path, load data directly using duckdb.connect() with read_only=True. Auto-detect format: OpenAlgo Historify (table market_data, epoch timestamps) vs custom (table ohlcv, date+time columns). See duckdb-data.

Modular Rule Files

Detailed reference for each topic is in rules/:

Rule File Topic
data-fetching OpenAlgo (India), yfinance (US), CCXT (Crypto), custom providers, .env setup
simulation-modes from_signals, from_orders, from_holding, direction types
position-sizing Amount/Value/Percent/TargetPercent sizing
indicators-signals TA-Lib indicator reference, signal generation
openalgo-ta-helpers OpenAlgo ta: exrem, crossover, Supertrend, Donchian, Ichimoku, MAs
stop-loss-take-profit Fixed SL, TP, trailing stop
parameter-optimization Broadcasting and loop-based optimization
performance-analysis Stats, metrics, benchmark comparison, CAGR
plotting Candlestick (category x-axis), VectorBT plots, custom Plotly
indian-market-costs Indian market fee model by segment
us-market-costs US market fee model (stocks, options, futures)
crypto-market-costs Crypto fee model (spot, USDT-M, COIN-M futures)
futures-backtesting Lot sizes (SEBI revised Dec 2025), value sizing
long-short-trading Simultaneous long/short, direction comparison
duckdb-data DuckDB direct loading, Historify format, auto-detect, resampling, multi-symbol
csv-data-resampling Loading CSV, resampling with Indian market alignment
walk-forward Walk-forward analysis, WFE ratio
robustness-testing Monte Carlo, noise test, parameter sensitivity, delay test
pitfalls Common mistakes and checklist before going live
strategy-catalog Strategy reference with code snippets
quantstats-tearsheet QuantStats HTML reports, metrics, plots, Monte Carlo

Strategy Templates (in rules/assets/)

Production-ready scripts with realistic fees, NIFTY benchmark, comparison table, and plain-language report:

Template Path Description
EMA Crossover assets/ema_crossover/backtest.py EMA 10/20 crossover
RSI assets/rsi/backtest.py RSI(14) oversold/overbought
Donchian assets/donchian/backtest.py Donchian channel breakout
Supertrend assets/supertrend/backtest.py Supertrend with intraday sessions
MACD assets/macd/backtest.py MACD signal-candle breakout
SDA2 assets/sda2/backtest.py SDA2 trend following
Momentum assets/momentum/backtest.py Double momentum (MOM + MOM-of-MOM)
Dual Momentum assets/dual_momentum/backtest.py Quarterly ETF rotation
Buy & Hold assets/buy_hold/backtest.py Static multi-asset allocation
RSI Accumulation assets/rsi_accumulation/backtest.py Weekly RSI slab-wise accumulation
Walk-Forward assets/walk_forward/template.py Walk-forward analysis template
Realistic Costs assets/realistic_costs/template.py Transaction cost impact comparison

Quick Template: Standard Backtest Script

import os
from datetime import datetime, timedelta
from pathlib import Path

import numpy as np
import pandas as pd
import talib as tl
import vectorbt as vbt
from dotenv import find_dotenv, load_dotenv
from openalgo import api, ta

# --- Config ---
script_dir = Path(__file__).resolve().parent
load_dotenv(find_dotenv(), override=False)

SYMBOL = "SBIN"
EXCHANGE = "NSE"
INTERVAL = "D"
INIT_CASH = 1_000_000
FEES = 0.00111              # Indian delivery equity (STT + statutory)
FIXED_FEES = 20             # Rs 20 per order
ALLOCATION = 0.75
BENCHMARK_SYMBOL = "NIFTY"
BENCHMARK_EXCHANGE = "NSE_INDEX"

# --- Fetch Data ---
client = api(
    api_key=os.getenv("OPENALGO_API_KEY"),
    host=os.getenv("OPENALGO_HOST", "http://127.0.0.1:5000"),
)

end_date = datetime.now().date()
start_date = end_date - timedelta(days=365 * 3)

df = client.history(
    symbol=SYMBOL, exchange=EXCHANGE, interval=INTERVAL,
    start_date=start_date.strftime("%Y-%m-%d"),
    end_date=end_date.strftime("%Y-%m-%d"),
)
if "timestamp" in df.columns:
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    df = df.set_index("timestamp")
else:
    df.index = pd.to_datetime(df.index)
df = df.sort_index()
if df.index.tz is not None:
    df.index = df.index.tz_convert(None)

close = df["close"]

# --- Strategy: EMA Crossover (TA-Lib) ---
ema_fast = pd.Series(tl.EMA(close.values, timeperiod=10), index=close.index)
ema_slow = pd.Series(tl.EMA(close.values, timeperiod=20), index=close.index)

buy_raw = (ema_fast > ema_slow) & (ema_fast.shift(1) <= ema_slow.shift(1))
sell_raw = (ema_fast < ema_slow) & (ema_fast.shift(1) >= ema_slow.shift(1))

entries = ta.exrem(buy_raw.fillna(False), sell_raw.fillna(False))
exits = ta.exrem(sell_raw.fillna(False), buy_raw.fillna(False))

# --- Backtest ---
pf = vbt.Portfolio.from_signals(
    close, entries, exits,
    init_cash=INIT_CASH, size=ALLOCATION, size_type="percent",
    fees=FEES, fixed_fees=FIXED_FEES, direction="longonly",
    min_size=1, size_granularity=1, freq="1D",
)

# --- Benchmark ---
df_bench = client.history(
    symbol=BENCHMARK_SYMBOL, exchange=BENCHMARK_EXCHANGE, interval=INTERVAL,
    start_date=start_date.strftime("%Y-%m-%d"),
    end_date=end_date.strftime("%Y-%m-%d"),
)
if "timestamp" in df_bench.columns:
    df_bench["timestamp"] = pd.to_datetime(df_bench["timestamp"])
    df_bench = df_bench.set_index("timestamp")
else:
    df_bench.index = pd.to_datetime(df_bench.index)
df_bench = df_bench.sort_index()
if df_bench.index.tz is not None:
    df_bench.index = df_bench.index.tz_convert(None)
bench_close = df_bench["close"].reindex(close.index).ffill().bfill()
pf_bench = vbt.Portfolio.from_holding(bench_close, init_cash=INIT_CASH, fees=FEES, freq="1D")

# --- Results ---
print(pf.stats())

# --- Strategy vs Benchmark ---
comparison = pd.DataFrame({
    "Strategy": [
        f"{pf.total_return() * 100:.2f}%", f"{pf.sharpe_ratio():.2f}",
        f"{pf.sortino_ratio():.2f}", f"{pf.max_drawdown() * 100:.2f}%",
        f"{pf.trades.win_rate() * 100:.1f}%", f"{pf.trades.count()}",
        f"{pf.trades.profit_factor():.2f}",
    ],
    f"Benchmark ({BENCHMARK_SYMBOL})": [
        f"{pf_bench.total_return() * 100:.2f}%", f"{pf_bench.sharpe_ratio():.2f}",
        f"{pf_bench.sortino_ratio():.2f}", f"{pf_bench.max_drawdown() * 100:.2f}%",
        "-", "-", "-",
    ],
}, index=["Total Return", "Sharpe Ratio", "Sortino Ratio", "Max Drawdown",
          "Win Rate", "Total Trades", "Profit Factor"])
print(comparison.to_string())

# --- Explain ---
print(f"* Total Return: {pf.total_return() * 100:.2f}% vs NIFTY {pf_bench.total_return() * 100:.2f}%")
print(f"* Max Drawdown: {pf.max_drawdown() * 100:.2f}%")
print(f"  -> On Rs {INIT_CASH:,}, worst temporary loss = Rs {abs(pf.max_drawdown()) * INIT_CASH:,.0f}")

# --- Plot ---
fig = pf.plot(subplots=['value', 'underwater', 'cum_returns'], template="plotly_dark")
fig.show()

# --- Export ---
pf.positions.records_readable.to_csv(script_dir / f"{SYMBOL}_trades.csv", index=False)

Quick Template: DuckDB Backtest Script

import datetime as dt
from pathlib import Path

import duckdb
import numpy as np
import pandas as pd
import talib as tl
import vectorbt as vbt

try:
    from openalgo import ta
    exrem = ta.exrem
except ImportError:
    def exrem(signal1, signal2):
        result = signal1.copy()
        active = False
        for i in range(len(signal1)):
            if active:
                result.iloc[i] = False
            if signal1.iloc[i] and not active:
                active = True
            if signal2.iloc[i]:
                active = False
        return result

# --- Config ---
SYMBOL = "SBIN"
DB_PATH = r"path/to/market_data.duckdb"
INIT_CASH = 1_000_000
FEES = 0.000225              # Intraday equity
FIXED_FEES = 20

# --- Load from DuckDB ---
con = duckdb.connect(DB_PATH, read_only=True)
df = con.execute("""
    SELECT date, time, open, high, low, close, volume
    FROM ohlcv WHERE symbol = ? ORDER BY date, time
""", [SYMBOL]).fetchdf()
con.close()

df["datetime"] = pd.to_datetime(df["date"].astype(str) + " " + df["time"].astype(str))
df = df.set_index("datetime").sort_index()
df = df.drop(columns=["date", "time"])

# --- Resample to 5min ---
df_5m = df.resample("5min", origin="start_day", offset="9h15min",
                     label="right", closed="right").agg({
    "open": "first", "high": "max", "low": "min", "close": "last", "volume": "sum"
}).dropna()
close = df_5m["close"]

# --- Strategy + Backtest (same as OpenAlgo template) ---
Weekly Installs
434
GitHub Stars
97
First Seen
Feb 25, 2026
Installed on
opencode428
codex428
cursor422
kimi-cli420
gemini-cli420
github-copilot420