Using Datafye Algo Container

When using Datafye's Algo Container in a Foundry: Full Stack environment, the integrated Backtesting Engine automates the entire backtesting workflow. You configure backtests declaratively in your algo descriptor, and the engine handles fetching data, managing state, running replays, and generating comprehensive performance scorecards.

Overview

The Backtesting Engine provides:

Automated Workflow - Fetch, reset, replay, and analyze without manual orchestration
Parallelized Execution - Run hundreds of backtests simultaneously
Genetic Algorithm Optimization - Automatically discover optimal parameter combinations
Comprehensive Scorecarding - Industry-standard performance metrics out of the box
Walk-Forward Analysis - Test robustness across different time periods
Out-of-Sample Validation - Detect overfitting by checking results on data held back from optimization

Quick Start

1. Configure Backtest Parameters

Define backtest configuration in your algo descriptor:

algo:
  name: my-momentum-strategy

  parameters:
    lookback_period: 20
    entry_threshold: 0.02
    exit_threshold: -0.01

  backtest:
    start_date: "2024-01-01"
    end_date: "2024-12-31"
    symbols:
      - AAPL
      - MSFT
      - GOOGL
    initial_capital: 100000
    commission: 0.001  # 10 basis points

2. Run Backtest

datafye foundry backtest run --algo my-momentum-strategy

The Backtesting Engine:

Fetches required tick history
Resets environment state
Replays ticks to drive your algo
Collects trades and performance data
Generates comprehensive scorecard

3. View Results

datafye foundry backtest show bt-20250127-143022

Scorecard includes:

Total return, annualized return
Sharpe ratio, Sortino ratio
Maximum drawdown, drawdown duration
Win rate, profit factor
Trade statistics
Equity curve visualization

Backtest Configuration

Date Ranges

Specify the backtest period:

backtest:
  start_date: "2024-01-01"  # Inclusive start date
  end_date: "2024-12-31"    # Inclusive end date

Or use relative dates:

backtest:
  lookback_days: 365  # Last 365 days from today

Symbol Universe

Test on specific symbols:

backtest:
  symbols:
    - AAPL
    - MSFT
    - GOOGL

Or test on an entire universe:

backtest:
  universes:
    - SP500

Capital and Costs

Configure initial capital and transaction costs:

backtest:
  initial_capital: 100000
  commission: 0.001       # Commission per share ($ or %)
  slippage: 0.0005       # Slippage per share
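To see how much these two settings can move a result, here is a rough sketch of a single long round trip. It assumes both values are fractions of traded value (consistent with the "10 basis points" comment in the Quick Start) and that slippage worsens both fills. The engine's actual cost model may differ, and `round_trip_pnl` is a hypothetical helper, not part of Datafye.

```python
def round_trip_pnl(entry_price, exit_price, shares, commission=0.001, slippage=0.0005):
    """Net P&L of a long round trip with proportional costs.

    Assumption: commission and slippage are fractions of traded value,
    and slippage makes both fills worse (buy higher, sell lower).
    """
    buy_fill = entry_price * (1 + slippage)
    sell_fill = exit_price * (1 - slippage)
    gross = (sell_fill - buy_fill) * shares
    fees = commission * (buy_fill + sell_fill) * shares
    return gross - fees


no_cost = round_trip_pnl(100.0, 102.0, 100, commission=0.0, slippage=0.0)
with_cost = round_trip_pnl(100.0, 102.0, 100)
print(f"{no_cost:.2f} {with_cost:.2f}")  # 200.00 169.70
```

Under these assumptions, costs remove roughly 15% of the profit on a 2% winning trade, which is why small per-trade costs matter for strategies that trade often.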

Execution Simulation

Control how orders are filled:

backtest:
  fill_model: realistic   # realistic, optimistic, or pessimistic
  market_impact: true     # Model market impact for large orders

Parallelized Backtesting

Test multiple parameter combinations simultaneously. See Parallelized Backtesting for details.

Example:

backtest:
  parameter_sweep:
    lookback_period: [10, 20, 30, 50]
    entry_threshold: [0.01, 0.02, 0.03]
    exit_threshold: [-0.005, -0.01, -0.015]

  # This creates 4 × 3 × 3 = 36 backtests

Run all combinations in parallel:

datafye foundry backtest run --algo my-momentum-strategy --parallel

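The engine distributes the backtests across available cores. Because each parameter combination runs independently, a sweep can finish much sooner than running the same backtests one after another.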
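The combination count follows from taking the Cartesian product of the value lists. The sketch below reproduces the 36 parameter sets from the example sweep; `expand_sweep` is an illustrative helper, not an engine API.

```python
from itertools import product

# Value lists copied from the parameter_sweep example above.
sweep = {
    "lookback_period": [10, 20, 30, 50],
    "entry_threshold": [0.01, 0.02, 0.03],
    "exit_threshold": [-0.005, -0.01, -0.015],
}


def expand_sweep(grid):
    """Return one parameter dict per combination in the grid (Cartesian product)."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]


combos = expand_sweep(sweep)
print(len(combos))  # 36
print(combos[0])
```

Each dict is independent of the others, which is what makes the sweep straightforward to parallelize. Note that the count multiplies with every added parameter, so a fourth parameter with five values would already mean 180 backtests.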

Genetic Algorithm Optimization

Automatically discover optimal parameter combinations using genetic algorithms. See Genetic Algorithm Based Optimization for details.

Example:

backtest:
  optimization:
    method: genetic_algorithm
    objective: sharpe_ratio  # Maximize Sharpe ratio

    parameters:
      lookback_period:
        type: integer
        min: 5
        max: 100
      entry_threshold:
        type: float
        min: 0.005
        max: 0.05
      exit_threshold:
        type: float
        min: -0.03
        max: -0.001

    population_size: 50
    generations: 100
    mutation_rate: 0.1

Run optimization:

datafye foundry backtest optimize --algo my-momentum-strategy

The genetic algorithm:

Generates initial population of parameter sets
Evaluates fitness (Sharpe ratio) via backtesting
Selects best performers
Breeds new generations through crossover and mutation
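Repeats selection and breeding for the configured number of generations, converging toward high-fitness parameters (a good solution, not a guaranteed global optimum)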
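The loop described above can be sketched in a few lines. This is a generic genetic algorithm under stated assumptions, not Datafye's implementation: `toy_fitness` is a made-up smooth function standing in for "run a backtest and return its Sharpe ratio", and the selection, crossover, and mutation operators are common textbook choices.

```python
import random

# Search bounds taken from the example configuration above.
BOUNDS = {
    "lookback_period": (5, 100, int),
    "entry_threshold": (0.005, 0.05, float),
    "exit_threshold": (-0.03, -0.001, float),
}


def random_gene(name, rng):
    """Draw one parameter value uniformly within its bounds."""
    lo, hi, kind = BOUNDS[name]
    return rng.randint(lo, hi) if kind is int else rng.uniform(lo, hi)


def toy_fitness(p):
    """Hypothetical stand-in for a backtest; peaks at (20, 0.02, -0.01)."""
    return -(((p["lookback_period"] - 20) / 95) ** 2
             + ((p["entry_threshold"] - 0.02) / 0.045) ** 2
             + ((p["exit_threshold"] + 0.01) / 0.029) ** 2)


def evolve(fitness, population_size=50, generations=100, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    population = [{n: random_gene(n, rng) for n in BOUNDS} for _ in range(population_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: population_size // 2]   # selection: keep the top half
        children = ranked[:2]                      # elitism: two best survive unchanged
        while len(children) < population_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one parent at random.
            child = {n: rng.choice((a[n], b[n])) for n in BOUNDS}
            for n in BOUNDS:
                if rng.random() < mutation_rate:
                    child[n] = random_gene(n, rng)  # mutation: redraw the gene
            children.append(child)
        population = children
    return max(population, key=fitness)


best = evolve(toy_fitness)
print(best)
```

With a real backtest as the fitness function, every evaluation is a full backtest run, so a population of 50 over 100 generations implies on the order of 5,000 backtests. That cost is the main reason to pair optimization with parallel execution.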

Walk-Forward Analysis

Test robustness by training on one period and validating on another:

backtest:
  walk_forward:
    train_period: 180  # Train on 180 days
    test_period: 30    # Test on next 30 days
    step: 30           # Slide window by 30 days

This helps guard against overfitting: parameters chosen on each training window are then scored on the test window that follows, which the selection step never saw.
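To make the sliding window concrete, the sketch below lists the train and test ranges that the settings above would produce over calendar year 2024. It assumes the periods are counted in calendar days and that incomplete windows at the end are dropped; the engine may count trading days or handle the tail differently.

```python
from datetime import date, timedelta


def walk_forward_windows(start, end, train_days=180, test_days=30, step_days=30):
    """Yield (train_start, train_end, test_start, test_end), all dates inclusive."""
    train_start = start
    while True:
        train_end = train_start + timedelta(days=train_days - 1)
        test_start = train_end + timedelta(days=1)
        test_end = test_start + timedelta(days=test_days - 1)
        if test_end > end:
            break  # assumption: drop a final window that does not fit completely
        yield train_start, train_end, test_start, test_end
        train_start += timedelta(days=step_days)


windows = list(walk_forward_windows(date(2024, 1, 1), date(2024, 12, 31)))
for window in windows:
    print(*(d.isoformat() for d in window))
```

Because `step` equals `test_period` here, the test windows sit back to back without overlapping, so every test day is scored exactly once.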

Out-of-Sample Validation

Reserve a portion of data for final validation:

backtest:
  train_period:
    start_date: "2024-01-01"
    end_date: "2024-09-30"

  out_of_sample_period:
    start_date: "2024-10-01"
    end_date: "2024-12-31"

Workflow:

Optimize parameters on train period
Lock in best parameters
Validate on out-of-sample period (no optimization)
Compare metrics between periods

Analyzing Results

Scorecard Metrics

The Backtesting Engine generates comprehensive performance metrics:

Returns:

Total return (%)
Annualized return (%)
Cumulative returns by period

Risk-Adjusted:

Sharpe ratio (excess return/volatility)
Sortino ratio (return/downside volatility)
Calmar ratio (return/max drawdown)
Information ratio

Drawdown:

Maximum drawdown (%)
Average drawdown (%)
Maximum drawdown duration (days)
Recovery time

Trading:

Total trades
Win rate (%)
Profit factor (gross profit/gross loss)
Average win/loss
Longest winning/losing streak

Exposure:

Time in market (%)
Average position size
Maximum position size
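For reference, several of these metrics can be computed from an equity curve and a list of trade P&Ls as sketched below. These are common textbook definitions (252 periods per year, sample standard deviation); the scorecard's exact conventions are not stated here and may differ. The sample data is made up.

```python
import math


def sharpe_ratio(returns, periods_per_year=252, risk_free=0.0):
    """Annualized mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free / periods_per_year for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((r - mean) ** 2 for r in excess) / (len(excess) - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def win_rate(trade_pnls):
    """Fraction of trades with positive P&L."""
    return sum(1 for p in trade_pnls if p > 0) / len(trade_pnls)


def profit_factor(trade_pnls):
    """Gross profit divided by gross loss (infinite if there are no losses)."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    return gross_profit / gross_loss if gross_loss else float("inf")


equity = [100_000, 102_000, 101_000, 99_000, 103_000, 104_500]  # made-up data
trades = [2_000, -1_000, -2_000, 4_000, 1_500]                   # made-up data
returns = [b / a - 1 for a, b in zip(equity, equity[1:])]

print(f"max drawdown: {max_drawdown(equity):.2%}")
print(f"win rate: {win_rate(trades)}")            # 0.6
print(f"profit factor: {profit_factor(trades)}")  # 2.5
print(f"sharpe: {sharpe_ratio(returns):.2f}")
```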

Equity Curve

Visualize performance over time:

datafye foundry backtest show bt-20250127-143022 --chart equity

Shows:

Equity growth over time
Drawdown periods highlighted
Major wins and losses marked

Trade Analysis

Examine individual trades:

datafye foundry backtest show bt-20250127-143022 --trades

View:

Entry/exit prices and times
Holding period
P&L per trade
Trade metadata

Best Practices

Start Simple

Begin with a single backtest on a small date range and limited symbols:

backtest:
  start_date: "2024-12-01"
  end_date: "2024-12-31"
  symbols: [AAPL]

Verify the backtest runs correctly before expanding scope.

Realistic Costs

Always include commissions and slippage:

backtest:
  commission: 0.001    # Realistic for retail
  slippage: 0.0005     # Conservative estimate

Backtests without costs are overly optimistic and won't match live performance.

Multiple Time Periods

Test across different market conditions:

backtest:
  periods:
    - start: "2020-01-01"  # Bull market
      end: "2021-12-31"
    - start: "2022-01-01"  # Bear market
      end: "2022-12-31"
    - start: "2023-01-01"  # Recovery
      end: "2024-12-31"

Strategies that work across different regimes are more robust.

Avoid Overfitting

Red flags for overfitting:

Too many parameters relative to data
Perfect or near-perfect backtest performance
Large performance gap between train and test periods
Highly sensitive to small parameter changes

Mitigation:

Use out-of-sample validation
Prefer simpler strategies
Test on multiple time periods
Verify results with walk-forward analysis

Validate with Paper Trading

After backtesting, always paper trade before going live:

Run backtest to validate strategy
Deploy to paper trading environment
Compare paper trading results to backtest
Investigate discrepancies
Proceed to live trading only after paper trading results are consistent with the backtest

Advanced Features

Custom Metrics

Define your own performance metrics:

backtest:
  custom_metrics:
    - name: profit_per_day
      formula: "total_profit / trading_days"
    - name: risk_reward_ratio
      formula: "average_win / abs(average_loss)"
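The two formulas above correspond to the following calculation. This sketch assumes `total_profit` means net P&L across all trades and that `average_win` and `average_loss` are means over winning and losing trades respectively; how the engine actually defines these variables is not specified here.

```python
def custom_metrics(trade_pnls, trading_days):
    """Evaluate the two example formulas from a list of per-trade P&Ls.

    Assumes at least one winning and one losing trade.
    """
    wins = [p for p in trade_pnls if p > 0]
    losses = [p for p in trade_pnls if p < 0]
    total_profit = sum(trade_pnls)
    average_win = sum(wins) / len(wins)
    average_loss = sum(losses) / len(losses)
    return {
        "profit_per_day": total_profit / trading_days,
        "risk_reward_ratio": average_win / abs(average_loss),
    }


metrics = custom_metrics([2_000, -1_000, -2_000, 4_000, 1_500], trading_days=21)
print(metrics)
```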

Monte Carlo Simulation

Test strategy robustness with randomized scenarios:

backtest:
  monte_carlo:
    iterations: 1000
    method: bootstrap  # Resample historical returns
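As an illustration of what bootstrap resampling does, the sketch below draws returns with replacement, compounds each resampled path, and reads off percentiles of the resulting total returns. It is a generic bootstrap on made-up data, not the engine's implementation, and it treats returns as independent, which ignores any autocorrelation.

```python
import random


def bootstrap_total_returns(returns, iterations=1000, seed=42):
    """Resample returns with replacement and compound each resampled path."""
    rng = random.Random(seed)
    totals = []
    for _ in range(iterations):
        path = rng.choices(returns, k=len(returns))
        growth = 1.0
        for r in path:
            growth *= 1.0 + r
        totals.append(growth - 1.0)
    return sorted(totals)


daily_returns = [0.012, -0.008, 0.004, -0.015, 0.009, 0.021, -0.003, 0.006]  # made-up data
totals = bootstrap_total_returns(daily_returns)
p5, p50, p95 = (totals[int(q * (len(totals) - 1))] for q in (0.05, 0.50, 0.95))
print(f"5th: {p5:.2%}  median: {p50:.2%}  95th: {p95:.2%}")
```

A wide gap between the 5th and 95th percentiles suggests the backtest result depends heavily on the particular order and mix of returns that happened to occur.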

Strategy Comparison

Compare multiple strategies side-by-side:

datafye foundry backtest compare bt-20250127-143022 bt-20250127-143045 bt-20250127-143108

Limitations and Considerations

Survivorship Bias

Backtests use current symbol lists. Delisted companies aren't included, which can inflate returns.

Lookahead Bias

Ensure your algo doesn't use future information. The Backtesting Engine prevents this by default, but custom logic should be carefully reviewed.
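One way to review custom logic is to check that a signal for a given bar never changes when later prices change. The sketch below uses a hypothetical moving-average signal (`sma_signal` is not a Datafye function) that reads only bars strictly before the current one.

```python
def sma_signal(prices, window):
    """Signal for bar i, computed only from bars strictly before i."""
    signals = []
    for i in range(len(prices)):
        if i < window:
            signals.append(None)      # not enough history yet
            continue
        past = prices[i - window:i]   # excludes prices[i], so no lookahead
        signals.append(prices[i - 1] > sum(past) / window)
    return signals


prices = [10, 11, 12, 11, 13, 14]
altered = prices[:4] + [99, 1]        # change only the last two bars

# Signals up to and including bar 4 depend only on bars 0..3, so they must match.
print(sma_signal(prices, 3)[:5] == sma_signal(altered, 3)[:5])  # True
```

If the slice were `prices[i - window + 1:i + 1]` and the order were assumed to fill at bar `i`'s open, the signal would be using a close that had not happened yet.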

Microstructure

Tick replay simulates market microstructure, but perfect replication is impossible. Large orders may execute differently in backtests vs. live.

Data Quality

Backtest quality depends on data quality. Use high-quality data providers for accurate results.

Troubleshooting

Backtest never completes

Check date range isn't too large
Verify symbols exist in dataset
Review algo logs for errors

Unrealistic results

Verify commission and slippage are included
Check for lookahead bias in algo logic
Ensure orders use realistic limit prices

Different results each run

Check if algo uses randomness without seeding
Verify data hasn't changed
Review any non-deterministic logic
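If the algo does use randomness, a seeded private generator makes runs repeatable. A minimal sketch, where `jittered_threshold` is a hypothetical piece of algo logic:

```python
import random


def jittered_threshold(base, seed):
    """Add a small random offset to a threshold, reproducibly."""
    rng = random.Random(seed)  # private generator, unaffected by other users of `random`
    return base + rng.uniform(-0.001, 0.001)


first = jittered_threshold(0.02, seed=7)
second = jittered_threshold(0.02, seed=7)
print(first == second)  # True
```

Using a dedicated `random.Random` instance rather than the module-level functions avoids another library reseeding or consuming the shared generator between runs.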

Next Steps

Parallelized Backtesting - Run multiple backtests simultaneously
Genetic Algorithm Based Optimization - Automate parameter discovery
Scorecarding - Understand performance metrics in depth
Paper Trading Your Algo - Validate backtests with paper trading


Last updated 3 months ago
