Using Datafye Algo Container

When using Datafye's Algo Container in a Foundry: Full Stack environment, the integrated Backtesting Engine automates the entire backtesting workflow. You configure backtests declaratively in your algo descriptor, and the engine handles fetching data, managing state, running replays, and generating comprehensive performance scorecards.

Overview

The Backtesting Engine provides:

  • Automated Workflow - Fetch, reset, replay, and analyze without manual orchestration

  • Parallelized Execution - Run hundreds of backtests simultaneously

  • Genetic Algorithm Optimization - Automatically search for high-performing parameter combinations

  • Comprehensive Scorecarding - Industry-standard performance metrics out of the box

  • Walk-Forward Analysis - Test robustness across different time periods

  • Out-of-Sample Validation - Detect overfitting by validating on data held out from optimization

Quick Start

1. Configure Backtest Parameters

Define backtest configuration in your algo descriptor:

algo:
  name: my-momentum-strategy

  parameters:
    lookback_period: 20
    entry_threshold: 0.02
    exit_threshold: -0.01

  backtest:
    start_date: "2024-01-01"
    end_date: "2024-12-31"
    symbols:
      - AAPL
      - MSFT
      - GOOGL
    initial_capital: 100000
    commission: 0.001  # 10 basis points

2. Run Backtest

The Backtesting Engine:

  1. Fetches required tick history

  2. Resets environment state

  3. Replays ticks to drive your algo

  4. Collects trades and performance data

  5. Generates comprehensive scorecard
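The five steps above can be pictured as a single loop. The following is a minimal Python sketch of that flow, not the engine's implementation: `ToyAlgo`, `fetch_ticks`, and `run_backtest` are hypothetical names, ticks are reduced to bare prices, and the scorecard is reduced to a single total-return figure.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAlgo:
    """Hypothetical algo: buys 10 shares once, sells after a 2% gain."""
    cash: float = 0.0
    position: int = 0
    entry_price: float = 0.0
    trades: list = field(default_factory=list)

    def reset(self, initial_capital):
        self.cash, self.position, self.entry_price = initial_capital, 0, 0.0
        self.trades = []

    def on_tick(self, price):
        if not self.trades:  # first tick: open the position
            self.position, self.entry_price = 10, price
            self.cash -= 10 * price
            self.trades.append(("BUY", price))
        elif self.position and price >= self.entry_price * 1.02:
            self.cash += self.position * price
            self.position = 0
            self.trades.append(("SELL", price))


def fetch_ticks():
    """Step 1 (stand-in): a real run would load tick history for the period."""
    return [100.0, 101.0, 103.0, 102.0]


def run_backtest(algo, initial_capital):
    ticks = fetch_ticks()                 # 1. fetch tick history
    algo.reset(initial_capital)           # 2. reset state
    for price in ticks:                   # 3. replay ticks to drive the algo
        algo.on_tick(price)
    trades = list(algo.trades)            # 4. collect trades
    equity = algo.cash + algo.position * ticks[-1]
    return {"trades": trades,             # 5. (very small) scorecard
            "total_return": equity / initial_capital - 1}


result = run_backtest(ToyAlgo(), initial_capital=100_000)
print(result)
```

Resetting state before every replay is what makes repeated runs comparable: each backtest starts from the same capital and an empty trade list.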

3. View Results

Scorecard includes:

  • Total return, annualized return

  • Sharpe ratio, Sortino ratio

  • Maximum drawdown, drawdown duration

  • Win rate, profit factor

  • Trade statistics

  • Equity curve visualization

Backtest Configuration

Date Ranges

Specify the backtest period with the start_date and end_date fields shown in the Quick Start descriptor.

Or use relative dates:

Symbol Universe

Test on specific symbols by listing them under symbols, as in the Quick Start descriptor.

Or test on an entire universe:

Capital and Costs

Configure initial capital and transaction costs with the initial_capital and commission fields shown in the Quick Start descriptor.

Execution Simulation

Control how orders are filled:

Parallelized Backtesting

Test multiple parameter combinations simultaneously. See Parallelized Backtesting for details.

Example:

Run all combinations in parallel:

The engine distributes backtests across available cores, which shortens the wall-clock time of each iteration.
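To make the idea of "all combinations" concrete, here is a small Python sketch that expands a parameter grid and scores every combination concurrently. It is an illustration only: `toy_backtest`, `expand_grid`, and `run_grid` are hypothetical helpers, the score is made up, and a thread pool is used for simplicity (CPU-bound work in plain Python would normally use `ProcessPoolExecutor` instead).

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def toy_backtest(params):
    """Stand-in for one backtest run; returns a made-up score."""
    lookback = params["lookback_period"]
    threshold = params["entry_threshold"]
    return -abs(lookback - 20) - 100 * abs(threshold - 0.02)


def expand_grid(grid):
    """Turn {'a': [1, 2], 'b': [3]} into [{'a': 1, 'b': 3}, {'a': 2, 'b': 3}]."""
    keys = list(grid)
    return [dict(zip(keys, values))
            for values in product(*(grid[k] for k in keys))]


def run_grid(grid, backtest=toy_backtest, max_workers=4):
    """Score every combination concurrently, best score first."""
    combos = expand_grid(grid)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(backtest, combos))
    return sorted(zip(scores, combos), key=lambda pair: pair[0], reverse=True)


grid = {"lookback_period": [10, 20, 30], "entry_threshold": [0.01, 0.02]}
results = run_grid(grid)
best_score, best_params = results[0]
print(best_params, best_score)
```

The number of runs is the product of the list lengths (here 3 x 2 = 6), so grids grow quickly as parameters are added.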

Genetic Algorithm Optimization

Automatically search for high-performing parameter combinations using genetic algorithms. See Genetic Algorithm Based Optimization for details.

Example:

Run optimization:

The genetic algorithm:

  1. Generates initial population of parameter sets

  2. Evaluates fitness (Sharpe ratio) via backtesting

  3. Selects best performers

  4. Breeds new generations through crossover and mutation

  5. Converges toward high-fitness parameters over successive generations (a global optimum is not guaranteed)
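The loop described above can be sketched in a few lines of Python. This is a generic, simplified genetic algorithm, not Datafye's optimizer: `evolve` and `toy_fitness` are hypothetical, and the fitness function is a smooth stand-in for "run a backtest and return its Sharpe ratio".

```python
import random


def evolve(fitness, bounds, pop_size=30, generations=40,
           mutation_rate=0.2, seed=0):
    """Maximize fitness over real-valued parameters within bounds."""
    rng = random.Random(seed)
    # 1. Initial population of random parameter sets.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # 2-3. Evaluate fitness and keep the better half as parents.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = parents[:2]  # elitism: carry the two best over unchanged
        # 4. Breed children via uniform crossover and Gaussian mutation.
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    step = rng.gauss(0, 0.1 * (hi - lo))
                    child[i] = min(hi, max(lo, child[i] + step))
            children.append(child)
        population = children
    # 5. Return the best parameter set found.
    return max(population, key=fitness)


def toy_fitness(params):
    """Made-up surface that peaks at lookback=20, threshold=0.02."""
    lookback, threshold = params
    return -((lookback - 20) / 10) ** 2 - ((threshold - 0.02) / 0.01) ** 2


bounds = [(5, 60), (0.0, 0.1)]  # (lookback_period, entry_threshold)
best = evolve(toy_fitness, bounds)
print(best, toy_fitness(best))
```

In a real run each fitness evaluation is a full backtest, so population size and generation count directly determine how many backtests are executed.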

Walk-Forward Analysis

Test robustness by training on one period and validating on another:

This helps detect overfitting by checking that parameters chosen on one period still perform on unseen data.
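As an illustration of how training and validation periods can roll forward through a date range, the Python sketch below generates consecutive windows. The function name and the choice to advance by one test window at a time are assumptions made for this example; the engine's own windowing options may differ.

```python
from datetime import date, timedelta


def walk_forward_windows(start, end, train_days, test_days):
    """Return (train_start, train_end, test_start, test_end) tuples.

    All end dates are inclusive. Each step rolls forward by one test
    window, so test periods are contiguous and never overlap the
    training period that precedes them.
    """
    windows = []
    train_start = start
    while True:
        train_end = train_start + timedelta(days=train_days - 1)
        test_start = train_end + timedelta(days=1)
        test_end = test_start + timedelta(days=test_days - 1)
        if test_end > end:
            break
        windows.append((train_start, train_end, test_start, test_end))
        train_start += timedelta(days=test_days)
    return windows


windows = walk_forward_windows(date(2024, 1, 1), date(2024, 12, 31),
                               train_days=180, test_days=60)
for w in windows:
    print(" | ".join(d.isoformat() for d in w))
```

Parameters are re-optimized on each training window and then scored only on the test window that follows it.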

Out-of-Sample Validation

Reserve a portion of data for final validation:

Workflow:

  1. Optimize parameters on train period

  2. Lock in best parameters

  3. Validate on out-of-sample period (no optimization)

  4. Compare metrics between periods
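The workflow above can be expressed as a short Python sketch. Everything here is hypothetical: `validate_out_of_sample` is an illustrative helper, and the lookup table stands in for real backtest scores. The point is that parameters are chosen using the train period only, then scored once on the held-out period.

```python
def validate_out_of_sample(candidates, score, train_period, test_period):
    """Pick the best candidate in-sample, then score it once out-of-sample."""
    # Steps 1-2: optimize on the train period and lock in the winner.
    best = max(candidates, key=lambda params: score(params, train_period))
    in_sample = score(best, train_period)
    # Step 3: evaluate on held-out data with no further tuning.
    out_of_sample = score(best, test_period)
    # Step 4: relative drop between periods (undefined if in_sample is 0).
    degradation = (in_sample - out_of_sample) / abs(in_sample)
    return {"params": best, "in_sample": in_sample,
            "out_of_sample": out_of_sample, "degradation": degradation}


# Made-up Sharpe ratios for two hypothetical parameter sets.
toy_scores = {
    ("fast", "train"): 2.0, ("fast", "test"): 0.5,
    ("slow", "train"): 1.2, ("slow", "test"): 1.1,
}
report = validate_out_of_sample(
    candidates=["fast", "slow"],
    score=lambda params, period: toy_scores[(params, period)],
    train_period="train",
    test_period="test",
)
print(report)
```

In this toy data the in-sample winner loses 75% of its score out-of-sample, which is the kind of gap that suggests the optimization fitted noise.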

Analyzing Results

Scorecard Metrics

The Backtesting Engine generates comprehensive performance metrics:

Returns:

  • Total return (%)

  • Annualized return (%)

  • Cumulative returns by period

Risk-Adjusted:

  • Sharpe ratio (excess return/volatility)

  • Sortino ratio (excess return/downside volatility)

  • Calmar ratio (annualized return/maximum drawdown)

  • Information ratio

Drawdown:

  • Maximum drawdown (%)

  • Average drawdown (%)

  • Maximum drawdown duration (days)

  • Recovery time

Trading:

  • Total trades

  • Win rate (%)

  • Profit factor (gross profit/gross loss)

  • Average win/loss

  • Longest winning/losing streak

Exposure:

  • Time in market (%)

  • Average position size

  • Maximum position size
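For readers who want to see how a few of these metrics are defined, the Python sketch below computes them from an equity curve and a list of per-trade P&L values. These are the common textbook definitions; the engine's exact conventions (annualization factor, risk-free rate, sample vs. population variance) are not stated in this page, so treat the defaults here as assumptions.

```python
import math


def total_return(equity):
    return equity[-1] / equity[0] - 1


def period_returns(equity):
    return [b / a - 1 for a, b in zip(equity, equity[1:])]


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized mean excess return over sample standard deviation.

    Assumes at least two returns that are not all identical.
    """
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((r - mean) ** 2 for r in excess) / (len(excess) - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def win_rate(trade_pnls):
    return sum(p > 0 for p in trade_pnls) / len(trade_pnls)


def profit_factor(trade_pnls):
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    return gross_profit / gross_loss if gross_loss else math.inf


equity = [100, 110, 99, 120]
pnls = [50, -20, 30, -10]
print(total_return(equity), max_drawdown(equity))
print(sharpe_ratio(period_returns(equity)))
print(win_rate(pnls), profit_factor(pnls))
```

For the sample data the equity curve peaks at 110 and falls to 99, a 10% maximum drawdown, while the trades show a 50% win rate with a profit factor of 80/30.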

Equity Curve

Visualize performance over time:

Shows:

  • Equity growth over time

  • Drawdown periods highlighted

  • Major wins and losses marked

Trade Analysis

Examine individual trades:

View:

  • Entry/exit prices and times

  • Holding period

  • P&L per trade

  • Trade metadata

Best Practices

Start Simple

Begin with a single backtest on a small date range and limited symbols:

Verify the backtest runs correctly before expanding scope.

Realistic Costs

Always include commissions and slippage:

Backtests without costs are overly optimistic and won't match live performance.
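A short worked example shows how quickly costs erode a small edge. The Python sketch below prices one long round trip; `net_pnl` and `slippage_bps` are hypothetical names chosen for this illustration, while the 0.001 commission rate mirrors the Quick Start descriptor. The cost model (proportional commission on each fill, symmetric slippage) is an assumption, not the engine's fill logic.

```python
def net_pnl(entry_price, exit_price, quantity,
            commission_rate=0.001, slippage_bps=5):
    """Return (gross, net) P&L for one long round trip."""
    slip = slippage_bps / 10_000
    buy_fill = entry_price * (1 + slip)    # pay slightly more when buying
    sell_fill = exit_price * (1 - slip)    # receive slightly less when selling
    gross = (exit_price - entry_price) * quantity
    commission = commission_rate * quantity * (buy_fill + sell_fill)
    net = (sell_fill - buy_fill) * quantity - commission
    return gross, net


gross, net = net_pnl(entry_price=100.0, exit_price=101.0, quantity=100)
print(f"gross={gross:.2f} net={net:.2f}")
```

With these assumptions a 1% move earns 100.00 gross but only about 69.85 net, so roughly 30% of the profit goes to costs. Strategies with many small trades are the most sensitive to this.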

Multiple Time Periods

Test across different market conditions:

Strategies that work across different regimes are more robust.

Avoid Overfitting

Red flags for overfitting:

  • Too many parameters relative to data

  • Perfect or near-perfect backtest performance

  • Large performance gap between train and test periods

  • Highly sensitive to small parameter changes

Mitigation:

  • Use out-of-sample validation

  • Prefer simpler strategies

  • Test on multiple time periods

  • Verify results with walk-forward analysis

Validate with Paper Trading

After backtesting, always paper trade before going live:

  1. Run backtest to validate strategy

  2. Deploy to paper trading environment

  3. Compare paper trading results to backtest

  4. Investigate discrepancies

  5. Proceed to live trading only after paper results are consistent with the backtest

Advanced Features

Custom Metrics

Define your own performance metrics:

Monte Carlo Simulation

Test strategy robustness with randomized scenarios:
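One common way to build randomized scenarios is to reshuffle the order of historical trade returns and look at the spread of outcomes. The Python sketch below does this for maximum drawdown. It illustrates the general technique under that assumption and is not a description of the engine's Monte Carlo feature; `monte_carlo_drawdowns` is a hypothetical helper.

```python
import random


def monte_carlo_drawdowns(trade_returns, n_sims=1000, seed=42):
    """Shuffle trade order n_sims times; summarize max drawdown."""
    rng = random.Random(seed)  # seeded so results are reproducible
    drawdowns = []
    for _ in range(n_sims):
        shuffled = rng.sample(trade_returns, len(trade_returns))
        equity, peak, worst = 1.0, 1.0, 0.0
        for r in shuffled:
            equity *= 1 + r
            peak = max(peak, equity)
            worst = max(worst, 1 - equity / peak)
        drawdowns.append(worst)
    drawdowns.sort()
    return {"median": drawdowns[len(drawdowns) // 2],
            "p95": drawdowns[int(0.95 * len(drawdowns))]}


trade_returns = [0.02, -0.01, 0.03, -0.02, 0.01, -0.015]
summary = monte_carlo_drawdowns(trade_returns)
print(summary)
```

Reordering leaves the final return unchanged but varies the path, so the 95th-percentile drawdown indicates how much worse the ride could have been with the same trades in a different sequence.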

Strategy Comparison

Compare multiple strategies side-by-side:

Limitations and Considerations

Survivorship Bias

Backtests use current symbol lists. Delisted companies aren't included, which can inflate returns.

Lookahead Bias

Ensure your algo doesn't use future information. The Backtesting Engine prevents this by default, but custom logic should be carefully reviewed.
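When reviewing custom logic, a useful check is that the decision made at a given bar does not change when that bar's own price (or any later price) changes. The Python sketch below shows a momentum signal written that way; `momentum_signals` is a hypothetical example, not part of the Algo Container API.

```python
def momentum_signals(prices, lookback):
    """Signal for bar t uses only prices strictly before bar t."""
    signals = []
    for t in range(len(prices)):
        if t < lookback + 1:
            signals.append(0)  # not enough history yet
            continue
        past = prices[:t]      # excludes prices[t] and everything after it
        momentum = past[-1] / past[-1 - lookback] - 1
        signals.append(1 if momentum > 0 else 0)
    return signals


prices = [10, 11, 12, 11, 9]
signals = momentum_signals(prices, lookback=2)
print(signals)

# Lookahead check: changing the last price must not change the last signal.
tampered = prices[:-1] + [100]
print(momentum_signals(tampered, lookback=2) == signals)
```

A version that sliced `prices[: t + 1]` instead would pass most casual tests yet trade on information that is not available until the bar closes.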

Microstructure

Tick replay approximates market microstructure but cannot replicate it perfectly. Large orders in particular may fill differently in live trading than in a backtest.

Data Quality

Backtest quality depends on data quality. Use high-quality data providers for accurate results.

Troubleshooting

Backtest never completes

  • Check date range isn't too large

  • Verify symbols exist in dataset

  • Review algo logs for errors

Unrealistic results

  • Verify commission and slippage are included

  • Check for lookahead bias in algo logic

  • Ensure orders use realistic limit prices

Different results each run

  • Check if algo uses randomness without seeding

  • Verify data hasn't changed

  • Review any non-deterministic logic
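If the algo relies on randomness, giving it a private, explicitly seeded generator is usually enough to make runs repeatable. The Python sketch below shows the pattern; `RandomEntryAlgo` is a hypothetical class used only for illustration.

```python
import random


class RandomEntryAlgo:
    """Hypothetical algo that enters on roughly 10% of ticks at random."""

    def __init__(self, seed):
        # A private generator avoids interference from other code
        # that touches the module-level random state.
        self.rng = random.Random(seed)

    def decide(self, n_ticks):
        return [self.rng.random() < 0.1 for _ in range(n_ticks)]


run_a = RandomEntryAlgo(seed=7).decide(50)
run_b = RandomEntryAlgo(seed=7).decide(50)
print(run_a == run_b)
```

Recording the seed alongside each backtest result makes it possible to reproduce a specific run later.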
