Using Your Own Algo Container

When using your own containerized algos in a Foundry: Data Cloud Only environment, you drive the backtesting workflow programmatically via REST APIs. This approach gives you complete control over how backtests are executed, allowing you to integrate with your existing testing frameworks and methodologies.

Overview

The backtest workflow in Data Cloud Only foundries consists of four stages:

  1. Fetch - Download historical tick data for your symbols and date range

  2. Reset - Clear environment state from any previous backtest

  3. Replay - Start tick replay, which drives your algo as if it were running live

  4. Analyze - Collect and analyze results using your own metrics

All stages are controlled via REST API calls, giving you complete flexibility in orchestrating the backtest.

Backtest Workflow

Stage 1: Fetch Historical Tick Data

First, fetch the historical tick data you need for your backtest. Tick history includes both trades and quotes.

# Fetch tick history for a single day
curl -X POST "http://localhost:8080/datafye-api/v1/backtest/history/ticks/fetch/start?dataset=SIP&date=2024-12-01&symbols=AAPL,MSFT,GOOGL" \
  -H "Content-Type: application/json"

# Response: {"status": null}  # null status means success

Parameters:

  • dataset - The dataset to fetch (e.g., SIP)

  • date - The date to fetch (YYYY-MM-DD)

  • symbols - Comma-separated list of symbols (optional - fetches all if omitted)

  • numDays - Number of days to fetch (default: 1)

  • sortByExchangeTs - Sort by exchange timestamp (default: true)

  • noTrades - Exclude trades, fetch only quotes (default: false)

  • noQuotes - Exclude quotes, fetch only trades (default: false)
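The curl call above uses only the required parameters; as a sketch, the optional ones can be combined into the same query string in Python (values illustrative):

```python
import urllib.parse

BASE = "http://localhost:8080/datafye-api/v1/backtest"

# Illustrative values: fetch five days of quotes only for two symbols,
# sorted by exchange timestamp.
query = urllib.parse.urlencode({
    "dataset": "SIP",
    "date": "2024-12-01",
    "symbols": "AAPL,MSFT",
    "numDays": 5,
    "sortByExchangeTs": "true",
    "noTrades": "true",   # quotes only
})
fetch_url = f"{BASE}/history/ticks/fetch/start?{query}"
```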

Monitor fetch progress:
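The status route isn't shown on this page; assuming it mirrors the documented `/fetch/start` path as `/fetch/status` and reports a `running` flag (both assumptions), a minimal polling sketch:

```python
import json
import time
import urllib.request

# Assumed endpoint -- mirrors the documented /fetch/start route.
STATUS_URL = ("http://localhost:8080/datafye-api/v1/backtest"
              "/history/ticks/fetch/status")

def wait_for_fetch(poll_secs: float = 5.0) -> dict:
    """Poll fetch status until the API reports it is no longer running."""
    while True:
        with urllib.request.urlopen(STATUS_URL) as resp:
            status = json.loads(resp.read())
        if not status.get("running", False):   # field name assumed
            return status
        time.sleep(poll_secs)
```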

Stop fetch if needed:
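The stop route isn't shown either; assuming a `/fetch/stop` path alongside the documented `/fetch/start`, a sketch:

```python
import urllib.request

# Assumed endpoint -- sibling of the documented /fetch/start route.
STOP_URL = ("http://localhost:8080/datafye-api/v1/backtest"
            "/history/ticks/fetch/stop")
stop_req = urllib.request.Request(STOP_URL, method="POST")

def stop_fetch() -> int:
    """Abort an in-flight tick-history fetch; returns the HTTP status code."""
    with urllib.request.urlopen(stop_req) as resp:
        return resp.status
```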

Fetch Time: Fetching tick history can take several minutes depending on the number of symbols and data volume. Monitor the status endpoint to know when fetching is complete.

Stage 2: Reset Environment State

Before starting a new backtest, clear out any state from previous runs:
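The reset endpoint isn't named on this page; assuming a `/reset` route under the same base path as the fetch example, a sketch:

```python
import urllib.request

# Endpoint path is an assumption; this page only names the reset stage.
RESET_URL = "http://localhost:8080/datafye-api/v1/backtest/reset"
reset_req = urllib.request.Request(RESET_URL, method="POST")

def reset_state() -> int:
    """Clear tick buffers and aggregates before a new backtest run."""
    with urllib.request.urlopen(reset_req) as resp:
        return resp.status
```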

This ensures:

  • All tick buffers are cleared across all datasets

  • Aggregate data is reset

  • Your algo starts with a clean slate

Stage 3: Start Tick Replay

Start replaying the fetched tick history. The replay drives your algo identically to how it would operate in live trading:
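The start call itself isn't shown here; assuming a `/replay/start` route under the same base path as the fetch example, a sketch that combines the parameters listed below (values illustrative):

```python
import urllib.parse
import urllib.request

# Endpoint path is an assumption; the query parameters are the documented ones.
BASE = "http://localhost:8080/datafye-api/v1/backtest"

params = urllib.parse.urlencode({
    "dataset": "SIP",
    "date": "2024-12-01",
    "rateType": "MultipleOfActual",
    "rate": 10,                 # replay at 10x actual speed
    "startTime": "09:30:00",    # start at the open
    "duration": "01:00:00",     # replay the first hour
})
replay_req = urllib.request.Request(f"{BASE}/replay/start?{params}", method="POST")

def start_replay() -> int:
    with urllib.request.urlopen(replay_req) as resp:
        return resp.status
```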

Parameters:

  • dataset - The dataset to replay

  • date - The date to replay (YYYY-MM-DD)

  • rateType - Exact (actual speed) or MultipleOfActual (accelerated)

  • rate - Speed multiplier when rateType=MultipleOfActual (default: 1)

  • startTime - Start time within day (HH:MM:SS, optional)

  • duration - Replay duration (HH:MM:SS, optional)

Monitor replay:
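Assuming a `/replay/status` route (not confirmed on this page), a status check:

```python
import json
import urllib.request

# Assumed endpoint path.
REPLAY_STATUS_URL = "http://localhost:8080/datafye-api/v1/backtest/replay/status"

def replay_status() -> dict:
    """Return the current replay status as parsed JSON."""
    with urllib.request.urlopen(REPLAY_STATUS_URL) as resp:
        return json.loads(resp.read())
```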

Stop replay:
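Assuming a `/replay/stop` route (not confirmed on this page):

```python
import urllib.request

# Assumed endpoint path.
STOP_REPLAY_URL = "http://localhost:8080/datafye-api/v1/backtest/replay/stop"
stop_replay_req = urllib.request.Request(STOP_REPLAY_URL, method="POST")

def stop_replay() -> int:
    """Halt the running replay; returns the HTTP status code."""
    with urllib.request.urlopen(stop_replay_req) as resp:
        return resp.status
```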

Stage 4: Analyze Results

While replay is running (or after it completes), collect and analyze backtest results:

  • Track orders/fills - Monitor your algo's trading activity

  • Calculate P&L - Compute profit and loss from executed trades

  • Measure performance - Calculate metrics like Sharpe ratio, max drawdown

  • Export data - Save results for further analysis

Since you're using your own container, you implement your own result collection and analysis logic.
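As one illustration of this stage, a minimal mark-to-close P&L over a list of fills (the fill format here is invented; yours will come from your own container):

```python
from dataclasses import dataclass

@dataclass
class Fill:
    symbol: str
    qty: int      # positive = buy, negative = sell
    price: float

def mark_to_close_pnl(fills: list[Fill], closing_price: dict[str, float]) -> float:
    """Cash P&L of all fills, with any open position marked to the close."""
    pnl = 0.0
    for f in fills:
        pnl += f.qty * (closing_price[f.symbol] - f.price)
    return pnl
```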

Example: Complete Backtest Script

Here's a Python script that orchestrates a complete backtest:
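A sketch of such a script, following the four stages. Every endpoint other than the documented `/history/ticks/fetch/start`, and the `running` status field, are assumptions; adjust them to your deployment:

```python
#!/usr/bin/env python3
"""Sketch: drive a complete backtest via the Data Cloud Only REST API."""
import json
import time
import urllib.parse
import urllib.request

BASE = "http://localhost:8080/datafye-api/v1/backtest"

def post(path: str, **params) -> dict:
    url = f"{BASE}{path}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    req = urllib.request.Request(url, method="POST")
    with urllib.request.urlopen(req) as resp:
        body = resp.read()
    return json.loads(body) if body else {}

def get(path: str) -> dict:
    with urllib.request.urlopen(f"{BASE}{path}") as resp:
        return json.loads(resp.read())

def wait(path: str, poll_secs: float = 5.0) -> None:
    # Poll until the stage reports it is no longer running (field assumed).
    while get(path).get("running", False):
        time.sleep(poll_secs)

def run_backtest(dataset: str, date: str, symbols: str, rate: int = 10) -> None:
    # Stage 1: fetch tick history (documented endpoint)
    post("/history/ticks/fetch/start", dataset=dataset, date=date, symbols=symbols)
    wait("/history/ticks/fetch/status")          # assumed endpoint
    # Stage 2: reset environment state (assumed endpoint)
    post("/reset")
    # Stage 3: replay at an accelerated rate (assumed endpoint)
    post("/replay/start", dataset=dataset, date=date,
         rateType="MultipleOfActual", rate=rate)
    wait("/replay/status")                        # assumed endpoint
    # Stage 4: analysis is your own logic -- e.g. pull fills from your algo here

# Example invocation:
# run_backtest("SIP", "2024-12-01", "AAPL,MSFT,GOOGL")
```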

Best Practices

Iterate on Speed

Start with faster replay speeds (10x, 50x) for quick iteration, then run final validation at actual speed (Exact).

Fetch Once, Test Many

Fetch tick history once, then run multiple backtests with different parameters by:

  1. Resetting state

  2. Updating your algo configuration

  3. Replaying the same data
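A minimal sweep along these lines, with the reset and replay endpoint paths assumed (for brevity it does not wait for each replay to finish):

```python
import urllib.parse
import urllib.request

BASE = "http://localhost:8080/datafye-api/v1/backtest"

def post(path: str, **params) -> None:
    url = f"{BASE}{path}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    urllib.request.urlopen(urllib.request.Request(url, method="POST"))

def sweep(rates: list[int]) -> None:
    """Re-run the same fetched day at several replay speeds."""
    for rate in rates:
        post("/reset")                                   # 1. reset state
        # 2. update your algo's configuration here (container-specific)
        post("/replay/start", dataset="SIP", date="2024-12-01",
             rateType="MultipleOfActual", rate=rate)     # 3. replay same data
```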

Monitor Memory

Tick replay can be memory-intensive. Monitor your container's memory usage and adjust resources if needed.

Validate Against Live

After backtesting, run paper trading to validate that backtest behavior matches live behavior.

Save Configurations

Document the exact parameters (symbols, dates, replay speed) used for each backtest for reproducibility.

Limitations

Single Backtest at a Time

Unlike Full Stack foundries, Data Cloud Only environments run one backtest at a time. For parallelized backtesting, use Foundry: Full Stack.

Manual Orchestration

You're responsible for managing the backtest workflow. For automated backtesting with built-in optimization, use Foundry: Full Stack.

Custom Metrics

You must implement your own performance metrics and scorecarding. For comprehensive built-in metrics, use Foundry: Full Stack.

Troubleshooting

Fetch times out

  • Reduce number of symbols or date range

  • Check data provider credentials

  • Verify sufficient disk space

Replay doesn't start

  • Ensure fetch completed successfully

  • Verify state was cleared

  • Check that date matches fetched data

Algo doesn't receive ticks

  • Verify algo is subscribed to the correct symbols

  • Check that replay is actually running

  • Review algo logs for connection issues
