Using Your Own Algo Container

When using your own containerized algos in a Foundry: Data Cloud Only environment, you drive the backtesting workflow programmatically via REST APIs. This approach gives you complete control over how backtests are executed, allowing you to integrate with your existing testing frameworks and methodologies.

Overview

The backtest workflow in Data Cloud Only foundries consists of four stages:

  1. Fetch - Download historical tick data for your symbols and date range

  2. Reset - Clear environment state from any previous backtest

  3. Replay - Start tick replay, which drives your algo as if it were running live

  4. Analyze - Collect and analyze results using your own metrics

All stages are controlled via REST API calls, giving you complete flexibility in orchestrating the backtest.

Backtest Workflow

Stage 1: Fetch Historical Tick Data

First, fetch the historical tick data you need for your backtest. Tick history includes both trades and quotes.

# Fetch tick history for a single day
curl -X POST "http://localhost:8080/datafye-api/v1/backtest/history/ticks/fetch/start?dataset=SIP&date=2024-12-01&symbols=AAPL,MSFT,GOOGL" \
  -H "Content-Type: application/json"

# Response: {"status": null}  # null status means success

Parameters:

  • dataset - The dataset to fetch (e.g., SIP)

  • date - The date to fetch (YYYY-MM-DD)

  • symbols - Comma-separated list of symbols (optional - fetches all if omitted)

  • numDays - Number of days to fetch (default: 1)

  • sortByExchangeTs - Sort by exchange timestamp (default: true)

  • noTrades - Exclude trades, fetch only quotes (default: false)

  • noQuotes - Exclude quotes, fetch only trades (default: false)
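The curl call above uses only the required parameters; as a sketch, the optional ones can be combined into the same query string in Python (values illustrative):

```python
import urllib.parse

BASE = "http://localhost:8080/datafye-api/v1/backtest"

# Illustrative values: fetch five days of quotes only for two symbols,
# sorted by exchange timestamp.
query = urllib.parse.urlencode({
    "dataset": "SIP",
    "date": "2024-12-01",
    "symbols": "AAPL,MSFT",
    "numDays": 5,
    "sortByExchangeTs": "true",
    "noTrades": "true",   # quotes only
})
fetch_url = f"{BASE}/history/ticks/fetch/start?{query}"
```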

Monitor fetch progress:
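The status route isn't shown on this page; assuming it mirrors the documented `/fetch/start` path as `/fetch/status` and reports a `running` flag (both assumptions), a minimal polling sketch:

```python
import json
import time
import urllib.request

# Assumed endpoint -- mirrors the documented /fetch/start route.
STATUS_URL = ("http://localhost:8080/datafye-api/v1/backtest"
              "/history/ticks/fetch/status")

def wait_for_fetch(poll_secs: float = 5.0) -> dict:
    """Poll fetch status until the API reports it is no longer running."""
    while True:
        with urllib.request.urlopen(STATUS_URL) as resp:
            status = json.loads(resp.read())
        if not status.get("running", False):   # field name assumed
            return status
        time.sleep(poll_secs)
```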

Stop fetch if needed:
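The stop route isn't shown either; assuming a `/fetch/stop` path alongside the documented `/fetch/start`, a sketch:

```python
import urllib.request

# Assumed endpoint -- sibling of the documented /fetch/start route.
STOP_URL = ("http://localhost:8080/datafye-api/v1/backtest"
            "/history/ticks/fetch/stop")
stop_req = urllib.request.Request(STOP_URL, method="POST")

def stop_fetch() -> int:
    """Abort an in-flight tick-history fetch; returns the HTTP status code."""
    with urllib.request.urlopen(stop_req) as resp:
        return resp.status
```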

Fetch Time: Fetching tick history can take several minutes depending on the number of symbols and data volume. Monitor the status endpoint to know when fetching is complete.

Stage 2: Reset Environment State

Before starting a new backtest, clear out any state from previous runs:
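The reset endpoint isn't named on this page; assuming a `/reset` route under the same base path as the fetch example, a sketch:

```python
import urllib.request

# Endpoint path is an assumption; this page only names the reset stage.
RESET_URL = "http://localhost:8080/datafye-api/v1/backtest/reset"
reset_req = urllib.request.Request(RESET_URL, method="POST")

def reset_state() -> int:
    """Clear tick buffers and aggregates before a new backtest run."""
    with urllib.request.urlopen(reset_req) as resp:
        return resp.status
```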

This ensures:

  • All tick buffers are cleared across all datasets

  • Aggregate data is reset

  • Your algo starts with a clean slate

Stage 3: Start Tick Replay

Start replaying the fetched tick history. The replay drives your algo identically to how it would operate in live trading:
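The start call itself isn't shown here; assuming a `/replay/start` route under the same base path as the fetch example, a sketch that combines the parameters listed below (values illustrative):

```python
import urllib.parse
import urllib.request

# Endpoint path is an assumption; the query parameters are the documented ones.
BASE = "http://localhost:8080/datafye-api/v1/backtest"

params = urllib.parse.urlencode({
    "dataset": "SIP",
    "date": "2024-12-01",
    "rateType": "MultipleOfActual",
    "rate": 10,                 # replay at 10x actual speed
    "startTime": "09:30:00",    # start at the open
    "duration": "01:00:00",     # replay the first hour
})
replay_req = urllib.request.Request(f"{BASE}/replay/start?{params}", method="POST")

def start_replay() -> int:
    with urllib.request.urlopen(replay_req) as resp:
        return resp.status
```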

Parameters:

  • dataset - The dataset to replay

  • date - The date to replay (YYYY-MM-DD)

  • rateType - Exact (actual speed) or MultipleOfActual (accelerated)

  • rate - Speed multiplier when rateType=MultipleOfActual (default: 1)

  • startTime - Start time within day (HH:MM:SS, optional)

  • duration - Replay duration (HH:MM:SS, optional)

Monitor replay:
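Assuming a `/replay/status` route (not confirmed on this page), a status check:

```python
import json
import urllib.request

# Assumed endpoint path.
REPLAY_STATUS_URL = "http://localhost:8080/datafye-api/v1/backtest/replay/status"

def replay_status() -> dict:
    """Return the current replay status as parsed JSON."""
    with urllib.request.urlopen(REPLAY_STATUS_URL) as resp:
        return json.loads(resp.read())
```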

Stop replay:
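Assuming a `/replay/stop` route (not confirmed on this page):

```python
import urllib.request

# Assumed endpoint path.
STOP_REPLAY_URL = "http://localhost:8080/datafye-api/v1/backtest/replay/stop"
stop_replay_req = urllib.request.Request(STOP_REPLAY_URL, method="POST")

def stop_replay() -> int:
    """Halt the running replay; returns the HTTP status code."""
    with urllib.request.urlopen(stop_replay_req) as resp:
        return resp.status
```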

Stage 4: Analyze Results

While replay is running (or after it completes), collect and analyze backtest results:

  • Track orders/fills - Monitor your algo's trading activity

  • Calculate P&L - Compute profit and loss from executed trades

  • Measure performance - Calculate metrics like Sharpe ratio, max drawdown

  • Export data - Save results for further analysis

Since you're using your own container, you implement your own result collection and analysis logic.
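As one illustration of this stage, a minimal mark-to-close P&L over a list of fills (the fill format here is invented; yours will come from your own container):

```python
from dataclasses import dataclass

@dataclass
class Fill:
    symbol: str
    qty: int      # positive = buy, negative = sell
    price: float

def mark_to_close_pnl(fills: list[Fill], closing_price: dict[str, float]) -> float:
    """Cash P&L of all fills, with any open position marked to the close."""
    pnl = 0.0
    for f in fills:
        pnl += f.qty * (closing_price[f.symbol] - f.price)
    return pnl
```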

Example: Complete Backtest Script

Here's a Python script that orchestrates a complete backtest:
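A sketch of such a script, following the four stages. Every endpoint other than the documented `/history/ticks/fetch/start`, and the `running` status field, are assumptions; adjust them to your deployment:

```python
#!/usr/bin/env python3
"""Sketch: drive a complete backtest via the Data Cloud Only REST API."""
import json
import time
import urllib.parse
import urllib.request

BASE = "http://localhost:8080/datafye-api/v1/backtest"

def post(path: str, **params) -> dict:
    url = f"{BASE}{path}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    req = urllib.request.Request(url, method="POST")
    with urllib.request.urlopen(req) as resp:
        body = resp.read()
    return json.loads(body) if body else {}

def get(path: str) -> dict:
    with urllib.request.urlopen(f"{BASE}{path}") as resp:
        return json.loads(resp.read())

def wait(path: str, poll_secs: float = 5.0) -> None:
    # Poll until the stage reports it is no longer running (field assumed).
    while get(path).get("running", False):
        time.sleep(poll_secs)

def run_backtest(dataset: str, date: str, symbols: str, rate: int = 10) -> None:
    # Stage 1: fetch tick history (documented endpoint)
    post("/history/ticks/fetch/start", dataset=dataset, date=date, symbols=symbols)
    wait("/history/ticks/fetch/status")          # assumed endpoint
    # Stage 2: reset environment state (assumed endpoint)
    post("/reset")
    # Stage 3: replay at an accelerated rate (assumed endpoint)
    post("/replay/start", dataset=dataset, date=date,
         rateType="MultipleOfActual", rate=rate)
    wait("/replay/status")                        # assumed endpoint
    # Stage 4: analysis is your own logic -- e.g. pull fills from your algo here

# Example invocation:
# run_backtest("SIP", "2024-12-01", "AAPL,MSFT,GOOGL")
```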

Best Practices

Iterate on Speed

Start with faster replay speeds (10x, 50x) for quick iteration, then run final validation at actual speed (Exact).

Fetch Once, Test Many

Fetch tick history once, then run multiple backtests with different parameters by:

  1. Resetting state

  2. Updating your algo configuration

  3. Replaying the same data
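A minimal sweep along these lines, with the reset and replay endpoint paths assumed (for brevity it does not wait for each replay to finish):

```python
import urllib.parse
import urllib.request

BASE = "http://localhost:8080/datafye-api/v1/backtest"

def post(path: str, **params) -> None:
    url = f"{BASE}{path}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    urllib.request.urlopen(urllib.request.Request(url, method="POST"))

def sweep(rates: list[int]) -> None:
    """Re-run the same fetched day at several replay speeds."""
    for rate in rates:
        post("/reset")                                   # 1. reset state
        # 2. update your algo's configuration here (container-specific)
        post("/replay/start", dataset="SIP", date="2024-12-01",
             rateType="MultipleOfActual", rate=rate)     # 3. replay same data
```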

Monitor Memory

Tick replay can be memory-intensive. Monitor your container's memory usage and adjust resources if needed.

Validate Against Live

After backtesting, run paper trading to validate that backtest behavior matches live behavior.

Save Configurations

Document the exact parameters (symbols, dates, replay speed) used for each backtest for reproducibility.

Limitations

Single Backtest at a Time

Unlike Full Stack foundries, Data Cloud Only environments run one backtest at a time. For parallelized backtesting, use Foundry: Full Stack.

Manual Orchestration

You're responsible for managing the backtest workflow. For automated backtesting with built-in optimization, use Foundry: Full Stack.

Custom Metrics

You must implement your own performance metrics and scorecarding. For comprehensive built-in metrics, use Foundry: Full Stack.

Troubleshooting

Fetch times out

  • Reduce number of symbols or date range

  • Check data provider credentials

  • Verify sufficient disk space

Replay doesn't start

  • Ensure fetch completed successfully

  • Verify state was cleared

  • Check that date matches fetched data

Algo doesn't receive ticks

  • Verify algo is subscribed to the correct symbols

  • Check that replay is actually running

  • Review algo logs for connection issues
