Algo Descriptors

Algo descriptors define the algorithms that will run in your Datafye environment. They specify the container image, parameters, resource requirements, and optimization space for your trading strategies.

Purpose

Algo descriptors tell Datafye:

  • What container image hosts your algo

  • What parameters configure your algo's behavior

  • What parameter space to explore during optimization

  • What resources your algo needs (CPU, memory, GPU)

  • What inputs your algo consumes (data feeds, signals)

  • What outputs your algo produces (signals, metrics, trades)

Based on your algo descriptor, Datafye provisions the Algo Container runtime and configures it to execute your strategy.

When You Need Algo Descriptors

Algo descriptors are required for Full Stack scenarios only:

  • Foundry: Full Stack — Defines algos for backtesting and optimization

  • Trading: Full Stack — Configures algos for paper and live trading

For Data Cloud Only and Data Cloud + Broker scenarios, you provide your own containers and don't use algo descriptors.

Structure


A typical algo descriptor includes:

Basic Configuration

Algo Name

A unique identifier for your algo:
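For example, a minimal sketch (the name field shown here is illustrative; the exact descriptor schema may differ):

```yaml
# Illustrative only; exact field name may differ
name: momentum-v2
```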

Best practices:

  • Use descriptive names that indicate strategy type

  • Include version in name for clarity (momentum-v2)

  • Use lowercase with hyphens (my-algo, not MyAlgo or my_algo)

Container Image

The Docker image containing your algo implementation:
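For example, a sketch of how the image might be referenced (the image field is an assumption, not a confirmed schema):

```yaml
# Illustrative only; exact field name may differ
image: myuser/momentum:v1.2.0
```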

Image requirements:

  • Built with Datafye Algo Container SDK

  • Implements required SDK interfaces

  • Includes all dependencies

  • Tagged with a specific version (not latest)

Registry options:

  • Docker Hub: username/image:tag

  • Private registry: registry.example.com/image:tag

  • [Does Datafye provide a container registry?]

Parameters

Fixed parameter values that configure your algo:
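For illustration, a parameters block might look like this (parameter names and structure are assumptions):

```yaml
# Illustrative sketch; parameter names are examples, not a confirmed schema
parameters:
  lookback_period: 20      # numeric (integer)
  entry_threshold: 1.5     # numeric (float)
  order_type: limit        # string
  enable_shorting: false   # boolean
```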

Parameter types:

  • Numeric: integers and floats

  • Strings: text values

  • Booleans: true/false flags

  • [Arrays/lists supported?]

  • [Nested objects supported?]

Use cases:

  • Strategy configuration (lookback periods, thresholds)

  • Risk management (stop loss, position sizing)

  • Execution settings (order types, time in force)

  • Feature flags (enable/disable components)

Parameter Space (For Optimization)

The parameter space defines ranges for genetic algorithm-based optimization:

Range Specification


  • Continuous ranges: a numeric interval with a minimum and maximum

  • Discrete values: an explicit set of allowed values

  • Categorical parameters: a choice among named options
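A sketch of how these might be written, assuming a parameter_space block keyed by parameter name (the actual schema may differ):

```yaml
# Illustrative sketch; the real parameter_space schema may differ
parameter_space:
  entry_threshold:           # continuous range
    type: float
    min: 0.5
    max: 3.0
  lookback_period:           # discrete values
    type: int
    values: [10, 20, 50, 100]
  order_type:                # categorical parameter
    type: categorical
    values: [market, limit]
```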

Optimization Strategy

The genetic algorithm explores the parameter space to find optimal combinations:
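A sketch of one possible optimization block (population_size and generations are assumed genetic-algorithm settings, not confirmed fields):

```yaml
# Illustrative sketch; field names are assumptions
optimization:
  objective: sharpe_ratio    # one of the objectives listed below
  population_size: 50        # assumed: candidate parameter sets per generation
  generations: 20            # assumed: number of generations to evolve
```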


Optimization objectives:

  • sharpe_ratio — Maximize risk-adjusted returns

  • total_return — Maximize absolute returns

  • sortino_ratio — Maximize downside-adjusted returns

  • calmar_ratio — Maximize return/drawdown ratio

  • [Other objectives?]

  • [Multi-objective optimization supported?]

Constraints:

  • [Can you set constraints like max_drawdown < 0.2?]

  • [Can you constrain parameter relationships?]

  • [Can you set resource limits per backtest?]

Resource Requirements

Specify compute resources your algo needs:
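One possible shape for a resources block (keys and units are illustrative):

```yaml
# Illustrative sketch; keys and units assumed
resources:
  cpu: 2          # CPU cores
  memory: 4Gi     # memory allocation
  gpu: 0          # request 1 or more only if the algo needs a GPU
```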

CPU Allocation

Considerations:

  • More CPUs = faster backtesting (if algo is parallelizable)

  • Optimize based on actual usage

  • [Cost implications of CPU allocation?]

Memory Allocation

Considerations:

  • Depends on data volume and algo complexity

  • Monitor actual usage and adjust

  • Out-of-memory = algo crashes

  • [Memory limits enforced strictly?]

GPU Support

Use cases:

  • Machine learning models

  • Large-scale computations

  • Real-time inference

Considerations:

  • [Are GPUs available in all deployment types?]

  • [Significant cost implications?]

  • Not all algos need GPUs

Inputs and Outputs

Define what data your algo consumes and produces:
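A sketch of how inputs and outputs might be declared (structure and field names are illustrative):

```yaml
# Illustrative sketch; structure and field names assumed
inputs:
  - type: market_data
    symbols: [AAPL, MSFT]    # hypothetical selection
    granularity: 1m
outputs:
  - type: signals
  - type: metrics
```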


Inputs

Input types:

  • Market data (OHLCV, quotes, trades)

  • Signals from other algos (composability)

  • Alternative data (sentiment, fundamentals)

  • [Custom data sources?]

Outputs

Output types:

  • Signals (for other algos or visualization)

  • Trades (actual orders)

  • Metrics (for monitoring and analysis)

  • [Alerts/notifications?]

Example Algo Descriptors

Example 1: Simple Momentum Strategy

Basic momentum algo with fixed parameters:
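A sketch of what this descriptor might look like (all field names are illustrative, not a confirmed schema):

```yaml
# Illustrative sketch only; field names are assumptions
name: simple-momentum-v1
image: myuser/simple-momentum:v1.0.0
parameters:
  lookback_period: 20
  entry_threshold: 1.5
  stop_loss_pct: 0.02
resources:
  cpu: 1
  memory: 2Gi
inputs:
  - type: market_data
outputs:
  - type: trades
  - type: metrics
```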

Example 2: Optimizable Strategy

Momentum strategy with parameter space for optimization:
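A sketch, reusing the same illustrative field names:

```yaml
# Illustrative sketch only; field names are assumptions
name: momentum-optimizable-v1
image: myuser/momentum:v1.1.0
parameter_space:
  lookback_period:
    type: int
    values: [10, 20, 50, 100]
  entry_threshold:
    type: float
    min: 0.5
    max: 3.0
optimization:
  objective: sharpe_ratio
resources:
  cpu: 2
  memory: 4Gi
inputs:
  - type: market_data
outputs:
  - type: trades
  - type: metrics
```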

Example 3: ML-Based Strategy

Machine learning algo requiring GPU:
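A sketch with an illustrative GPU request:

```yaml
# Illustrative sketch only; field names are assumptions
name: ml-signal-model-v1
image: myuser/ml-signal-model:v2.0.0
parameters:
  model_path: /models/signal_model.pt
  inference_batch_size: 256
resources:
  cpu: 4
  memory: 16Gi
  gpu: 1
inputs:
  - type: market_data
outputs:
  - type: signals
```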

Example 4: Composable Strategy

Algo that consumes signals from other algos:
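A sketch, assuming upstream signals can be referenced by algo name (the reference syntax is an assumption):

```yaml
# Illustrative sketch only; field names and signal-reference syntax are assumptions
name: signal-aggregator-v1
image: myuser/signal-aggregator:v1.0.0
parameters:
  min_agreement: 2            # hypothetical: upstream signals that must agree
inputs:
  - type: signals
    source: momentum-v2       # output of another algo (names illustrative)
  - type: signals
    source: ml-signal-model-v1
outputs:
  - type: trades
```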

SDK Integration

Algo descriptors work hand-in-hand with the Datafye Algo Container SDK:

Parameter Access

Your SDK-based algo accesses parameters defined in the descriptor:

Resource Utilization

The SDK ensures your algo stays within resource limits:

Input/Output Handling

The SDK provides APIs for consuming inputs and producing outputs:

Validation

Before provisioning, Datafye validates your algo descriptor:

Schema Validation

  • All required fields present

  • Valid parameter types and values

  • Resource requirements within limits

  • Image reference format correct

Image Validation

  • Container image exists and is accessible

  • Image built with compatible SDK version

  • Image passes basic health checks

  • [Size limits on images?]

Parameter Validation

  • Parameters match algo's expected config

  • Parameter space ranges are valid

  • No conflicting parameter combinations

  • [Does SDK declare expected parameters for validation?]

Resource Validation

  • Resource requests are reasonable

  • Resources available in deployment environment

  • [Maximum resource limits?]

Common validation errors:

  • Container image not found or inaccessible

  • Invalid parameter types

  • Parameter space range errors

  • Resource requests exceed limits

  • [Other common errors?]

Versioning and Updates

Algo Versioning

Use image tags to version your algos:
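For example (image references illustrative):

```yaml
# Pin a specific version; avoid the mutable "latest" tag
image: myuser/momentum:v1.2.0

# For traceability, you could also tag with the git commit hash:
# image: myuser/momentum:3f2a9c1
```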

Best practices:

  • Use semantic versioning (v1.0.0, v1.1.0, v2.0.0)

  • Never use the latest tag in production

  • Tag images with git commit hash for traceability

  • Document breaking changes between versions

Updating Algos

To update an algo in a running environment:

  1. Build new container image with changes

  2. Tag with new version

  3. Update algo descriptor with new image reference

  4. Re-provision or restart environment

[Does Datafye support hot-swapping algos without full reprovision?]

Development Workflow

Local Development

  1. Develop algo using SDK

    • Implement strategy logic

    • Define parameters and inputs/outputs

    • Test locally with sample data

  2. Containerize

    • Write Dockerfile

    • Build the image: docker build -t myrepo/algo:dev .

    • Test container locally

  3. Create descriptor

    • Define parameters and resource requirements

    • Specify parameter space if optimizing

    • Validate descriptor syntax

  4. Test in Datafye

    • Push image to registry

    • Provision Foundry environment

    • Run backtests

    • Iterate on parameters and code

Optimization Workflow

  1. Define parameter space in descriptor

  2. Run genetic algorithm optimization

  3. Review results and scorecards

  4. Select optimal parameters

  5. Update descriptor with optimal values

  6. Validate with out-of-sample backtest

Production Deployment

  1. Finalize parameters based on optimization

  2. Tag stable image version

  3. Update descriptor with production config

  4. Provision Trading environment

  5. Paper trade to validate

  6. Transition to live trading

Best Practices

Descriptor Organization

Keep separate descriptors for different stages:

  • Development (algo-dev.yaml): fixed parameters and minimal resources for fast iteration

  • Optimization (algo-optimize.yaml): defines the parameter space and optimization settings

  • Production (algo-prod.yaml): finalized parameters and a pinned, stable image version
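A combined sketch of how the three files might differ (all field names illustrative, as in the earlier examples):

```yaml
# algo-dev.yaml (illustrative): fast iteration with fixed parameters
name: momentum-dev
image: myuser/momentum:dev
parameters:
  lookback_period: 20
  entry_threshold: 1.5
resources:
  cpu: 1
  memory: 2Gi
---
# algo-optimize.yaml (illustrative): parameter space and optimization settings
name: momentum-optimize
image: myuser/momentum:v1.1.0
parameter_space:
  lookback_period:
    type: int
    values: [10, 20, 50, 100]
  entry_threshold:
    type: float
    min: 0.5
    max: 3.0
optimization:
  objective: sharpe_ratio
resources:
  cpu: 4
  memory: 8Gi
---
# algo-prod.yaml (illustrative): finalized parameters, pinned image version
name: momentum-prod
image: myuser/momentum:v1.2.0
parameters:
  lookback_period: 50
  entry_threshold: 1.2
resources:
  cpu: 2
  memory: 4Gi
```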

Version Control

  • Store descriptors with algo source code

  • Tag descriptor versions with algo releases

  • Document parameter changes in commit messages

  • Use meaningful branch names for experiments

Documentation

Document your algo descriptor:
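One lightweight approach is to annotate the descriptor itself with comments (illustrative):

```yaml
parameters:
  lookback_period: 50    # bars used for the momentum calculation
  entry_threshold: 1.2   # signal strength required before opening a position
  stop_loss_pct: 0.02    # maximum loss per position before exiting
```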

Testing

Before production:

  • Test with minimal resources first

  • Validate parameter ranges make sense

  • Run backtests with various parameter values

  • Verify outputs match expectations

  • Check resource usage matches allocation

Next Steps


Last updated: 2025-10-11
