Algo Descriptors
Algo descriptors define the algorithms that will run in your Datafye environment. They specify the container image, parameters, resource requirements, and optimization space for your trading strategies.
Purpose
Algo descriptors tell Datafye:
What container image hosts your algo
What parameters configure your algo's behavior
What parameter space to explore during optimization
What resources your algo needs (CPU, memory, GPU)
What inputs your algo consumes (data feeds, signals)
What outputs your algo produces (signals, metrics, trades)
Based on your algo descriptor, Datafye provisions the Algo Container runtime and configures it to execute your strategy.
When You Need Algo Descriptors
Algo descriptors are required for Full Stack scenarios only:
Foundry: Full Stack — Defines algos for backtesting and optimization
Trading: Full Stack — Configures algos for paper and live trading
For Data Cloud Only and Data Cloud + Broker scenarios, you provide your own containers and don't use algo descriptors.
Structure
TODO: Provide actual YAML/JSON schema for algo descriptors with complete field definitions.
A typical algo descriptor includes:
Basic Configuration
Algo Name
A unique identifier for your algo:
Best practices:
Use descriptive names that indicate strategy type
Include version in name for clarity (`momentum-v2`)
Use lowercase with hyphens (`my-algo`, not `MyAlgo` or `my_algo`)
Container Image
The Docker image containing your algo implementation:
Image requirements:
Built with Datafye Algo Container SDK
Implements required SDK interfaces
Includes all dependencies
Tagged with a specific version (not `latest`)
Registry options:
Docker Hub: `username/image:tag`
Private registry: `registry.example.com/image:tag`
[Does Datafye provide a container registry?]
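The exact descriptor schema is still a TODO above, so purely as a sketch, the basic configuration fields might look like the following; the field names (`name`, `image`) are illustrative assumptions, not the confirmed Datafye schema:

```yaml
# Hypothetical descriptor fragment -- field names are illustrative only
name: momentum-v2
image: registry.example.com/trading/momentum:1.2.0   # pinned tag, never "latest"
```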
Parameters
Fixed parameter values that configure your algo:
Parameter types:
Numeric: integers and floats
Strings: text values
Booleans: true/false flags
[Arrays/lists supported?]
[Nested objects supported?]
Use cases:
Strategy configuration (lookback periods, thresholds)
Risk management (stop loss, position sizing)
Execution settings (order types, time in force)
Feature flags (enable/disable components)
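As a hypothetical sketch of how such a block could be laid out (the key names and the `parameters` field itself are assumptions pending the real schema):

```yaml
# Hypothetical "parameters" block -- names and types are illustrative
parameters:
  lookback_days: 20        # numeric (integer): strategy configuration
  stop_loss_pct: 0.02      # numeric (float): risk management
  order_type: "limit"      # string: execution setting
  enable_shorting: false   # boolean: feature flag
```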
Parameter Space (For Optimization)
The parameter space defines ranges for genetic algorithm-based optimization:
Range Specification
TODO: Document exact syntax for parameter space ranges, step sizes, discrete vs. continuous, and any constraints.
Continuous ranges:
Discrete values:
Categorical parameters:
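Since the exact range syntax is a TODO above, the following is only an illustrative guess at how continuous, discrete, and categorical specifications might be expressed; every key name here is an assumption:

```yaml
# Hypothetical parameter-space syntax -- the real syntax is not yet documented
parameter_space:
  entry_threshold:
    type: continuous          # continuous range
    min: 0.5
    max: 3.0
  lookback_days:
    type: discrete            # discrete values
    values: [10, 20, 50, 100]
  order_type:
    type: categorical         # categorical parameter
    values: ["market", "limit"]
```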
Optimization Strategy
The genetic algorithm explores the parameter space to find optimal combinations:
TODO: Document optimization configuration options, objectives, and constraints.
Optimization objectives:
`sharpe_ratio` — Maximize risk-adjusted returns
`total_return` — Maximize absolute returns
`sortino_ratio` — Maximize downside-adjusted returns
`calmar_ratio` — Maximize return/drawdown ratio
[Other objectives?]
[Multi-objective optimization supported?]
Constraints:
[Can you set constraints like max_drawdown < 0.2?]
[Can you constrain parameter relationships?]
[Can you set resource limits per backtest?]
Resource Requirements
Specify compute resources your algo needs:
CPU Allocation
Considerations:
More CPUs = faster backtesting (if algo is parallelizable)
Optimize based on actual usage
[Cost implications of CPU allocation?]
Memory Allocation
Considerations:
Depends on data volume and algo complexity
Monitor actual usage and adjust
Out-of-memory = algo crashes
[Memory limits enforced strictly?]
GPU Support
Use cases:
Machine learning models
Large-scale computations
Real-time inference
Considerations:
[Are GPUs available in all deployment types?]
[Significant cost implications?]
Not all algos need GPUs
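A hypothetical resources block, loosely modeled on Kubernetes-style resource requests, might look like this (field names and units are assumptions):

```yaml
# Hypothetical "resources" block -- illustrative only
resources:
  cpu: 4          # cores; extra CPUs help only if the algo parallelizes
  memory: 8Gi     # undersizing risks out-of-memory crashes
  gpu: 0          # request a GPU only for ML or large-scale workloads
```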
Inputs and Outputs
Define what data your algo consumes and produces:
TODO: Document the input/output specification format and how it integrates with the SDK.
Inputs
Input types:
Market data (OHLCV, quotes, trades)
Signals from other algos (composability)
Alternative data (sentiment, fundamentals)
[Custom data sources?]
Outputs
Output types:
Signals (for other algos or visualization)
Trades (actual orders)
Metrics (for monitoring and analysis)
[Alerts/notifications?]
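The input/output specification format is a TODO above; as a sketch of one plausible shape (all keys here are illustrative assumptions):

```yaml
# Hypothetical inputs/outputs spec -- the real format is not yet documented
inputs:
  - type: market_data
    symbols: ["AAPL", "MSFT"]
    granularity: 1m              # OHLCV bars
  - type: signal
    source: regime-detector-v1   # composability: another algo's output
outputs:
  - type: signal
    name: momentum_score
  - type: metrics
```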
Example Algo Descriptors
Example 1: Simple Momentum Strategy
Basic momentum algo with fixed parameters:
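A hypothetical descriptor for this case might look like the following; the overall schema is an assumption pending the official reference:

```yaml
# Illustrative sketch -- schema not confirmed
name: simple-momentum-v1
image: myrepo/simple-momentum:1.0.0
parameters:
  lookback_days: 20
  entry_threshold: 1.5
resources:
  cpu: 2
  memory: 4Gi
inputs:
  - type: market_data
    symbols: ["SPY"]
outputs:
  - type: trades
```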
Example 2: Optimizable Strategy
Momentum strategy with parameter space for optimization:
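A sketch of the same strategy with a parameter space and an optimization objective added (again, all field names are illustrative assumptions):

```yaml
# Illustrative sketch -- schema not confirmed
name: momentum-optimizable-v1
image: myrepo/momentum:1.1.0
parameter_space:
  lookback_days:
    type: discrete
    values: [10, 20, 50]
  entry_threshold:
    type: continuous
    min: 0.5
    max: 3.0
optimization:
  objective: sharpe_ratio
resources:
  cpu: 8
  memory: 16Gi
```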
Example 3: ML-Based Strategy
Machine learning algo requiring GPU:
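A hypothetical sketch for a GPU-backed ML strategy (the `gpu` request and model-related parameters are illustrative):

```yaml
# Illustrative sketch -- schema not confirmed
name: ml-alpha-v1
image: myrepo/ml-alpha:2.0.0
parameters:
  model_path: /models/alpha.onnx
  inference_batch_size: 256
resources:
  cpu: 8
  memory: 16Gi
  gpu: 1
outputs:
  - type: signal
    name: alpha_score
```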
Example 4: Composable Strategy
Algo that consumes signals from other algos:
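A sketch of a composed strategy whose inputs reference the outputs of the earlier examples (the `source` wiring syntax is an assumption):

```yaml
# Illustrative sketch -- schema not confirmed
name: ensemble-v1
image: myrepo/ensemble:1.0.0
inputs:
  - type: signal
    source: simple-momentum-v1   # consumes another algo's signal output
  - type: signal
    source: ml-alpha-v1
outputs:
  - type: trades
```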
SDK Integration
Algo descriptors work hand-in-hand with the Datafye Algo Container SDK:
Parameter Access
Your SDK-based algo accesses parameters defined in the descriptor:
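The SDK's actual parameter API is not documented here, so the following is only a minimal sketch that assumes the runtime injects descriptor parameters as JSON in an environment variable named `ALGO_PARAMETERS`; both the variable name and the injection mechanism are assumptions:

```python
# Sketch only: assumes parameters arrive as JSON in the ALGO_PARAMETERS
# environment variable -- the real SDK mechanism may differ.
import json
import os


def load_parameters() -> dict:
    """Parse descriptor parameters from the (assumed) injection point."""
    raw = os.environ.get("ALGO_PARAMETERS", "{}")
    return json.loads(raw)


if __name__ == "__main__":
    # Simulate what the runtime would inject for local testing
    os.environ["ALGO_PARAMETERS"] = '{"lookback_days": 20, "entry_threshold": 1.5}'
    params = load_parameters()
    print(params["lookback_days"])  # -> 20
```

Whatever the real mechanism turns out to be, keeping parameter access behind one small function like this makes it easy to swap in the official SDK call later.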
Resource Utilization
The SDK ensures your algo stays within resource limits:
Input/Output Handling
The SDK provides APIs for consuming inputs and producing outputs:
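The real SDK I/O API is not documented here; as a purely illustrative sketch, an interface in this spirit, with an in-memory stand-in for local testing, might look like the following (every class and method name is a placeholder, not the Datafye SDK):

```python
# Placeholder interface sketch -- not the actual Datafye Algo Container SDK
from abc import ABC, abstractmethod
from typing import Any


class AlgoIO(ABC):
    """Hypothetical contract for descriptor-declared inputs and outputs."""

    @abstractmethod
    def read_input(self, name: str) -> Any:
        """Consume an input stream declared in the descriptor."""

    @abstractmethod
    def emit_output(self, name: str, value: Any) -> None:
        """Produce an output declared in the descriptor."""


class InMemoryIO(AlgoIO):
    """Toy implementation useful for local unit testing."""

    def __init__(self, inputs: dict):
        self.inputs = dict(inputs)
        self.outputs: dict = {}

    def read_input(self, name: str) -> Any:
        return self.inputs[name]

    def emit_output(self, name: str, value: Any) -> None:
        self.outputs[name] = value
```

Coding your strategy against a small interface like this keeps the logic testable offline and confines the eventual SDK binding to one adapter class.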
Validation
Before provisioning, Datafye validates your algo descriptor:
Schema Validation
All required fields present
Valid parameter types and values
Resource requirements within limits
Image reference format correct
Image Validation
Container image exists and is accessible
Image built with compatible SDK version
Image passes basic health checks
[Size limits on images?]
Parameter Validation
Parameters match algo's expected config
Parameter space ranges are valid
No conflicting parameter combinations
[Does SDK declare expected parameters for validation?]
Resource Validation
Resource requests are reasonable
Resources available in deployment environment
[Maximum resource limits?]
Common validation errors:
Container image not found or inaccessible
Invalid parameter types
Parameter space range errors
Resource requests exceed limits
[Other common errors?]
Versioning and Updates
Algo Versioning
Use image tags to version your algos:
Best practices:
Use semantic versioning (v1.0.0, v1.1.0, v2.0.0)
Never use the `latest` tag in production
Tag images with the git commit hash for traceability
Document breaking changes between versions
Updating Algos
To update an algo in a running environment:
Build new container image with changes
Tag with new version
Update algo descriptor with new image reference
Re-provision or restart environment
[Does Datafye support hot-swapping algos without full reprovision?]
Development Workflow
Local Development
Develop algo using SDK
Implement strategy logic
Define parameters and inputs/outputs
Test locally with sample data
Containerize
Write Dockerfile
Build image: `docker build -t myrepo/algo:dev .`
Test container locally
Create descriptor
Define parameters and resource requirements
Specify parameter space if optimizing
Validate descriptor syntax
Test in Datafye
Push image to registry
Provision Foundry environment
Run backtests
Iterate on parameters and code
Optimization Workflow
Define parameter space in descriptor
Run genetic algorithm optimization
Review results and scorecards
Select optimal parameters
Update descriptor with optimal values
Validate with out-of-sample backtest
Production Deployment
Finalize parameters based on optimization
Tag stable image version
Update descriptor with production config
Provision Trading environment
Paper trade to validate
Transition to live trading
Best Practices
Descriptor Organization
Keep separate descriptors for different stages:
Development (algo-dev.yaml):
Optimization (algo-optimize.yaml):
Production (algo-prod.yaml):
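As a sketch of how the three stage-specific files might differ (all field names and values are illustrative assumptions):

```yaml
# algo-dev.yaml: minimal resources, dev image tag, fixed test parameters
name: momentum-dev
image: myrepo/momentum:dev
resources: {cpu: 1, memory: 2Gi}
---
# algo-optimize.yaml: parameter_space enabled, extra CPU for parallel backtests
name: momentum-optimize
image: myrepo/momentum:1.3.0
resources: {cpu: 8, memory: 16Gi}
---
# algo-prod.yaml: optimized fixed parameters, pinned stable release tag
name: momentum-prod
image: myrepo/momentum:1.3.0
resources: {cpu: 4, memory: 8Gi}
```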
Version Control
Store descriptors with algo source code
Tag descriptor versions with algo releases
Document parameter changes in commit messages
Use meaningful branch names for experiments
Documentation
Document your algo descriptor:
Testing
Before production:
Test with minimal resources first
Validate parameter ranges make sense
Run backtests with various parameter values
Verify outputs match expectations
Check resource usage matches allocation
Next Steps
Learn about optimization — Genetic Algorithm Based Optimization
Understand backtesting — Backtesting Your Algo
See complete schema — Algo Descriptor Reference
Build your first algo — Building Your First Algo - Using Datafye Container
Last updated: 2025-10-11