Deployment Descriptors

Deployment descriptors are the configuration files that define what gets provisioned in your Datafye environment. They're written in YAML or JSON and tell Datafye exactly what data sources, algos, and broker connections you need.

Think of descriptors as declarative specifications — you describe what you want, and Datafye provisions it. This approach ensures reproducible deployments and makes it easy to version control your infrastructure configuration.

Ready to provision? The Datafye CLI reads your descriptors and builds out your environment accordingly. See Private Cloud Deployments for an overview of the available deployment models.

Why Descriptors?

Declarative Configuration

Rather than clicking through UI screens or running multiple commands, you write a descriptor that specifies:

  • What market data you need

  • What algos you want to run

  • How to connect to your broker

  • What resources to allocate

Datafye reads the descriptor and provisions everything accordingly.

Version Control

Since descriptors are text files, you can:

  • Track changes in git

  • Review differences between versions

  • Roll back to previous configurations

  • Share configurations with team members

This is standard practice in modern infrastructure management (Infrastructure as Code).

Reproducibility

Descriptors ensure that:

  • Development and production environments match

  • Team members can provision identical environments

  • Configurations can be replicated across regions or cloud providers

  • Environments can be torn down and recreated consistently

Validation

Datafye validates descriptors before provisioning, catching errors like:

  • Missing required fields

  • Invalid values or formats

  • Incompatible combinations

  • Resource constraint violations

This catches configuration mistakes early, before they turn into failed deployments.

The Three Types of Descriptors

Datafye uses three types of descriptors, each serving a specific purpose:

1. Data Descriptors

Data Descriptors define what market data your environment needs.

Specify:

  • Asset classes — Equities, options, futures, crypto, forex, etc.

  • Symbols — Which securities to include

  • Historical ranges — Date ranges for backtesting

  • Real-time feeds — Live data streams for trading

  • Alternative data — Custom data sources

Used in: All scenarios (Foundry and Trading, both Data Cloud Only and Full Stack)

Example:
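
A minimal sketch of what a data descriptor might look like. The field names here (asset_classes, symbols, historical, realtime) are illustrative assumptions, not the official Datafye schema:

    # Illustrative sketch -- field names are assumptions, not the official schema
    data:
      asset_classes:
        - equities
        - options
      symbols:
        - AAPL
        - MSFT
      historical:
        start: 2020-01-01
        end: 2024-12-31
      realtime: true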

2. Algo Descriptors

Algo Descriptors define the algorithms that will run in your environment.

Specify:

  • Container image — Docker image containing your algo

  • Parameters — Configuration values for your algo

  • Parameter space — Ranges for optimization (genetic algorithms)

  • Resources — CPU, memory, GPU requirements

  • Inputs — Data feeds the algo consumes

  • Outputs — Signals and metrics the algo produces

Used in: Full Stack scenarios (both Foundry and Trading)

Example:
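
A minimal sketch of an algo descriptor. The fields mirror the list above (image, parameters, parameter_space, resources, inputs, outputs), but the names and values are assumptions, not the official schema:

    # Illustrative sketch -- field names are assumptions, not the official schema
    algo:
      image: registry.example.com/my-algo:1.2.0
      parameters:
        lookback_days: 20
        threshold: 0.5
      parameter_space:
        lookback_days: [10, 60]   # range explored during optimization
      resources:
        cpu: 2
        memory: 4Gi
      inputs:
        - equities-realtime
      outputs:
        - signals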

3. Broker Descriptors

Broker Descriptors define connections to brokerage accounts for trade execution.

Specify:

  • Broker — Which broker to connect to

  • Account — Account credentials and identifiers

  • Trading mode — Paper or live trading

  • Permissions — What the algo can do (trade equities, options, etc.)

  • Risk limits — Maximum position sizes, daily loss limits, etc.

Used in: Trading scenarios (both Data Cloud + Broker and Full Stack)

Example:
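
A minimal sketch of a broker descriptor. The field names, broker name, and environment-variable reference are illustrative assumptions:

    # Illustrative sketch -- field names are assumptions, not the official schema
    broker:
      provider: example-broker
      account_id: ${BROKER_ACCOUNT_ID}   # supplied via environment variable
      mode: paper                        # paper or live
      permissions:
        - trade_equities
      risk_limits:
        max_position_size: 10000
        daily_loss_limit: 2500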

How Descriptors Compose

Different Datafye scenarios use different combinations of descriptors:

Foundry: Data Cloud Only

Required:

  • Data Descriptor

Your responsibility:

  • Provide your own algo containers

  • Connect them to the Data Cloud APIs

Datafye provisions:

  • Data Cloud services based on data descriptor

Foundry: Full Stack

Required:

  • Data Descriptor

  • Algo Descriptor

Datafye provisions:

  • Data Cloud services

  • Algo Container runtime

  • Backtesting Engine

  • MCP Server

Trading: Data Cloud + Broker

Required:

  • Data Descriptor

  • Broker Descriptor

Your responsibility:

  • Provide your own algo containers

  • Connect them to Data Cloud and Broker Connector

Datafye provisions:

  • Data Cloud services

  • Broker Connector to your brokerage

Trading: Full Stack

Required:

  • Data Descriptor

  • Algo Descriptor

  • Broker Descriptor

Datafye provisions:

  • Data Cloud services

  • Algo Container runtime

  • Broker Connector

  • MCP Server

Descriptor File Formats

Descriptors can be written in YAML or JSON format.

YAML Format

More human-readable, commonly used:
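
An illustrative fragment (the keys are assumptions, not the official schema):

    data:
      asset_classes:
        - equities
      symbols:
        - AAPL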

JSON Format

Machine-friendly, useful for programmatic generation:
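
The same illustrative fragment expressed as JSON:

    {
      "data": {
        "asset_classes": ["equities"],
        "symbols": ["AAPL"]
      }
    }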

Both formats are functionally equivalent — use whichever you prefer.

Descriptor File Organization

Single File Approach

Combine all descriptors in one file:
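
An illustrative layout with all three descriptors as top-level sections of a single file (the section names are assumptions):

    # complete-descriptor.yaml -- illustrative layout
    data:
      asset_classes: [equities]
      symbols: [AAPL]
    algo:
      image: registry.example.com/my-algo:1.2.0
    broker:
      provider: example-broker
      mode: paper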

Pass to CLI: datafye foundry provision --descriptor complete-descriptor.yaml

Multiple File Approach

Separate descriptors into individual files:
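
For example, a directory layout matching the CLI invocation below (broker.yaml shown for completeness):

    descriptors/
      data.yaml     # Data Descriptor
      algo.yaml     # Algo Descriptor
      broker.yaml   # Broker Descriptor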

Pass to CLI: datafye foundry provision --data descriptors/data.yaml --algo descriptors/algo.yaml

Use the approach that makes sense for your workflow and team structure.

Environment Variables and Secrets

Descriptors support environment variable substitution for sensitive values:
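
For instance, credentials can be referenced rather than hard-coded. The ${VAR} syntax shown here is illustrative; check the substitution syntax your Datafye version supports:

    broker:
      account_id: ${BROKER_ACCOUNT_ID}   # resolved from the environment at provisioning time
      api_key: ${BROKER_API_KEY}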

This keeps secrets out of version control while maintaining reproducibility.

Validation

Before provisioning, Datafye validates your descriptors:

Schema Validation

  • All required fields present

  • Field types correct (strings, numbers, arrays, etc.)

  • Values within allowed ranges

Semantic Validation

  • Referenced resources exist (container images, symbols, etc.)

  • Combinations are valid (e.g., can't paper trade with live data restrictions)

  • Resource requests are feasible (enough memory, valid CPU counts, etc.)

Dependency Validation

  • Data sources support requested asset classes

  • Broker supports requested permissions

  • Container images are accessible

If validation fails, Datafye provides clear error messages to help you fix the issues.

Common Patterns

Development Descriptor

Minimal resources, subset of data, for fast iteration:
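
An illustrative sketch (field names are assumptions):

    # Development: small symbol set, short history, minimal resources
    data:
      asset_classes: [equities]
      symbols: [AAPL]
      historical:
        start: 2024-01-01
        end: 2024-06-30
    algo:
      image: registry.example.com/my-algo:dev
      resources:
        cpu: 1
        memory: 2Gi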

Production Descriptor

Full data, optimized resources, for live trading:
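
A sketch of what the production counterpart might look like, again with assumed field names:

    # Production: full universe, real-time feeds, live trading with risk limits
    data:
      asset_classes: [equities, options]
      symbols: [AAPL, MSFT, GOOG]
      realtime: true
    algo:
      image: registry.example.com/my-algo:1.2.0
      resources:
        cpu: 4
        memory: 8Gi
    broker:
      provider: example-broker
      mode: live
      risk_limits:
        max_position_size: 10000
        daily_loss_limit: 2500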

Backtesting Descriptor

Historical data only, parameter space for optimization:
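
A sketch of a backtesting configuration under the same assumed schema:

    # Backtesting: historical data only, plus a parameter space to optimize over
    data:
      asset_classes: [equities]
      symbols: [AAPL, MSFT]
      historical:
        start: 2018-01-01
        end: 2023-12-31
    algo:
      image: registry.example.com/my-algo:1.2.0
      parameter_space:
        lookback_days: [10, 120]
        threshold: [0.1, 0.9]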

Best Practices

Version Control

  • Store descriptors in git alongside your algo code

  • Use meaningful commit messages for descriptor changes

  • Tag descriptor versions that correspond to algo releases

Parameterization

  • Use environment variables for secrets and environment-specific values

  • Keep environment-independent config in the descriptor

  • Document required environment variables

Modularity

  • Use separate files for data, algo, and broker descriptors

  • Create reusable descriptor fragments for common patterns

  • Share common descriptors across team members

Documentation

  • Add comments to explain non-obvious configuration choices

  • Include examples for valid values

  • Document dependencies between settings

Testing

  • Validate descriptors locally before provisioning

  • Use minimal descriptors for development/testing

  • Gradually increase complexity as you validate functionality

Next Steps


Last updated: 2025-10-11
