Deployment Descriptors

Deployment descriptors are the configuration files that define what gets provisioned in your Datafye environment. They're written in YAML or JSON and tell Datafye exactly what data sources, algos, and broker connections you need.

Think of descriptors as declarative specifications — you describe what you want, and Datafye provisions it. This approach ensures reproducible deployments and makes it easy to version control your infrastructure configuration.

Ready to provision? The Datafye CLI reads your descriptors and builds out your environment accordingly. See Private Cloud Deployments for an overview of the available deployment models.

Why Descriptors?

Declarative Configuration

Rather than clicking through UI screens or running multiple commands, you write a descriptor that specifies:

  • What market data you need

  • What algos you want to run

  • How to connect to your broker

  • What resources to allocate

Datafye reads the descriptor and provisions everything accordingly.

Version Control

Since descriptors are text files, you can:

  • Track changes in git

  • Review differences between versions

  • Roll back to previous configurations

  • Share configurations with team members

This is standard practice in modern infrastructure management (Infrastructure as Code).

Reproducibility

Descriptors ensure that:

  • Development and production environments match

  • Team members can provision identical environments

  • Configurations can be replicated across regions or cloud providers

  • Environments can be torn down and recreated consistently

Validation

Datafye validates descriptors before provisioning, catching errors like:

  • Missing required fields

  • Invalid values or formats

  • Incompatible combinations

  • Resource constraint violations

This catches configuration mistakes early, before they turn into failed deployments.

The Three Types of Descriptors

Datafye uses three types of descriptors, each serving a specific purpose:

1. Data Descriptors

Data Descriptors define what market data your environment needs.

Specify:

  • Asset classes — Equities, options, futures, crypto, forex, etc.

  • Symbols — Which securities to include

  • Historical ranges — Date ranges for backtesting

  • Real-time feeds — Live data streams for trading

  • Alternative data — Custom data sources

Used in: All scenarios (Foundry and Trading, both Data Cloud Only and Full Stack)

Example:
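
A minimal sketch of what a data descriptor might look like. The field names here (asset_classes, symbols, historical, realtime) are illustrative assumptions, not the official Datafye schema:

    # Illustrative sketch -- field names are assumptions, not the official schema
    data:
      asset_classes:
        - equities
        - options
      symbols:
        - AAPL
        - MSFT
      historical:
        start: 2020-01-01
        end: 2024-12-31
      realtime: true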

2. Algo Descriptors

Algo Descriptors define the algorithms that will run in your environment.

Specify:

  • Container image — Docker image containing your algo

  • Parameters — Configuration values for your algo

  • Parameter space — Ranges for optimization (genetic algorithms)

  • Resources — CPU, memory, GPU requirements

  • Inputs — Data feeds the algo consumes

  • Outputs — Signals and metrics the algo produces

Used in: Full Stack scenarios (both Foundry and Trading)

Example:
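
A minimal sketch of an algo descriptor. The fields mirror the list above (image, parameters, parameter_space, resources, inputs, outputs), but the names and values are assumptions, not the official schema:

    # Illustrative sketch -- field names are assumptions, not the official schema
    algo:
      image: registry.example.com/my-algo:1.2.0
      parameters:
        lookback_days: 20
        threshold: 0.5
      parameter_space:
        lookback_days: [10, 60]   # range explored during optimization
      resources:
        cpu: 2
        memory: 4Gi
      inputs:
        - equities-realtime
      outputs:
        - signals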

3. Broker Descriptors

Broker Descriptors define connections to brokerage accounts for trade execution.

Specify:

  • Broker — Which broker to connect to

  • Account — Account credentials and identifiers

  • Trading mode — Paper or live trading

  • Permissions — What the algo can do (trade equities, options, etc.)

  • Risk limits — Maximum position sizes, daily loss limits, etc.

Used in: Trading scenarios (both Data Cloud + Broker and Full Stack)

Example:
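
A minimal sketch of a broker descriptor. The field names, broker name, and environment-variable reference are illustrative assumptions:

    # Illustrative sketch -- field names are assumptions, not the official schema
    broker:
      provider: example-broker
      account_id: ${BROKER_ACCOUNT_ID}   # supplied via environment variable
      mode: paper                        # paper or live
      permissions:
        - trade_equities
      risk_limits:
        max_position_size: 10000
        daily_loss_limit: 2500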

How Descriptors Compose

Different Datafye scenarios use different combinations of descriptors:

Foundry: Data Cloud Only

Required:

  • Data Descriptor

Your responsibility:

  • Provide your own algo containers

  • Connect them to the Data Cloud APIs

Datafye provisions:

  • Data Cloud services based on data descriptor

Foundry: Full Stack

Required:

  • Data Descriptor

  • Algo Descriptor

Datafye provisions:

  • Data Cloud services

  • Algo Container runtime

  • Backtesting Engine

  • MCP Server

Trading: Data Cloud + Broker

Required:

  • Data Descriptor

  • Broker Descriptor

Your responsibility:

  • Provide your own algo containers

  • Connect them to Data Cloud and Broker Connector

Datafye provisions:

  • Data Cloud services

  • Broker Connector to your brokerage

Trading: Full Stack

Required:

  • Data Descriptor

  • Algo Descriptor

  • Broker Descriptor

Datafye provisions:

  • Data Cloud services

  • Algo Container runtime

  • Broker Connector

  • MCP Server

Descriptor File Formats

Descriptors can be written in YAML or JSON format.

YAML Format

More human-readable, commonly used:
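
An illustrative fragment (the keys are assumptions, not the official schema):

    data:
      asset_classes:
        - equities
      symbols:
        - AAPL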

JSON Format

Machine-friendly, useful for programmatic generation:
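
The same illustrative fragment expressed as JSON:

    {
      "data": {
        "asset_classes": ["equities"],
        "symbols": ["AAPL"]
      }
    }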

Both formats are functionally equivalent — use whichever you prefer.

Descriptor File Organization

Single File Approach

Combine all descriptors in one file:
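
An illustrative layout with all three descriptors as top-level sections of a single file (the section names are assumptions):

    # complete-descriptor.yaml -- illustrative layout
    data:
      asset_classes: [equities]
      symbols: [AAPL]
    algo:
      image: registry.example.com/my-algo:1.2.0
    broker:
      provider: example-broker
      mode: paper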

Pass to CLI: datafye foundry provision --descriptor complete-descriptor.yaml

Multiple File Approach

Separate descriptors into individual files:
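
For example, a directory layout matching the CLI invocation below (broker.yaml shown for completeness):

    descriptors/
      data.yaml     # Data Descriptor
      algo.yaml     # Algo Descriptor
      broker.yaml   # Broker Descriptor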

Pass to CLI: datafye foundry provision --data descriptors/data.yaml --algo descriptors/algo.yaml

Use the approach that makes sense for your workflow and team structure.

Environment Variables and Secrets

Descriptors support environment variable substitution for sensitive values:
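
For instance, credentials can be referenced rather than hard-coded. The ${VAR} syntax shown here is illustrative; check the substitution syntax your Datafye version supports:

    broker:
      account_id: ${BROKER_ACCOUNT_ID}   # resolved from the environment at provisioning time
      api_key: ${BROKER_API_KEY}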

This keeps secrets out of version control while maintaining reproducibility.

Validation

Before provisioning, Datafye validates your descriptors:

Schema Validation

  • All required fields present

  • Field types correct (strings, numbers, arrays, etc.)

  • Values within allowed ranges

Semantic Validation

  • Referenced resources exist (container images, symbols, etc.)

  • Combinations are valid (e.g., can't paper trade with live data restrictions)

  • Resource requests are feasible (enough memory, valid CPU counts, etc.)

Dependency Validation

  • Data sources support requested asset classes

  • Broker supports requested permissions

  • Container images are accessible

If validation fails, Datafye provides clear error messages to help you fix the issues.

Common Patterns

Development Descriptor

Minimal resources, subset of data, for fast iteration:
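
An illustrative sketch (field names are assumptions):

    # Development: small symbol set, short history, minimal resources
    data:
      asset_classes: [equities]
      symbols: [AAPL]
      historical:
        start: 2024-01-01
        end: 2024-06-30
    algo:
      image: registry.example.com/my-algo:dev
      resources:
        cpu: 1
        memory: 2Gi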

Production Descriptor

Full data, optimized resources, for live trading:
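
A sketch of what the production counterpart might look like, again with assumed field names:

    # Production: full universe, real-time feeds, live trading with risk limits
    data:
      asset_classes: [equities, options]
      symbols: [AAPL, MSFT, GOOG]
      realtime: true
    algo:
      image: registry.example.com/my-algo:1.2.0
      resources:
        cpu: 4
        memory: 8Gi
    broker:
      provider: example-broker
      mode: live
      risk_limits:
        max_position_size: 10000
        daily_loss_limit: 2500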

Backtesting Descriptor

Historical data only, parameter space for optimization:
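
A sketch of a backtesting configuration under the same assumed schema:

    # Backtesting: historical data only, plus a parameter space to optimize over
    data:
      asset_classes: [equities]
      symbols: [AAPL, MSFT]
      historical:
        start: 2018-01-01
        end: 2023-12-31
    algo:
      image: registry.example.com/my-algo:1.2.0
      parameter_space:
        lookback_days: [10, 120]
        threshold: [0.1, 0.9]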

Best Practices

Version Control

  • Store descriptors in git alongside your algo code

  • Use meaningful commit messages for descriptor changes

  • Tag descriptor versions that correspond to algo releases

Parameterization

  • Use environment variables for secrets and environment-specific values

  • Keep environment-independent config in the descriptor

  • Document required environment variables

Modularity

  • Use separate files for data, algo, and broker descriptors

  • Create reusable descriptor fragments for common patterns

  • Share common descriptors across team members

Documentation

  • Add comments to explain non-obvious configuration choices

  • Include examples for valid values

  • Document dependencies between settings

Testing

  • Validate descriptors locally before provisioning

  • Use minimal descriptors for development/testing

  • Gradually increase complexity as you validate functionality

Next Steps


Last updated: 2025-10-11
