The Datafye CLI

The Datafye CLI (datafye) is the command-line tool you use to provision, manage, and control your Datafye environments. It's the primary interface for creating private cloud deployments, whether you're setting up a development Foundry or a production Trading Environment.

The Datafye CLI officially supports macOS and Linux. Windows users can run the CLI via WSL (Windows Subsystem for Linux).

New to Datafye deployments? Start with Private Cloud Deployments to understand deployment models, then review Descriptors to learn about configuration files.

What the CLI Does

The Datafye CLI bridges the gap between your configuration (descriptors) and your infrastructure (private cloud deployments). Here's the flow:

  1. You create descriptors — YAML or JSON files specifying what data, algos, and broker connections you need

  2. You run CLI commands — The CLI reads your descriptors and provisions the infrastructure

  3. Datafye creates your environment — A complete, isolated deployment with the services you specified

The CLI handles all the complexity: spinning up services, configuring networking, connecting to data providers, deploying containers, and ensuring everything works together.

Core Workflow

Every Datafye environment follows the same lifecycle, managed through CLI commands:

1. Provision

Create a new environment from descriptors. The command structure varies by deployment environment:

# Standalone Foundry (single deployment per machine)
datafye foundry local provision --descriptor foundry.yaml

# Distributed Foundry on AWS (requires name and cloud-specific params)
datafye foundry aws provision \
  --name my-dev-foundry \
  --descriptor foundry.yaml \
  --profile my-aws-profile \
  --cidr 10.0.0.0/16

# Standalone Trading environment
datafye trading local provision --descriptor trading.yaml

# Distributed Trading environment on AWS
datafye trading aws provision \
  --name production-trading \
  --descriptor trading.yaml \
  --profile my-aws-profile \
  --cidr 10.1.0.0/16

What happens during provisioning:

  • CLI validates your descriptors (schema, semantics, dependencies)

  • Infrastructure is allocated based on your deployment model (Standalone, Distributed Self-managed, or Distributed Datafye Managed)

  • Services are deployed (Data Cloud, Algo Container, Backtesting Engine, Broker Connector — depending on scenario)

  • Network connectivity is established

  • Data providers and brokers are connected using your credentials

  • Health checks verify everything is running correctly

Provisioning time:

  • Local deployments: 5-10 minutes depending on internet connection and system resources

  • AWS deployments: 10-20 minutes

2. Start

Start a provisioned environment that is currently stopped (see the CLI Command Reference for the exact syntax).

Starting an environment is much faster than provisioning (typically under a minute) because the infrastructure already exists — services just need to resume.

3. Stop

Stop a running environment without deleting it.

Stopping preserves your environment configuration and data while freeing up compute resources. This is useful for:

  • Saving costs when you're not actively using the environment

  • Performing maintenance or updates

  • Pausing development work

4. Deprovision

Completely remove an environment and all its resources. See the CLI Command Reference for the exact syntax.

How the CLI Uses Descriptors

The CLI's power comes from its descriptor-driven approach. Instead of remembering dozens of command-line flags and options, you declare what you want in a descriptor file.

Descriptor Input Methods

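Single combined descriptor: one file, such as complete.yaml, that contains the data, algo, and broker sections.

Separate descriptor files: one file per descriptor type (data, algo, broker), supplied together when you provision.

Keeping the descriptors in separate files lets you: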

  • Reuse common descriptors across environments (e.g., same data descriptor for dev and prod)

  • Version control each descriptor type independently

  • Share descriptor fragments with team members
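To make the relationship between the two input methods concrete, here is a minimal Python sketch of how separate fragments could be combined into one descriptor. It is an illustration only, not the CLI's implementation: the function name combine_descriptors and the fields inside each fragment (symbols, mode) are hypothetical, and JSON is used only because it needs no third-party parser.

```python
import json
from typing import Any, Dict, Iterable

# Section names taken from the descriptor types this page mentions.
KNOWN_SECTIONS = ("data", "algo", "broker")


def combine_descriptors(fragments: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge per-type descriptor fragments into one combined descriptor."""
    combined: Dict[str, Any] = {}
    for fragment in fragments:
        for section, body in fragment.items():
            if section not in KNOWN_SECTIONS:
                raise ValueError(f"unknown descriptor section: {section!r}")
            if section in combined:
                raise ValueError(f"section {section!r} is defined more than once")
            combined[section] = body
    return combined


# Hypothetical fragment contents; real descriptor fields will differ.
shared_data = json.loads('{"data": {"symbols": ["AAPL", "MSFT"]}}')
dev_broker = json.loads('{"broker": {"mode": "paper"}}')

combined = combine_descriptors([shared_data, dev_broker])
print(json.dumps(combined, indent=2))
```

Rejecting a section that appears twice keeps a shared fragment from being silently overridden by another file.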

Learn more: See Descriptors for complete details on descriptor format and organization.

Deployment Model Selection

The CLI uses environment-specific commands to handle the differences between deployment models. The environment is specified as part of the command (e.g., local, aws, azure, gcp), and each environment accepts different parameters.

Note: Developers can provision Standalone (local) and Distributed Self-managed (aws/azure/gcp) deployments using the CLI. Distributed Datafye Managed deployments (required for live trading) can only be provisioned by Datafye Ops. To get a managed deployment, contact Datafye to request provisioning.

Standalone Deployment

All services run on your local machine (or a single cloud VM). Commands use the local environment, for example datafye foundry local provision --descriptor foundry.yaml.

Key difference: Local deployments do not use a --name parameter since only one deployment is allowed per machine. You provide descriptor file(s) to specify what to provision.

See CLI Command Reference for complete parameter lists.

The CLI:

  • Uses Docker to run all services as containers on your machine

  • Sets up Docker networks for inter-service communication

  • Binds ports to localhost for API access

  • Manages container lifecycle (start, stop, health checks)

Best for: Development, learning, paper trading on local hardware

Distributed Self-managed

Services are spread across multiple machines in your cloud account. Commands use a cloud provider name (aws, azure, or gcp) in place of local, for example datafye foundry aws provision.

Key differences: Distributed deployments take the following parameters, all required unless marked optional:

  • --name — A unique name for the deployment (multiple deployments can exist in the same cloud account)

  • --profile — The AWS CLI profile to use for provisioning

  • Descriptor file(s) specifying what to provision

  • Optional cloud-specific parameters like --region and --cidr

See CLI Command Reference for complete parameter lists.
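The parameter rules for local and distributed deployments can be summarized in a small helper that assembles the argument list. This is a sketch for scripting around the CLI, not part of it: build_provision_command is a hypothetical name, it uses only the flags shown on this page (--name, --descriptor, --profile, --cidr), and it assumes --profile is mandatory only for aws because this page describes it as an AWS CLI profile.

```python
import shlex
from typing import List, Optional

CLOUD_ENVIRONMENTS = ("aws", "azure", "gcp")


def build_provision_command(
    product: str,
    environment: str,
    descriptor: str,
    name: Optional[str] = None,
    profile: Optional[str] = None,
    cidr: Optional[str] = None,
) -> List[str]:
    """Assemble argv for `datafye <product> <environment> provision`."""
    if product not in ("foundry", "trading"):
        raise ValueError(f"unknown product: {product!r}")
    is_cloud = environment in CLOUD_ENVIRONMENTS
    if environment != "local" and not is_cloud:
        raise ValueError(f"unknown environment: {environment!r}")

    args = ["datafye", product, environment, "provision"]
    if is_cloud:
        # Several deployments can share a cloud account, so each needs a name.
        if name is None:
            raise ValueError("distributed deployments require a name")
        args += ["--name", name]
    elif name is not None:
        # Only one deployment is allowed per machine, so local takes no name.
        raise ValueError("local deployments do not take a name")

    args += ["--descriptor", descriptor]

    if is_cloud:
        if environment == "aws" and profile is None:
            raise ValueError("aws deployments require a profile")
        if profile is not None:
            args += ["--profile", profile]
        if cidr is not None:
            args += ["--cidr", cidr]
    return args


local_cmd = build_provision_command("foundry", "local", "foundry.yaml")
aws_cmd = build_provision_command(
    "trading", "aws", "trading.yaml",
    name="production-trading", profile="my-aws-profile", cidr="10.1.0.0/16",
)
print(shlex.join(local_cmd))
print(shlex.join(aws_cmd))
```

Returning a list rather than a string lets you pass the result straight to subprocess.run without shell quoting issues.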

The CLI:

  • Provisions cloud resources (VMs, networking, storage) in your account

  • Deploys services across multiple machines for scalability

  • Sets up private networking (VPCs, subnets, security groups)

  • Configures monitoring and logging

  • Uses native cloud infrastructure (no Docker/containerization)

Best for: Production-scale development, paper trading with cloud resources, teams with cloud expertise

Distributed Datafye Managed

Datafye Ops Only: Distributed Datafye Managed deployments can only be provisioned by Datafye Ops. Developers cannot provision these environments directly using the CLI. To request a managed deployment, contact Datafye support.

Services are spread across multiple machines in a Datafye-operated cloud account, provisioned and managed by Datafye Ops.

What Datafye Ops does:

  • Provisions infrastructure in a dedicated account under Datafye's root

  • Deploys services across multiple machines for scalability and resilience

  • Handles all operations, monitoring, updates, and compliance

  • Uses native cloud infrastructure (no Docker/containerization)

  • Configures for live trading requirements when applicable

Best for: Live trading (required for live), production environments, teams wanting fully managed infrastructure

To request: Contact Datafye with your descriptors and deployment requirements.

Learn more: See Private Cloud Deployments for detailed comparison of deployment models.

Scenario-Specific Provisioning

Different scenarios require different combinations of descriptors:

Foundry: Data Cloud Only

Descriptors required: Data only

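Standalone or Distributed (AWS example): use the foundry provision commands shown under Core Workflow with a descriptor that contains only a data section.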

What gets provisioned: Data Cloud services for historical and live replay data.

You bring: Your own algo containers that connect to Data Cloud APIs.

Foundry: Full Stack

Descriptors required: Data + Algo

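Standalone or Distributed (AWS example): use the foundry provision commands shown under Core Workflow with descriptors that cover the data and algo sections.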

What gets provisioned: Data Cloud, Algo Container runtime, Backtesting Engine, MCP Server for AI-assisted vibe coding.

You bring: Nothing — complete stack provided by Datafye.

Trading: Data Cloud + Broker

Descriptors required: Data + Broker

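For paper trading (Standalone, or Distributed Self-managed on AWS for example): use the trading provision commands shown under Core Workflow with descriptors that cover the data and broker sections.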

For live trading: Contact Datafye Ops to provision a Distributed Datafye Managed deployment.

What gets provisioned: Data Cloud services for live market data, Broker Connector for trade execution.

You bring: Your own algo containers that connect to Data Cloud and Broker Connector.

Trading: Full Stack

Descriptors required: Data + Algo + Broker

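For paper trading (Standalone, or Distributed Self-managed on AWS for example): use the trading provision commands shown under Core Workflow with descriptors that cover the data, algo, and broker sections.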

For live trading: Contact Datafye Ops to provision a Distributed Datafye Managed deployment.

What gets provisioned: Data Cloud, Algo Container runtime, Broker Connector, MCP Server for AI-assisted vibe coding.

You bring: Nothing — complete trading stack provided by Datafye.
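The four scenarios above reduce to a lookup from scenario to required descriptor types. The Python sketch below encodes that table so a script can check what is missing before provisioning. The scenario keys (for example "data-cloud-only") and the function name are hypothetical labels chosen for this illustration; only the descriptor requirements come from this page.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

# (product, scenario) -> descriptor types required, as listed on this page.
REQUIRED_DESCRIPTORS: Dict[Tuple[str, str], FrozenSet[str]] = {
    ("foundry", "data-cloud-only"): frozenset({"data"}),
    ("foundry", "full-stack"): frozenset({"data", "algo"}),
    ("trading", "data-cloud-broker"): frozenset({"data", "broker"}),
    ("trading", "full-stack"): frozenset({"data", "algo", "broker"}),
}


def missing_descriptors(product: str, scenario: str, provided: Iterable[str]) -> List[str]:
    """Return the descriptor types still needed for a scenario, sorted by name."""
    required = REQUIRED_DESCRIPTORS[(product, scenario)]
    return sorted(required - set(provided))


print(missing_descriptors("trading", "full-stack", ["data"]))
print(missing_descriptors("foundry", "data-cloud-only", ["data"]))
```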

Learn more: See Choose Your Path for detailed scenario descriptions.

CLI Configuration

The CLI stores configuration in ~/.datafye/config.yaml.

This configuration provides defaults so you don't have to specify common options on every command.
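One common way such defaults work is that a value given on the command line wins and the configuration file fills in anything left unset. The Python sketch below shows that precedence. Treat it as an assumption about behavior rather than a description of the CLI: the option keys (profile, region, name) and their values are hypothetical, and the actual contents of config.yaml are not shown on this page.

```python
from typing import Any, Dict, Optional


def resolve_options(
    defaults: Dict[str, Any], flags: Dict[str, Optional[Any]]
) -> Dict[str, Any]:
    """Overlay explicitly supplied flags on top of configured defaults."""
    resolved = dict(defaults)
    for key, value in flags.items():
        if value is not None:  # None means the flag was not given
            resolved[key] = value
    return resolved


# Hypothetical defaults, as they might be loaded from a config file.
defaults = {"profile": "my-aws-profile", "region": "us-east-1"}
# Hypothetical flags for a single command; profile was not supplied.
flags = {"profile": None, "region": "eu-west-1", "name": "my-dev-foundry"}

resolved = resolve_options(defaults, flags)
print(resolved)
```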

Environment Variables

The CLI supports environment variable substitution in descriptors, keeping secrets out of version control:

You reference a variable in your descriptor and set it in your shell when you run the CLI.

The CLI reads environment variables and substitutes values before provisioning.
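The substitution step can be sketched in a few lines of Python. This is an illustration under assumptions, not the CLI's code: it assumes placeholders are written as ${NAME}, which this page does not confirm, and the descriptor field api_key and variable BROKER_API_KEY are hypothetical.

```python
import os
import re
from typing import Mapping, Optional

# Assumed placeholder syntax: ${NAME}. The real descriptor syntax may differ.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_env(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace each ${NAME} in text with its value from the environment."""
    values = os.environ if env is None else env
    missing = sorted({name for name in _PLACEHOLDER.findall(text) if name not in values})
    if missing:
        # Fail before provisioning rather than deploying with an empty secret.
        raise KeyError("unset environment variables: " + ", ".join(missing))
    return _PLACEHOLDER.sub(lambda match: values[match.group(1)], text)


# Hypothetical descriptor fragment and variable name.
descriptor_text = "broker:\n  api_key: ${BROKER_API_KEY}\n"
print(substitute_env(descriptor_text, {"BROKER_API_KEY": "example-key"}))
```

Raising on an unset variable, instead of substituting an empty string, surfaces a missing secret at the point where it is cheapest to fix.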

Validation and Error Handling

Before provisioning, the CLI validates your descriptors to catch errors early:

Validation Checks

Schema validation:

  • All required fields present

  • Correct data types (strings, numbers, arrays)

  • Values within valid ranges

Semantic validation:

  • Referenced resources exist (container images, data symbols)

  • Combinations are valid (e.g., can't request live trading in a Standalone deployment)

  • Resource requests are feasible (enough memory, valid CPU counts)

Dependency validation:

  • Data providers support requested asset classes

  • Broker supports requested permissions

  • Container images are accessible from the deployment environment
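As a rough picture of the first two check types, the Python sketch below runs a few schema checks and one semantic check, collecting every problem instead of stopping at the first. It is not the CLI's validator: the field names (symbols, cpus, mode) and the "standalone" label are hypothetical, and dependency checks are left out because they need access to providers, brokers, and image registries.

```python
from typing import Any, Dict, List


def validate_descriptor(descriptor: Dict[str, Any], deployment_model: str) -> List[str]:
    """Return a list of validation errors; an empty list means no problems found."""
    errors: List[str] = []

    # Schema checks: required section present, correct types, values in range.
    data = descriptor.get("data")
    if not isinstance(data, dict):
        errors.append("data: required section is missing or is not a mapping")
    else:
        symbols = data.get("symbols")
        if not isinstance(symbols, list) or not symbols:
            errors.append("data.symbols: expected a non-empty list")

    algo = descriptor.get("algo")
    if isinstance(algo, dict):
        cpus = algo.get("cpus")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(cpus, bool) or not isinstance(cpus, int) or cpus < 1:
            errors.append("algo.cpus: expected an integer of at least 1")

    # Semantic check from this page: live trading is not valid on Standalone.
    broker = descriptor.get("broker")
    if isinstance(broker, dict):
        if broker.get("mode") == "live" and deployment_model == "standalone":
            errors.append("broker.mode: live trading is not available in a Standalone deployment")

    return errors


# Hypothetical descriptor with one problem of each kind.
bad = {"data": {"symbols": []}, "algo": {"cpus": 0}, "broker": {"mode": "live"}}
for problem in validate_descriptor(bad, "standalone"):
    print(problem)
```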

Error Messages

When validation fails, the CLI reports clear, actionable error messages before it provisions anything.

Catching these errors early prevents failed provisioning runs and saves troubleshooting time.

Status and Monitoring

Check the status of your environments from the CLI; see the CLI Command Reference for the exact command.

Status output shows:

  • Environment ID and name

  • Deployment model (Standalone, Distributed Self-managed, Distributed Datafye Managed)

  • Current state (provisioning, running, stopped, failed)

  • Services running (Data Cloud, Algo Container, etc.)

  • Resource usage (CPU, memory, network)

  • Health check status
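The states listed above, together with the four lifecycle commands from Core Workflow, can be read as a small state machine. The Python sketch below is one plausible reading, not a specification: the "absent" state and the two health-check events are hypothetical names added for illustration, and it assumes a successfully provisioned environment comes up running.

```python
from typing import Dict, Iterable, Tuple

# (current state, event) -> next state. "absent" means no environment exists.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("absent", "provision"): "provisioning",
    ("provisioning", "health_checks_passed"): "running",
    ("provisioning", "health_checks_failed"): "failed",
    ("running", "stop"): "stopped",
    ("stopped", "start"): "running",
    ("running", "deprovision"): "absent",
    ("stopped", "deprovision"): "absent",
    ("failed", "deprovision"): "absent",
}


def next_state(state: str, event: str) -> str:
    """Apply one command or event, rejecting transitions the table does not allow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"cannot apply {event!r} to an environment that is {state!r}"
        ) from None


def run(events: Iterable[str], state: str = "absent") -> str:
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state


print(run(["provision", "health_checks_passed", "stop", "start"]))
```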

Common Workflows

Development Workflow

  1. Create minimal descriptors for fast iteration

  2. Provision Standalone deployment on your laptop

  3. Develop and test your algo locally

  4. Stop environment when not in use to save resources

Testing Workflow

  1. Create full descriptors with production data

  2. Provision Distributed Self-managed for cloud resources

  3. Run comprehensive backtests with parallelization

  4. Validate results and optimize parameters

  5. Deprovision when testing is complete

Production Workflow

  1. Create production descriptors with live data and broker

  2. For paper trading: Provision Standalone or Distributed Self-managed to validate

  3. For live trading: Contact Datafye Ops to provision Distributed Datafye Managed deployment

  4. Datafye Ops handles all operations and monitoring for managed environments

Prerequisites and Installation

The Datafye CLI requires Java 17+ and Docker (for local deployments). For AWS deployments, you'll also need the AWS CLI v2 configured with your credentials.

Complete installation guide: See CLI Installation for detailed prerequisites, installation instructions, and troubleshooting.

Command Reference

This page provides a conceptual overview of the CLI. For detailed command syntax, parameters, and examples, see the CLI Command Reference.

Next Steps

  • Install the CLI — Follow CLI Installation for step-by-step installation instructions

  • Understand deployment models — Read Private Cloud Deployments to understand Standalone, Distributed Self-managed, and Distributed Datafye Managed deployments

  • Learn about descriptors — Review Descriptors to understand how to configure your environments

  • Get started with a scenario — Choose your path in Choose Your Path

  • See command details — Browse CLI Command Reference for specific command syntax


Last updated: 2025-10-22
