Data Descriptors
Complete schema reference for Datafye Data Descriptors.
Overview
A DataSpec defines what market data your algo needs, including:
Which datasets to use (SIP, PrecisionAlpha, TotalView, Synthetic)
Which symbols to subscribe to
What tick and aggregate schemas to stream
How long to retain each type of data
What mode to operate in (live, paper, backtest)
Schema
Root Structure
apiVersion: datafye.io/v1
kind: DataSpec
metadata:
  name: <string>
  description: <string>
  requestedBy:
    actorType: user | algo
    actorId: <string>
mode: live | paper | backtest
datasets:
  - <dataset-object>
Field: apiVersion
Type: string
Required: Yes
Value: datafye.io/v1
Description: Schema version for this descriptor
Field: kind
Type: string
Required: Yes
Value: DataSpec
Description: Identifies this as a Data Descriptor
Field: metadata
Type: object
Required: Yes
Description: Metadata about this descriptor
metadata.name
Type: string
Required: Yes
Format: lowercase alphanumeric with hyphens (DNS-1123 subdomain)
Description: Unique identifier for this data specification
Examples: my-trading-data, backtest-2024-q1
metadata.description
Type: string
Required: No
Description: Human-readable description of this data specification
Example: SIP trades and quotes for momentum strategy
metadata.requestedBy
Type: object
Required: No
Description: Actor requesting this data (used for audit and billing)
metadata.requestedBy.actorType
Type: string
Required: Yes (if requestedBy present)
Values: user | algo
Description: Type of actor requesting data
metadata.requestedBy.actorId
Type: string
Required: Yes (if requestedBy present)
Description: Unique identifier for the requesting actor
Examples: user-123, algo-momentum-v2
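As a minimal sketch, the metadata fields above might be combined as follows. The values are taken from the examples given for each field; pairing this particular name with this particular actor is only illustrative.

```yaml
metadata:
  name: my-trading-data                 # lowercase alphanumeric with hyphens
  description: SIP trades and quotes for momentum strategy
  requestedBy:                          # optional block
    actorType: algo                     # required once requestedBy is present
    actorId: algo-momentum-v2           # required once requestedBy is present
```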
Field: mode
Type: string
Required: Yes
Values: live | paper | backtest
Description: Operating mode for this data specification
live: Real-time production trading
paper: Paper trading with live data streams
backtest: Historical data only, no live streams
Field: datasets
Type: array of dataset objects
Required: Yes
Description: List of datasets to provision
Dataset Object
Field: dataset
Type: string
Required: Yes
Values: SIP | PrecisionAlpha | TotalView | Synthetic
Description: Which dataset to use
SIP: US equities trades and quotes (provider: Polygon)
PrecisionAlpha: Alternative data signals (provider: PrecisionAlpha)
TotalView: NASDAQ Level 2 depth (provider: NASDAQ)
Synthetic: Synthetic market data for testing (provider: Datafye)
Field: provider
Type: string
Required: No (defaults based on dataset)
Description: Data provider for this dataset
Default provider by dataset:
SIP: Polygon
PrecisionAlpha: PrecisionAlpha
TotalView: NASDAQ
Synthetic: Datafye
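The fragment below is a sketch of how the provider default might play out, assuming that omitting provider selects the default listed for the dataset. The ticker is illustrative, and other dataset fields are left out for brevity.

```yaml
datasets:
  - dataset: SIP
    # provider omitted: expected to default to Polygon for SIP
    symbols:
      tickers: ["AAPL"]
  - dataset: TotalView
    provider: NASDAQ        # stated explicitly; same as the listed default
    symbols:
      tickers: ["AAPL"]
```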
Field: symbols
Type: object
Required: Yes
Description: Which symbols to subscribe to
symbols.tickers
Type: array of strings
Required: No
Description: Explicit ticker symbols or wildcards
Examples:
["AAPL", "GOOGL", "MSFT"]: Explicit list
["NV*"]: Wildcard pattern (all tickers starting with NV)
["*"]: All available symbols
symbols.universes
Type: array of strings
Required: No
Description: Pre-defined symbol universes
Values: SP500 | NDX100 | RUSSELL2000 | RUSSELL1000
Note: tickers and universes can be used together — the effective symbol list is the union of both.
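A short sketch of a symbols block that combines both fields. The specific tickers and universe are illustrative; per the note above, the effective subscription would be the union of the two.

```yaml
symbols:
  tickers: ["AAPL", "MSFT"]   # explicit tickers
  universes: ["RUSSELL2000"]  # pre-defined universe
  # effective symbol list: AAPL, MSFT, plus every RUSSELL2000 member
```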
Field: reference
Type: boolean
Required: No
Default: false
Description: Whether to provision reference data (security master, corporate actions, calendar)
Recommendation: Set to true for trading algos
Field: live
Type: object
Required: Yes (if mode is live or paper)
Description: Real-time data streams to subscribe to
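For orientation, a live block might look like the sketch below. It uses the two sub-fields documented next (live.ticks and live.aggregates); the particular schema choices are illustrative.

```yaml
live:
  ticks: "trades,quotes"                    # comma-separated tick schemas
  aggregates: "ohlc-1m,ema-1m-20,vwap-1m"   # comma-separated aggregate schemas
```

Setting either field to "none" would turn that kind of stream off.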
live.ticks
Type: string (comma-separated schema list)
Required: No
Special values: none, all, *
Description: Which tick schemas to stream in real-time
Examples:
"trades": Trades only
"trades,quotes": Trades and quotes
"all" or "*": All available tick schemas
"none": No tick streams
live.aggregates
Type: string (comma-separated schema list)
Required: No
Special values: none
Description: Which aggregate schemas to stream in real-time
Examples:
"ohlc-1m": 1-minute bars only
"ohlc-1m,ema-1m-20,vwap-1m": Multiple aggregates
"none": No aggregate streams
Field: history
Type: object
Required: No
Description: Historical data retention policies
history.ticks
Type: array of retention objects
Required: No
Description: Retention policy per tick schema
Retention object structure:
schema: Which tick schema (e.g., trades, quotes)
duration: How long to retain (e.g., 7d, 30d, P90D)
history.aggregates
Type: array of retention objects
Required: No
Description: Retention policy per aggregate schema
Retention object structure:
schema: Which aggregate schema (e.g., ohlc-1m, ema-1m-20)
duration: How long to retain (e.g., 30d, 180d, 1y)
history.reference
Type: object
Required: No
Description: Retention policy for reference data
Structure:
duration: How long to retain reference data (e.g.,365d,1y)
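Putting the three history sub-fields together, a retention policy might be sketched as below. The schemas and durations are illustrative picks from the examples above, not recommended values.

```yaml
history:
  ticks:                    # one retention object per tick schema
    - schema: trades
      duration: 30d
    - schema: quotes
      duration: 7d
  aggregates:               # one retention object per aggregate schema
    - schema: ohlc-1m
      duration: 180d
  reference:                # a single object, not an array
    duration: 1y
```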
Duration Format
Durations can be specified in two formats:
Simple: <number><unit> (e.g., 7d, 30d, 6m, 1y)
ISO-8601: P<number><unit> (e.g., P7D, P6M, P1Y)
Units:
d: Days
m: Months (30 days)
y: Years (365 days)
Special value: 0d means no retention (live only)
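The sketch below shows the two formats side by side, using the unit lengths stated above. Whether a single descriptor may mix both formats is an assumption here.

```yaml
history:
  ticks:
    - schema: trades
      duration: 7d          # simple format: 7 days
    - schema: quotes
      duration: 0d          # special value: no retention (live only)
  aggregates:
    - schema: ohlc-1m
      duration: P6M         # ISO-8601 format: 6 months, counted as 180 days
    - schema: ohlc-1d
      duration: 1y          # simple format: 365 days
```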
Tick Schemas by Dataset
SIP Tick Schemas
trades: Trade ticks
quotes: NBBO quotes
Synthetic Tick Schemas
trades: Synthetic trade ticks
quotes: Synthetic NBBO quotes
PrecisionAlpha Tick Schemas
pa-1s: 1-second signals
pa-1m: 1-minute signals
pa-1d: Daily signals
TotalView Tick Schemas
depth: Level 2 order book depth
Aggregate Schemas
Aggregate schemas are available across all datasets (applied to the dataset's tick data):
ohlc-1s: 1-second OHLC bars
ohlc-1m: 1-minute OHLC bars
ohlc-1h: 1-hour OHLC bars
ohlc-1d: 1-day OHLC bars
ema-1m-20: 1-minute 20-period EMA
ema-1m-50: 1-minute 50-period EMA
vwap-1m: 1-minute VWAP
signal-1m-12-26-9: MACD signal (12, 26, 9)
Pattern: <type>-<interval>-<params>
Complete Examples
Example 1: Basic Live Trading
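A minimal sketch of a descriptor for this scenario, assembled from the fields documented in this reference. The name, symbols, and retention values are illustrative assumptions, and reference, live, and history are placed inside the dataset object as the Dataset Object section suggests.

```yaml
apiVersion: datafye.io/v1
kind: DataSpec
metadata:
  name: basic-live-trading          # hypothetical name
  description: SIP trades and quotes for momentum strategy
mode: live
datasets:
  - dataset: SIP
    symbols:
      tickers: ["AAPL", "GOOGL", "MSFT"]
    reference: true                 # recommended for trading algos
    live:
      ticks: "trades,quotes"
      aggregates: "ohlc-1m"
    history:
      ticks:
        - schema: trades
          duration: 7d
      aggregates:
        - schema: ohlc-1m
          duration: 30d
```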
Example 2: Multi-Dataset with PrecisionAlpha
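One way such a descriptor could look, as a sketch only: SIP supplies trades and bars, while PrecisionAlpha supplies 1-minute signals for the same universe. The name, the paper mode, and the schema selections are illustrative assumptions.

```yaml
apiVersion: datafye.io/v1
kind: DataSpec
metadata:
  name: sip-plus-precisionalpha     # hypothetical name
mode: paper
datasets:
  - dataset: SIP
    symbols:
      universes: ["NDX100"]
    reference: true
    live:
      ticks: "trades"
      aggregates: "ohlc-1m,vwap-1m"
  - dataset: PrecisionAlpha
    symbols:
      universes: ["NDX100"]
    live:
      ticks: "pa-1m"                # PrecisionAlpha 1-minute signals
      aggregates: "none"
```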
Example 3: Backtest Configuration
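A sketch of a backtest descriptor under the rules in this reference: mode is backtest, a history section is present, and no live section is given. The universe and retention values are illustrative and stay within the 3-year limit.

```yaml
apiVersion: datafye.io/v1
kind: DataSpec
metadata:
  name: backtest-2024-q1
  requestedBy:
    actorType: user
    actorId: user-123
mode: backtest
datasets:
  - dataset: SIP
    symbols:
      universes: ["SP500"]
    reference: true
    history:                        # required in backtest mode
      ticks:
        - schema: trades
          duration: 90d
      aggregates:
        - schema: ohlc-1m
          duration: 1y
        - schema: ohlc-1d
          duration: 3y              # maximum allowed retention
      reference:
        duration: 1y
```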
Example 4: Wildcard Symbols
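A sketch of a wildcard subscription, here against the Synthetic dataset so it can be tried without production data. The name and the choice of dataset are illustrative assumptions.

```yaml
apiVersion: datafye.io/v1
kind: DataSpec
metadata:
  name: nv-wildcard-stream          # hypothetical name
mode: live
datasets:
  - dataset: Synthetic
    symbols:
      tickers: ["NV*"]              # all tickers starting with NV
    live:
      ticks: "trades,quotes"
      aggregates: "none"
```

Replacing the pattern with ["*"] would request all available symbols.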
Validation Rules
Required Fields
apiVersion must be datafye.io/v1
kind must be DataSpec
metadata.name is required
mode is required
datasets must have at least one dataset
Mode-Specific Rules
live/paper: Must have live section with at least one tick or aggregate schema
backtest: Must have history section, live section ignored if present
Symbol Rules
Must specify at least one of tickers or universes
Wildcard * cannot be mixed with other ticker patterns
Empty arrays not allowed
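The rules above can be illustrated with a few symbols blocks, shown as separate YAML documents. This is a sketch of how the rules read, not validator output; the rejected forms are commented out.

```yaml
# Accepted: at least one of tickers or universes is given
symbols:
  universes: ["SP500"]
---
# Accepted: "*" is the only ticker pattern
symbols:
  tickers: ["*"]
---
# Expected to be rejected: "*" mixed with another ticker pattern
# symbols:
#   tickers: ["*", "AAPL"]
#
# Expected to be rejected: empty array
# symbols:
#   tickers: []
symbols:
  tickers: ["AAPL"]
```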
Duration Rules
Must be positive value (except 0d for no retention)
Cannot exceed 3 years (3y or P1095D)
Tick retention typically shorter than aggregate retention
Schema Rules
Tick schemas must be valid for the chosen dataset
Aggregate schemas are universal across datasets
Cannot specify all for aggregates (only for ticks)
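As a sketch of how these rules apply to a single dataset, consider a TotalView entry. The ticker is illustrative, and the commented-out lines show values that the rules above would be expected to reject.

```yaml
datasets:
  - dataset: TotalView
    symbols:
      tickers: ["AAPL"]
    live:
      ticks: "depth"            # valid: depth is a TotalView tick schema
      aggregates: "ohlc-1m"     # valid: aggregate schemas apply to every dataset
      # ticks: "trades"         # not valid here: trades belongs to SIP and Synthetic
      # aggregates: "all"       # not valid: "all" is accepted for ticks only
```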
Related Documentation
Concept guide: Data Descriptors
Quickstart: Understanding the Data Descriptor
CLI usage: foundry provision, trading provision