Data Access Modes
When accessing data from the Datafye Data Cloud, you have different mechanisms available depending on your deployment scenario and data access patterns. This page explains both the API mechanisms you can use and the data delivery modes available.
API Mechanisms
Datafye provides two primary API mechanisms for accessing the Data Cloud:
REST and WebSocket APIs
The REST and WebSocket APIs are designed for own-container scenarios where you run your own algorithmic trading containers:
Foundry: Data Cloud Only - Access market data via REST and WebSocket
Trading: Data Cloud + Broker - Access market data and broker connectivity
These APIs provide a standard, language-agnostic interface that works with any containerized application you build:
REST API
Request-response pattern for on-demand data queries
Fetch reference data, current snapshots, historical data
Standard HTTP/JSON interface
URLs vary by deployment:
localhost:8080, api.rumi.local, or <user>-<type>-<env>-api.datafye.io
WebSocket API
Real-time streaming pattern for live market data
Subscribe to quotes, trades, aggregates
Low-latency push-based updates
URLs vary by deployment:
localhost:8080, api.rumi.local, or <user>-<type>-<env>-api.datafye.io
See API Reference Overview for complete URL details.
Use REST and WebSocket APIs when:
You have existing algo infrastructure you want to preserve
You need maximum control over your application architecture
You want to use a specific programming language or framework
You're integrating Datafye data into an existing system
You want to maintain complete control over your proprietary algo logic
Algo Container SDK
The Algo Container SDK is designed for Datafye container scenarios where Datafye runs your algo:
Foundry: Full Stack - Build and optimize algos using the SDK
Trading: Full Stack - Trade algos built with the SDK
The SDK provides a higher-level, opinionated framework that handles infrastructure concerns for you:
What the SDK Provides:
Type-safe market data access with automatic subscription management
Event-driven algo lifecycle (onMarketData, onTimer, etc.)
Built-in backtesting and optimization support
Portfolio and position management abstractions
Integrated broker connectivity for trading
AI-assisted vibe coding through MCP server integration
Use the Algo Container SDK when:
You're building new algos from scratch
You want integrated backtesting and optimization
You want AI-assisted development (vibe coding)
You plan to publish algos to the Datafye Marketplace
You prefer a complete managed solution over building infrastructure
Key Differences
| | REST/WebSocket APIs | Algo Container SDK |
|---|---|---|
| Container Ownership | You own the container | Datafye owns the container |
| Language | Any language | Java (Python/TypeScript coming soon) |
| Infrastructure | You manage | Datafye manages |
| Backtesting | Bring your own | Integrated |
| Optimization | Bring your own | Integrated (genetic algorithms) |
| Marketplace | ❌ Not supported | ✅ Required for publishing |
| AI-Assisted Vibe Coding | ❌ Not available | ✅ Available |
| Control | Maximum | Opinionated framework |
| Learning Curve | Steeper (more decisions) | Gentler (guided path) |
Data Delivery Modes
Regardless of which API mechanism you use, the Data Cloud delivers data using one of three modes:
Subscription (Live Data)
The subscription mode is used for real-time live data. Consumers subscribe to data of interest, and the Data Cloud pushes updates as new data arrives.
Available via:
WebSocket API (for own-container scenarios)
Datafye Java Client API
Datafye Algo Container SDK
Use subscription for:
Real-time quotes and trades
Live OHLC bars and aggregates
Streaming technical indicators
Any scenario requiring immediate data updates
Example (WebSocket API):
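A minimal Java sketch of the subscription flow over the WebSocket API. The message schema, channel name, and endpoint path are assumptions for illustration only — consult the WebSocket API Reference for the actual wire format.

```java
// Hypothetical sketch: the subscribe-message schema below is an assumption,
// not the documented Datafye wire format; see the WebSocket API Reference.
public class QuoteSubscriber {

    // Build a subscribe message for live quotes on a given dataset.
    static String subscribeMessage(String dataset, String symbol) {
        return "{\"action\":\"subscribe\",\"dataset\":\"" + dataset
             + "\",\"channel\":\"quotes\",\"symbols\":[\"" + symbol + "\"]}";
    }

    public static void main(String[] args) {
        String msg = subscribeMessage("SIP", "AAPL");
        System.out.println(msg);
        // Connecting and sending (endpoint varies by deployment, e.g. ws://localhost:8080):
        // java.net.http.HttpClient.newHttpClient().newWebSocketBuilder()
        //     .buildAsync(java.net.URI.create("ws://localhost:8080/ws"), listener)
        //     .thenAccept(ws -> ws.sendText(msg, true));
    }
}
```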
Example (Algo SDK):
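A sketch of the same subscription handled by the SDK's event-driven lifecycle. MarketDataEvent and the callback signature are stand-ins for the real SDK types (see the SDK Reference); only the shape of the pattern is shown, with the runtime's callbacks simulated in main.

```java
// Hypothetical sketch of the SDK's event-driven lifecycle (onMarketData, onTimer, ...).
// MarketDataEvent and the callback signature are stand-ins; the real SDK types differ.
public class MomentumAlgo {
    record MarketDataEvent(String symbol, double price) {}

    double lastPrice = Double.NaN;

    // The SDK would invoke this for every update on the algo's declared subscriptions.
    void onMarketData(MarketDataEvent e) {
        if (!Double.isNaN(lastPrice) && e.price() > lastPrice) {
            System.out.println("uptick on " + e.symbol() + " at " + e.price());
        }
        lastPrice = e.price();
    }

    public static void main(String[] args) {
        MomentumAlgo algo = new MomentumAlgo();
        // Simulate two ticks; in production the container runtime drives these callbacks.
        algo.onMarketData(new MarketDataEvent("AAPL", 100.0));
        algo.onMarketData(new MarketDataEvent("AAPL", 101.0));
    }
}
```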
Streaming (Historical Data)
The streaming mode is used for high-volume historical data retrieval. The Data Cloud opens a private channel and streams historical data directly to the consumer.
Available via:
Datafye Java Client API (for own-container scenarios)
Datafye Algo Container SDK (for backtesting)
Use streaming for:
Backtesting with large historical datasets
Parallel replay of multiple symbols
High-throughput historical data analysis
Time-series analysis requiring sequential data
Characteristics:
High parallelism for concurrent streaming
Efficient for large date ranges
Maintains temporal ordering
Supports multiple concurrent streams
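The characteristics above — several time-sorted per-symbol streams that must be consumed in global temporal order — amount to a k-way merge on the consumer side. The sketch below illustrates that consumption pattern only; it does not use the Datafye Java Client API, and the Tick type is a stand-in.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Sketch of consuming several historical streams in temporal order: each
// symbol's stream is already time-sorted, and a k-way merge preserves
// global ordering across streams. Tick is a stand-in, not a client API class.
public class StreamMerge {
    record Tick(long ts, String symbol, double price) {}

    static List<Tick> mergeByTime(List<List<Tick>> streams) {
        // Min-heap keyed on timestamp; entries are (stream index, position).
        Comparator<int[]> byTime =
            Comparator.comparingLong(e -> streams.get(e[0]).get(e[1]).ts());
        PriorityQueue<int[]> heap = new PriorityQueue<>(byTime);
        for (int i = 0; i < streams.size(); i++)
            if (!streams.get(i).isEmpty()) heap.add(new int[]{i, 0});
        List<Tick> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] e = heap.poll();
            List<Tick> s = streams.get(e[0]);
            out.add(s.get(e[1]));
            if (e[1] + 1 < s.size()) heap.add(new int[]{e[0], e[1] + 1});
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tick> aapl = List.of(new Tick(1, "AAPL", 100), new Tick(3, "AAPL", 101));
        List<Tick> msft = List.of(new Tick(2, "MSFT", 300));
        mergeByTime(List.of(aapl, msft))
            .forEach(t -> System.out.println(t.ts() + " " + t.symbol()));
    }
}
```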
Fetch (On-Demand Queries)
The fetch mode uses request-response patterns for on-demand data queries. You request specific data and receive an immediate response.
Available via:
REST API (for own-container scenarios)
Datafye Java Client API
Datafye Algo Container SDK
Use fetch for:
Reference data lookups (security master)
Current market snapshots (last trade, top of book)
Specific historical data points
On-demand queries without streaming overhead
Example (REST API):
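A minimal Java sketch of a fetch-style snapshot query over the REST API. The route /v1/stocks/snapshot and its query parameters are assumptions for illustration; see the REST API Reference for the real endpoints.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical sketch: the route and query parameters are assumptions,
// not documented Datafye endpoints; see the REST API Reference.
public class SnapshotFetch {
    static HttpRequest snapshotRequest(String baseUrl, String symbol) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/v1/stocks/snapshot?dataset=SIP&symbol=" + symbol))
            .timeout(Duration.ofSeconds(5))
            .GET()
            .build();
    }

    public static void main(String[] args) {
        HttpRequest req = snapshotRequest("http://localhost:8080", "AAPL");
        System.out.println(req.uri());
        // Sending it (requires a reachable deployment):
        // HttpResponse<String> resp = HttpClient.newHttpClient()
        //     .send(req, HttpResponse.BodyHandlers.ofString());
    }
}
```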
Example (Algo SDK):
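A sketch of the equivalent fetch through the SDK. ReferenceData and fetchReference are hypothetical stand-ins (here backed by an in-memory map) showing the request-response shape of a security-master lookup, not the real SDK API.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of a fetch-style reference lookup. ReferenceData and
// fetchReference are stand-ins for the SDK's security-master access.
public class ReferenceLookup {
    record ReferenceData(String symbol, String exchange, String name) {}

    // Stand-in for the SDK's fetch call; here backed by an in-memory map.
    static Optional<ReferenceData> fetchReference(Map<String, ReferenceData> master,
                                                  String symbol) {
        return Optional.ofNullable(master.get(symbol));
    }

    public static void main(String[] args) {
        Map<String, ReferenceData> master = Map.of(
            "AAPL", new ReferenceData("AAPL", "XNAS", "Apple Inc."));
        fetchReference(master, "AAPL").ifPresent(r -> System.out.println(r.name()));
    }
}
```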
Choosing the Right Combination
The optimal combination depends on your scenario and data access patterns:
Own-Container Scenarios
Foundry: Data Cloud Only
Use REST API for reference data and historical queries
Use WebSocket API for real-time subscription to live data
Use Java Client API for high-volume historical streaming (if needed)
Trading: Data Cloud + Broker
Same as Data Cloud Only, plus broker connectivity via REST/WebSocket
Datafye Container Scenarios
Foundry: Full Stack
SDK uses subscription for live data during development
SDK uses streaming for backtesting
SDK uses fetch for reference data and on-demand queries
Trading: Full Stack
SDK uses subscription for live data during trading
SDK uses fetch for reference data and snapshots
Access Mode Comparison
| Mode | Data | Pattern | Latency | Throughput | Best For |
|---|---|---|---|---|---|
| Subscription | Live (ticks, aggregates) | Push | Low | Continuous | Real-time trading, live monitoring |
| Streaming | Historical | Push | Medium | High | Backtesting, bulk analysis |
| Fetch | All (reference, live, historical) | Pull | Low | Low-Medium | Snapshots, lookups, specific queries |
API Organization
Both API mechanisms are organized around the same conceptual structures:
Asset Classes
Data is organized by asset class:
stocks - Equity securities
crypto - Cryptocurrencies
options - Listed options (future)
futures - Futures contracts (future)
forex - Foreign exchange (future)
Datasets
Both mechanisms access data through datasets (see Datasets):
Each dataset contains Reference, Live Ticks, Live Aggregates, and Historical services
You specify which dataset to query when making API calls or declaring SDK dependencies
The key difference:
REST/WebSocket APIs: You specify the dataset as a parameter (?dataset=SIP)
Algo Container SDK: You declare dataset requirements in your Algo Descriptor
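To make the contrast concrete, here is the same dataset selection expressed both ways. Both snippets are illustrative sketches: the REST route and the descriptor field names are assumptions, not the documented schemas.

```
# REST/WebSocket: dataset chosen per request
GET /v1/stocks/history?dataset=SIP&symbol=AAPL

# Algo Descriptor (SDK): dataset declared up front as a requirement
# (field names are illustrative)
datasets:
  - name: SIP
    assetClass: stocks
    categories: [reference, live, history]
```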
Data Categories
Both mechanisms organize data into categories:
reference - Static metadata and security master
live - Current market data (ticks and aggregates)
history - Historical data for backtesting and analysis
Best Practices
For Own-Container Scenarios
Use WebSocket for live data - More efficient than polling REST API
Use REST for reference data - Simple lookups don't need WebSocket overhead
Cache reference data - Security master changes infrequently
Consider Java Client API for backtesting - Streaming mode is much faster than REST for bulk historical data
For Datafye Container Scenarios
Declare data requirements in Algo Descriptor - SDK manages subscriptions automatically
Use fetch sparingly in live trading - Subscription is more efficient
Leverage streaming for backtesting - SDK handles this automatically
Trust the framework - SDK optimizes data access for you
Next Steps
Now that you understand data access modes, learn more about:
Datasets - Understanding what datasets are and how they're structured
Data APIs and Datasets - How REST/WebSocket APIs relate to datasets
Data Descriptors - How to configure your data requirements
REST API Reference - Detailed REST API documentation
WebSocket API Reference - Detailed WebSocket API documentation
Algo Container SDK Reference - Detailed SDK documentation
Last updated: 2025-10-14