Data Access Modes

The Datafye Data Cloud can be accessed through different mechanisms depending on your deployment scenario and data access patterns. This page explains both the API mechanisms you can use and the data delivery modes they support.

API Mechanisms

Datafye provides two primary API mechanisms for accessing the Data Cloud:

REST and WebSocket APIs

The REST and WebSocket APIs are designed for own-container scenarios where you run your own algorithmic trading containers:

  • Foundry: Data Cloud Only - Access market data via REST and WebSocket

  • Trading: Data Cloud + Broker - Access market data and broker connectivity

These APIs provide a standard, language-agnostic interface that works with any containerized application you build:

REST API

  • Request-response pattern for on-demand data queries

  • Fetch reference data, current snapshots, historical data

  • Standard HTTP/JSON interface

  • URLs vary by deployment: localhost:8080, api.rumi.local, or <user>-<type>-<env>-api.datafye.io

WebSocket API

  • Real-time streaming pattern for live market data

  • Subscribe to quotes, trades, aggregates

  • Low-latency push-based updates

  • URLs vary by deployment: localhost:8080, api.rumi.local, or <user>-<type>-<env>-api.datafye.io

See API Reference Overview for complete URL details.
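As a quick illustration of the deployment-dependent URL patterns above, the sketch below builds a base URL for each case. The scheme/port choices and the sample values `alice`, `foundry`, and `dev` are hypothetical placeholders for `<user>`, `<type>`, and `<env>`.

```python
def base_url(deployment: str, user: str = "", typ: str = "", env: str = "") -> str:
    """Return the API base URL for a deployment (patterns from the list above)."""
    if deployment == "local":
        return "http://localhost:8080"
    if deployment == "rumi":
        return "http://api.rumi.local"
    # Hosted pattern: <user>-<type>-<env>-api.datafye.io
    return f"https://{user}-{typ}-{env}-api.datafye.io"

print(base_url("cloud", "alice", "foundry", "dev"))
# → https://alice-foundry-dev-api.datafye.io
```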

Use REST and WebSocket APIs when:

  • You have existing algo infrastructure you want to preserve

  • You need maximum control over your application architecture

  • You want to use a specific programming language or framework

  • You're integrating Datafye data into an existing system

  • You want to maintain complete control over your proprietary algo logic

Algo Container SDK

The Algo Container SDK is designed for Datafye container scenarios where Datafye runs your algo:

  • Foundry: Full Stack - Build and optimize algos using the SDK

  • Trading: Full Stack - Trade algos built with the SDK

The SDK provides a higher-level, opinionated framework that handles infrastructure concerns for you:

What the SDK Provides:

  • Type-safe market data access with automatic subscription management

  • Event-driven algo lifecycle (onMarketData, onTimer, etc.)

  • Built-in backtesting and optimization support

  • Portfolio and position management abstractions

  • Integrated broker connectivity for trading

  • AI-assisted vibe coding through MCP server integration

Use the Algo Container SDK when:

  • You're building new algos from scratch

  • You want integrated backtesting and optimization

  • You want AI-assisted development (vibe coding)

  • You plan to publish algos to the Datafye Marketplace

  • You prefer a complete managed solution over building infrastructure

Key Differences

| Aspect | REST/WebSocket APIs | Algo Container SDK |
| --- | --- | --- |
| Container Ownership | You own the container | Datafye owns the container |
| Language | Any language | Java (Python/TypeScript coming soon) |
| Infrastructure | You manage | Datafye manages |
| Backtesting | Bring your own | Integrated |
| Optimization | Bring your own | Integrated (genetic algorithms) |
| Marketplace | ❌ Not supported | ✅ Required for publishing |
| AI-Assisted Vibe Coding | ❌ Not available | ✅ Available |
| Control | Maximum | Opinionated framework |
| Learning Curve | Steeper (more decisions) | Gentler (guided path) |

Data Delivery Modes

Regardless of which API mechanism you use, the Data Cloud delivers data using one of three modes:

Subscription (Live Data)

The subscription mode is used for real-time live data. Consumers subscribe to data of interest, and the Data Cloud pushes updates as new data arrives.

Available via:

  • WebSocket API (for own-container scenarios)

  • Datafye Java Client API

  • Datafye Algo Container SDK

Use subscription for:

  • Real-time quotes and trades

  • Live OHLC bars and aggregates

  • Streaming technical indicators

  • Any scenario requiring immediate data updates

Example (WebSocket API):
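A minimal sketch of subscribing to live quotes over the WebSocket API. The endpoint path, channel name, and subscribe-message schema are assumptions for illustration; see the API Reference Overview for the actual contract.

```python
import json

def build_subscribe_message(dataset: str, symbols: list[str]) -> str:
    """Build a subscribe request for live quotes (assumed message schema)."""
    return json.dumps({
        "action": "subscribe",
        "dataset": dataset,
        "channels": ["quotes"],
        "symbols": symbols,
    })

async def stream_quotes(url: str = "ws://localhost:8080/v1/stocks/live") -> None:
    # Requires the third-party `websockets` package; run against a live deployment.
    import websockets  # imported lazily so the message builder stays dependency-free
    async with websockets.connect(url) as ws:
        await ws.send(build_subscribe_message("SIP", ["AAPL", "MSFT"]))
        async for raw in ws:  # the Data Cloud pushes updates as they arrive
            print(json.loads(raw))
```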

Example (Algo SDK):
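With the SDK, subscriptions are managed for you and data arrives through lifecycle callbacks such as `onMarketData`. The real SDK is Java; the Python-style sketch below is purely illustrative, with hypothetical class and tick-field names, and mirrors only the event-driven shape described above.

```python
class Algo:
    """Illustrative stand-in for the SDK's algo base class (hypothetical)."""
    def onMarketData(self, tick: dict) -> None: ...
    def onTimer(self) -> None: ...

class MomentumAlgo(Algo):
    def __init__(self) -> None:
        self.last_price: dict[str, float] = {}

    def onMarketData(self, tick: dict) -> None:
        # The SDK pushes subscribed ticks here; no manual subscription code,
        # because data requirements are declared in the Algo Descriptor.
        prev = self.last_price.get(tick["symbol"])
        self.last_price[tick["symbol"]] = tick["price"]
        if prev is not None and tick["price"] > prev:
            pass  # e.g. act via the SDK's portfolio/broker abstractions

algo = MomentumAlgo()
algo.onMarketData({"symbol": "AAPL", "price": 187.5})
```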

Streaming (Historical Data)

The streaming mode is used for high-volume historical data retrieval. The Data Cloud opens a private channel and streams historical data directly to the consumer.

Available via:

  • Datafye Java Client API (for own-container scenarios)

  • Datafye Algo Container SDK (for backtesting)

Use streaming for:

  • Backtesting with large historical datasets

  • Parallel replay of multiple symbols

  • High-throughput historical data analysis

  • Time-series analysis requiring sequential data

Characteristics:

  • High parallelism for concurrent streaming

  • Efficient for large date ranges

  • Maintains temporal ordering

  • Supports multiple concurrent streams

Streaming is not available via REST or WebSocket APIs. For own-container scenarios needing high-volume historical data, use the Java Client API.

Fetch (On-Demand Queries)

The fetch mode uses request-response patterns for on-demand data queries. You request specific data and receive an immediate response.

Available via:

  • REST API (for own-container scenarios)

  • Datafye Java Client API

  • Datafye Algo Container SDK

Use fetch for:

  • Reference data lookups (security master)

  • Current market snapshots (last trade, top of book)

  • Specific historical data points

  • On-demand queries without streaming overhead

Example (REST API):
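A minimal sketch of a fetch-mode call against the REST API: build the request URL, issue one HTTP GET, and receive an immediate response. The endpoint path is an assumption for illustration; see the API Reference Overview for exact routes.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # varies by deployment (see above)

def snapshot_url(asset_class: str, symbol: str, dataset: str) -> str:
    """Build a hypothetical current-snapshot URL (dataset passed as a query parameter)."""
    return f"{BASE_URL}/v1/{asset_class}/live/snapshot/{symbol}?dataset={dataset}"

def fetch_snapshot(symbol: str) -> dict:
    # Request-response: one query, one immediate answer. Run against a live deployment.
    with urllib.request.urlopen(snapshot_url("stocks", symbol, "SIP")) as resp:
        return json.loads(resp.read())

print(snapshot_url("stocks", "AAPL", "SIP"))
```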

Example (Algo SDK):
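In SDK style, fetch is a plain method call against a service the framework provides. The real SDK is Java; the sketch below uses hypothetical names solely to show the request-response shape of a reference-data lookup.

```python
class ReferenceService:
    """Illustrative stand-in for the SDK's reference-data (security master) service."""
    def __init__(self, securities: dict) -> None:
        self._securities = securities

    def lookup(self, symbol: str) -> dict:
        # Fetch is request-response: one query, one immediate answer.
        return self._securities[symbol]

refs = ReferenceService({"AAPL": {"name": "Apple Inc.", "exchange": "XNAS"}})
print(refs.lookup("AAPL")["name"])  # → Apple Inc.
```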

Choosing the Right Combination

The optimal combination depends on your scenario and data access patterns:

Own-Container Scenarios

Foundry: Data Cloud Only

  • Use REST API for reference data and historical queries

  • Use WebSocket API for real-time subscription to live data

  • Use Java Client API for high-volume historical streaming (if needed)

Trading: Data Cloud + Broker

  • Same as Data Cloud Only, plus broker connectivity via REST/WebSocket

Datafye Container Scenarios

Foundry: Full Stack

  • SDK uses subscription for live data during development

  • SDK uses streaming for backtesting

  • SDK uses fetch for reference data and on-demand queries

Trading: Full Stack

  • SDK uses subscription for live data during trading

  • SDK uses fetch for reference data and snapshots

Access Mode Comparison

| Mode | Data Type | Pattern | Latency | Volume | Best For |
| --- | --- | --- | --- | --- | --- |
| Subscription | Live (ticks, aggregates) | Push | Low | Continuous | Real-time trading, live monitoring |
| Streaming | Historical | Push | Medium | High | Backtesting, bulk analysis |
| Fetch | All (reference, live, historical) | Pull | Low | Low-Medium | Snapshots, lookups, specific queries |

API Organization

Both API mechanisms are organized around the same conceptual structures:

Asset Classes

Data is organized by asset class:

  • stocks - Equity securities

  • crypto - Cryptocurrencies

  • options - Listed options (future)

  • futures - Futures contracts (future)

  • forex - Foreign exchange (future)

Datasets

Both mechanisms access data through datasets (see Datasets):

  • Each dataset contains Reference, Live Ticks, Live Aggregates, and Historical services

  • You specify which dataset to query when making API calls or declaring SDK dependencies

The key difference:

  • REST/WebSocket APIs: You specify the dataset as a parameter (?dataset=SIP)

  • Algo Container SDK: You declare dataset requirements in your Algo Descriptor
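The contrast can be sketched side by side. The query-parameter form comes from the text above; the descriptor shape is a hypothetical illustration of declaring the same requirement up front.

```python
from urllib.parse import urlencode

# REST/WebSocket: the dataset is chosen per call, as a query parameter.
rest_url = f"http://localhost:8080/v1/stocks/history/bars/AAPL?{urlencode({'dataset': 'SIP'})}"

# Algo Container SDK: the dataset is declared once in the Algo Descriptor
# (hypothetical field names; see the SDK documentation for the real schema).
algo_descriptor = {
    "name": "my-momentum-algo",
    "datasets": ["SIP"],
    "subscriptions": [{"channel": "quotes", "symbols": ["AAPL"]}],
}

print(rest_url)
```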

Data Categories

Both mechanisms organize data into categories:

  • reference - Static metadata and security master

  • live - Current market data (ticks and aggregates)

  • history - Historical data for backtesting and analysis

Best Practices

For Own-Container Scenarios

  1. Use WebSocket for live data - More efficient than polling REST API

  2. Use REST for reference data - Simple lookups don't need WebSocket overhead

  3. Cache reference data - Security master changes infrequently

  4. Consider Java Client API for backtesting - Streaming mode is much faster than REST for bulk historical data

For Datafye Container Scenarios

  1. Declare data requirements in Algo Descriptor - SDK manages subscriptions automatically

  2. Use fetch sparingly in live trading - Subscription is more efficient

  3. Leverage streaming for backtesting - SDK handles this automatically

  4. Trust the framework - SDK optimizes data access for you

Next Steps

Now that you understand data access modes, see the API Reference Overview for complete endpoint details and Datasets for how the underlying data is organized.


Last updated: 2025-10-14
