Data Access Modes
When accessing data from the Datafye Data Cloud, you have different mechanisms available depending on your deployment scenario and data access patterns. This page explains both the API mechanisms you can use and the data delivery modes available.
API Mechanisms
Datafye provides two primary API mechanisms for accessing the Data Cloud:
REST and WebSocket APIs
The REST and WebSocket APIs are designed for own-container scenarios where you run your own algorithmic trading containers:
Foundry: Data Cloud Only - Access market data via REST and WebSocket
Trading: Data Cloud + Broker - Access market data and broker connectivity
These APIs provide a standard, language-agnostic interface that works with any containerized application you build:
REST API
Request-response pattern for on-demand data queries
Fetch reference data, current snapshots, historical data
Standard HTTP/JSON interface
URLs vary by deployment:
localhost:8080, api.rumi.local, or <user>-<type>-<env>-api.datafye.io
WebSocket API
Real-time streaming pattern for live market data
Subscribe to quotes, trades, aggregates
Low-latency push-based updates
URLs vary by deployment:
localhost:8080, api.rumi.local, or <user>-<type>-<env>-api.datafye.io
See API Reference Overview for complete URL details.
Use REST and WebSocket APIs when:
You have existing algo infrastructure you want to preserve
You need maximum control over your application architecture
You want to use a specific programming language or framework
You're integrating Datafye data into an existing system
You want to maintain complete control over your proprietary algo logic
Algo Container SDK
The Algo Container SDK is designed for Datafye container scenarios where Datafye runs your algo:
Foundry: Full Stack - Build and optimize algos using the SDK
Trading: Full Stack - Trade algos built with the SDK
The SDK provides a higher-level, opinionated framework that handles infrastructure concerns for you:
What the SDK Provides:
Type-safe market data access with automatic subscription management
Event-driven algo lifecycle (onMarketData, onTimer, etc.)
Built-in backtesting and optimization support
Portfolio and position management abstractions
Integrated broker connectivity for trading
AI-assisted vibe coding through MCP server integration
Use the Algo Container SDK when:
You're building new algos from scratch
You want integrated backtesting and optimization
You want AI-assisted development (vibe coding)
You plan to publish algos to the Datafye Marketplace
You prefer a complete managed solution over building infrastructure
Key Differences
| | REST/WebSocket APIs | Algo Container SDK |
|---|---|---|
| Container Ownership | You own the container | Datafye owns the container |
| Language | Any language | Java (Python/TypeScript coming soon) |
| Infrastructure | You manage | Datafye manages |
| Backtesting | Bring your own | Integrated |
| Optimization | Bring your own | Integrated (genetic algorithms) |
| Marketplace | ❌ Not supported | ✅ Required for publishing |
| AI-Assisted Vibe Coding | ❌ Not available | ✅ Available |
| Control | Maximum | Opinionated framework |
| Learning Curve | Steeper (more decisions) | Gentler (guided path) |
Data Delivery Modes
Regardless of which API mechanism you use, the Data Cloud delivers data using one of three modes:
Subscription (Live Data)
The subscription mode is used for real-time live data. Consumers subscribe to data of interest, and the Data Cloud pushes updates as new data arrives.
Available via:
WebSocket API (for own-container scenarios)
Datafye Java Client API
Datafye Algo Container SDK
Use subscription for:
Real-time quotes and trades
Live OHLC bars and aggregates
Streaming technical indicators
Any scenario requiring immediate data updates
Example (WebSocket API):
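A minimal Java sketch of the subscription flow over the WebSocket API. The message schema, channel name, and endpoint path are assumptions for illustration only — consult the WebSocket API Reference for the actual wire format.

```java
// Hypothetical sketch: the subscribe-message schema below is an assumption,
// not the documented Datafye wire format; see the WebSocket API Reference.
public class QuoteSubscriber {

    // Build a subscribe message for live quotes on a given dataset.
    static String subscribeMessage(String dataset, String symbol) {
        return "{\"action\":\"subscribe\",\"dataset\":\"" + dataset
             + "\",\"channel\":\"quotes\",\"symbols\":[\"" + symbol + "\"]}";
    }

    public static void main(String[] args) {
        String msg = subscribeMessage("SIP", "AAPL");
        System.out.println(msg);
        // Connecting and sending (endpoint varies by deployment, e.g. ws://localhost:8080):
        // java.net.http.HttpClient.newHttpClient().newWebSocketBuilder()
        //     .buildAsync(java.net.URI.create("ws://localhost:8080/ws"), listener)
        //     .thenAccept(ws -> ws.sendText(msg, true));
    }
}
```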
Example (Algo SDK):
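A sketch of the same subscription handled by the SDK's event-driven lifecycle. MarketDataEvent and the callback signature are stand-ins for the real SDK types (see the SDK Reference); only the shape of the pattern is shown, with the runtime's callbacks simulated in main.

```java
// Hypothetical sketch of the SDK's event-driven lifecycle (onMarketData, onTimer, ...).
// MarketDataEvent and the callback signature are stand-ins; the real SDK types differ.
public class MomentumAlgo {
    record MarketDataEvent(String symbol, double price) {}

    double lastPrice = Double.NaN;

    // The SDK would invoke this for every update on the algo's declared subscriptions.
    void onMarketData(MarketDataEvent e) {
        if (!Double.isNaN(lastPrice) && e.price() > lastPrice) {
            System.out.println("uptick on " + e.symbol() + " at " + e.price());
        }
        lastPrice = e.price();
    }

    public static void main(String[] args) {
        MomentumAlgo algo = new MomentumAlgo();
        // Simulate two ticks; in production the container runtime drives these callbacks.
        algo.onMarketData(new MarketDataEvent("AAPL", 100.0));
        algo.onMarketData(new MarketDataEvent("AAPL", 101.0));
    }
}
```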
Streaming (Historical Data)
The streaming mode is used for high-volume historical data retrieval. The Data Cloud opens a private channel and streams historical data directly to the consumer.
Available via:
Datafye Java Client API (for own-container scenarios)
Datafye Algo Container SDK (for backtesting)
Use streaming for:
Backtesting with large historical datasets
Parallel replay of multiple symbols
High-throughput historical data analysis
Time-series analysis requiring sequential data
Characteristics:
High parallelism for concurrent streaming
Efficient for large date ranges
Maintains temporal ordering
Supports multiple concurrent streams
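The characteristics above — several time-sorted per-symbol streams that must be consumed in global temporal order — amount to a k-way merge on the consumer side. The sketch below illustrates that consumption pattern only; it does not use the Datafye Java Client API, and the Tick type is a stand-in.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Sketch of consuming several historical streams in temporal order: each
// symbol's stream is already time-sorted, and a k-way merge preserves
// global ordering across streams. Tick is a stand-in, not a client API class.
public class StreamMerge {
    record Tick(long ts, String symbol, double price) {}

    static List<Tick> mergeByTime(List<List<Tick>> streams) {
        // Min-heap keyed on timestamp; entries are (stream index, position).
        Comparator<int[]> byTime =
            Comparator.comparingLong(e -> streams.get(e[0]).get(e[1]).ts());
        PriorityQueue<int[]> heap = new PriorityQueue<>(byTime);
        for (int i = 0; i < streams.size(); i++)
            if (!streams.get(i).isEmpty()) heap.add(new int[]{i, 0});
        List<Tick> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] e = heap.poll();
            List<Tick> s = streams.get(e[0]);
            out.add(s.get(e[1]));
            if (e[1] + 1 < s.size()) heap.add(new int[]{e[0], e[1] + 1});
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tick> aapl = List.of(new Tick(1, "AAPL", 100), new Tick(3, "AAPL", 101));
        List<Tick> msft = List.of(new Tick(2, "MSFT", 300));
        mergeByTime(List.of(aapl, msft))
            .forEach(t -> System.out.println(t.ts() + " " + t.symbol()));
    }
}
```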
Fetch (On-Demand Queries)
The fetch mode uses request-response patterns for on-demand data queries. You request specific data and receive an immediate response.
Available via:
REST API (for own-container scenarios)
Datafye Java Client API
Datafye Algo Container SDK
Use fetch for:
Reference data lookups (security master)
Current market snapshots (last trade, top of book)
Specific historical data points
On-demand queries without streaming overhead
Example (REST API):
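A minimal Java sketch of a fetch-style snapshot query over the REST API. The route /v1/stocks/snapshot and its query parameters are assumptions for illustration; see the REST API Reference for the real endpoints.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical sketch: the route and query parameters are assumptions,
// not documented Datafye endpoints; see the REST API Reference.
public class SnapshotFetch {
    static HttpRequest snapshotRequest(String baseUrl, String symbol) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/v1/stocks/snapshot?dataset=SIP&symbol=" + symbol))
            .timeout(Duration.ofSeconds(5))
            .GET()
            .build();
    }

    public static void main(String[] args) {
        HttpRequest req = snapshotRequest("http://localhost:8080", "AAPL");
        System.out.println(req.uri());
        // Sending it (requires a reachable deployment):
        // HttpResponse<String> resp = HttpClient.newHttpClient()
        //     .send(req, HttpResponse.BodyHandlers.ofString());
    }
}
```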
Example (Algo SDK):
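A sketch of the equivalent fetch through the SDK. ReferenceData and fetchReference are hypothetical stand-ins (here backed by an in-memory map) showing the request-response shape of a security-master lookup, not the real SDK API.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of a fetch-style reference lookup. ReferenceData and
// fetchReference are stand-ins for the SDK's security-master access.
public class ReferenceLookup {
    record ReferenceData(String symbol, String exchange, String name) {}

    // Stand-in for the SDK's fetch call; here backed by an in-memory map.
    static Optional<ReferenceData> fetchReference(Map<String, ReferenceData> master,
                                                  String symbol) {
        return Optional.ofNullable(master.get(symbol));
    }

    public static void main(String[] args) {
        Map<String, ReferenceData> master = Map.of(
            "AAPL", new ReferenceData("AAPL", "XNAS", "Apple Inc."));
        fetchReference(master, "AAPL").ifPresent(r -> System.out.println(r.name()));
    }
}
```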
Choosing the Right Combination
The optimal combination depends on your scenario and data access patterns:
Own-Container Scenarios
Foundry: Data Cloud Only
Use REST API for reference data and historical queries
Use WebSocket API for real-time subscription to live data
Use Java Client API for high-volume historical streaming (if needed)
Trading: Data Cloud + Broker
Same as Data Cloud Only, plus broker connectivity via REST/WebSocket
Datafye Container Scenarios
Foundry: Full Stack
SDK uses subscription for live data during development
SDK uses streaming for backtesting
SDK uses fetch for reference data and on-demand queries
Trading: Full Stack
SDK uses subscription for live data during trading
SDK uses fetch for reference data and snapshots
Access Mode Comparison
| Mode | Data | Pattern | Latency | Throughput | Best For |
|---|---|---|---|---|---|
| Subscription | Live (ticks, aggregates) | Push | Low | Continuous | Real-time trading, live monitoring |
| Streaming | Historical | Push | Medium | High | Backtesting, bulk analysis |
| Fetch | All (reference, live, historical) | Pull | Low | Low-Medium | Snapshots, lookups, specific queries |
API Organization
Both API mechanisms are organized around the same conceptual structures:
Asset Classes
Data is organized by asset class:
stocks - Equity securities
crypto - Cryptocurrencies
options - Listed options (future)
futures - Futures contracts (future)
forex - Foreign exchange (future)
Datasets
Both mechanisms access data through datasets (see Datasets):
Each dataset contains Reference, Live Ticks, Live Aggregates, and Historical services
You specify which dataset to query when making API calls or declaring SDK dependencies
The key difference:
REST/WebSocket APIs: You specify the dataset as a parameter (?dataset=SIP)
Algo Container SDK: You declare dataset requirements in your Algo Descriptor
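To make the contrast concrete, here is the same dataset selection expressed both ways. Both snippets are illustrative sketches: the REST route and the descriptor field names are assumptions, not the documented schemas.

```
# REST/WebSocket: dataset chosen per request
GET /v1/stocks/history?dataset=SIP&symbol=AAPL

# Algo Descriptor (SDK): dataset declared up front as a requirement
# (field names are illustrative)
datasets:
  - name: SIP
    assetClass: stocks
    categories: [reference, live, history]
```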
Data Categories
Both mechanisms organize data into categories:
reference - Static metadata and security master
live - Current market data (ticks and aggregates)
history - Historical data for backtesting and analysis
Best Practices
For Own-Container Scenarios
Use WebSocket for live data - More efficient than polling REST API
Use REST for reference data - Simple lookups don't need WebSocket overhead
Cache reference data - Security master changes infrequently
Consider Java Client API for backtesting - Streaming mode is much faster than REST for bulk historical data
For Datafye Container Scenarios
Declare data requirements in Algo Descriptor - SDK manages subscriptions automatically
Use fetch sparingly in live trading - Subscription is more efficient
Leverage streaming for backtesting - SDK handles this automatically
Trust the framework - SDK optimizes data access for you
Next Steps
Now that you understand data access modes, learn more about:
Datasets - Understanding what datasets are and how they're structured
Data APIs and Datasets - How REST/WebSocket APIs relate to datasets
Data Descriptors - How to configure your data requirements
REST API Reference - Detailed REST API documentation
WebSocket API Reference - Detailed WebSocket API documentation
Algo Container SDK Reference - Detailed SDK documentation
Last updated: 2025-10-14