Scorecarding

Scorecarding is the systematic evaluation of algo performance using industry-standard metrics and risk-adjusted return measures. In institutional trading, scorecarding is table stakes — it's how professionals assess whether a strategy is worth deploying capital.

Datafye provides comprehensive scorecarding for all algos developed in the Full Stack Foundry, ensuring that strategies are rigorously evaluated before they see real money.

Why Scorecarding Matters

Objective Performance Assessment

Scorecarding provides objective, quantitative measures of algo performance:

  • No guesswork — Clear metrics tell you if a strategy works

  • Apples-to-apples comparisons — Compare different strategies fairly

  • Risk-adjusted returns — Understand returns relative to risk taken

  • Industry standards — Use the same metrics as professional funds

Without scorecarding, you're flying blind — you might see profits but not understand the risks you took to achieve them.

Risk Management

Scorecards reveal the risk characteristics of your strategy:

  • Maximum drawdown — Worst peak-to-trough decline

  • Volatility — How much returns fluctuate

  • Tail risk — Probability of extreme losses

  • Correlation — How the strategy behaves relative to markets

Understanding risk is crucial for position sizing, capital allocation, and avoiding catastrophic losses.

Strategy Comparison

Scorecarding enables informed decision-making:

  • Which algo to trade? — Deploy capital to the best strategies

  • How to allocate? — Size positions based on risk-adjusted returns

  • When to stop trading? — Detect when a strategy stops working

  • Portfolio construction — Combine strategies for optimal diversification

Marketplace Requirement

For developers planning to publish to the Datafye Marketplace:

  • Scorecards are required — Investors need objective performance data

  • Standard format — All marketplace algos use the same scoring framework

  • Transparency — Investors can compare algos fairly

  • Quality bar — Minimum score thresholds ensure marketplace quality

Key Performance Metrics

Datafye scorecards provide comprehensive trading performance analysis through 18 key metrics organized into five categories. These metrics are automatically calculated during backtesting, paper trading, and live trading, giving you consistent performance measurement across all stages of strategy development and deployment.

The current scorecard is optimized for day-trading style strategies where positions are cleanly entered and exited. A different scorecard for long-term strategies (such as portfolios that are periodically rebalanced) may be introduced in the future.

Datafye scorecards include the following categories of metrics:

Performance Metrics

These metrics provide a high-level view of your strategy's profitability and risk over the trading period:

Trading Days (tradingDays)

  • The number of days the market was open during the analyzed period

  • Counts full or partial trading days, excluding weekends and holidays

  • Used to normalize other frequency metrics

Cumulative P/L (cumulativePL)

  • Total profit or loss over the entire trading period

  • Calculated as: Current portfolio value - Starting portfolio value

  • Portfolio value = Cash + Market value of positions (using previous close if market is closed)

  • Important: Scorecards currently require no cash additions or removals during the scored period

Max Drawdown Amount (maxDrawdownAmount)

  • The largest observed loss from a portfolio value peak to a subsequent trough before a new peak is achieved

  • Calculated by tracking the highest portfolio value seen (peak), then measuring the largest decline before the next peak

  • Expressed in currency units (e.g., $2,156.88)

  • Critical metric for understanding worst-case losses

Max Drawdown Percent (maxDrawdownPercent)

  • The maximum drawdown as a percentage of the peak portfolio value

  • Calculated as: (maxDrawdownAmount / Portfolio value at peak) × 100

  • Easier to interpret than absolute drawdown for comparing strategies of different sizes

  • Example: 42.15% means the strategy lost 42.15% of its peak value at its worst point

Risk-Adjusted Return (riskAdjReturn)

  • How much profit was generated for every dollar of maximum risk endured

  • Calculated as: cumulativePL / maxDrawdownAmount

  • Higher values indicate better risk-adjusted performance

  • Example: 1.98 means you earned $1.98 for every $1 of maximum drawdown risk
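
To make these definitions concrete, here's a minimal Python sketch that derives the performance metrics above from a daily series of portfolio values. It's illustrative only — the function name and input format are assumptions, not Datafye's internal implementation:

```python
def performance_metrics(portfolio_values: list[float]) -> dict:
    """Derive performance metrics from end-of-day portfolio values (non-empty)."""
    cumulative_pl = portfolio_values[-1] - portfolio_values[0]

    peak = portfolio_values[0]
    max_dd_amount = 0.0
    max_dd_percent = 0.0
    for value in portfolio_values:
        if value > peak:
            peak = value                  # a new peak resets the reference point
        drawdown = peak - value           # decline from the most recent peak
        if drawdown > max_dd_amount:
            max_dd_amount = drawdown
            max_dd_percent = drawdown / peak * 100

    return {
        "cumulativePL": cumulative_pl,
        "maxDrawdownAmount": max_dd_amount,
        "maxDrawdownPercent": max_dd_percent,
        # riskAdjReturn is undefined when no drawdown was observed
        "riskAdjReturn": cumulative_pl / max_dd_amount if max_dd_amount > 0 else None,
    }
```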

Trade Frequency and Volume

These metrics describe how often your strategy trades:

Total Trades (totalTrades)

  • Count of trades made over the trading period

  • A "trade" is defined as a paired entry and exit execution

  • Important: If a position has multiple entries but a single exit, these count as multiple trades (each entry pairs with the exit)

  • Helps assess statistical significance and trading costs

Average Trades Per Week (avgTradesPerWeek)

  • Average number of trades per week

  • Calculated as: (totalTrades / tradingDays) × 5

  • Example: 250 trades over 120 trading days gives (250 / 120) × 5 ≈ 10.4 trades per week

  • Helps classify strategy type (day trading, swing trading, position trading)

  • Higher frequency = higher transaction costs and slippage impact

Win/Loss Statistics

These metrics break down your trading success rate:

Winning Trades (winningTrades)

  • Count of trades that generated a profit

  • Each trade's profit/loss is calculated from the paired entry and exit executions

  • Breakeven trades (exact $0 P/L) are not counted as winning trades

Losing Trades (losingTrades)

  • Count of trades that generated a loss

  • Breakeven trades are not counted as losing trades

  • Note: winningTrades + losingTrades may be less than totalTrades if there are breakeven trades

Win Rate (winRate)

  • Percentage of trades that were profitable

  • Calculated as: (winningTrades / totalTrades) × 100

  • Example: 72.1% means 72.1% of trades were winners

  • Important: A high win rate doesn't guarantee profitability; it must be evaluated alongside the profit/loss ratios

Profit and Loss Analysis

These metrics quantify the profitability of your trades:

Gross Profit (grossProfit)

  • The sum of profits across all winning trades

  • Only includes profitable trades (winners)

  • Example: $19,613.25 in total profits from all winners

Gross Loss (grossLoss)

  • The sum of losses across all losing trades

  • Only includes unprofitable trades (losers)

  • Typically shown as a negative number (e.g., -$8,840.00)

Profit Factor (profitFactor)

  • Ratio of gross profits to gross losses

  • Calculated as: grossProfit / |grossLoss|

  • Values > 1.0 indicate net profitability

  • Example: 2.12 means you made $2.12 in profits for every $1.00 in losses

  • Important: If there are no losing trades (grossLoss = 0), profitFactor is undefined

Profit Per Trade (profitPerTrade)

  • Average profit per trade across all trades

  • Calculated as: grossProfit / totalTrades

  • Averages over all trades — the denominator counts winners, losers, and breakevens, while the numerator is gross profit only

  • Example: $5.74 average profit per trade

Average Profit Trade (avgProfitTrade)

  • Average profit for winning trades only

  • Calculated as the mean profit across all winning trades

  • Example: $18.95 average profit when the trade wins

Average Losing Trade (avgLosingTrade)

  • Average loss for losing trades only

  • Calculated as the mean loss across all losing trades

  • Typically shown as a negative number (e.g., -$22.10)

  • Compared with avgProfitTrade to understand risk/reward per trade

Extreme Values

These metrics identify your best and worst trades:

Best Trade (bestTrade)

  • The single most profitable trade

  • Maximum per-trade profit across all winning trades

  • Example: $298.45 for the best-performing trade

  • Helps identify whether the strategy relies on occasional big wins

Worst Trade (worstTrade)

  • The single largest losing trade

  • Minimum (deepest loss) across all losing trades

  • Typically shown as a negative number (e.g., -$445.20)

  • Critical for understanding tail risk and maximum single-trade loss
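
The trade-level metrics in the last three categories can all be derived from a list of per-trade profit/loss figures (one entry per paired entry/exit). Here's a minimal sketch with assumed names and input format, not Datafye's actual code:

```python
def trade_metrics(trade_pls: list[float]) -> dict:
    """Derive win/loss, P/L, and extreme-value metrics from per-trade P/L."""
    wins = [p for p in trade_pls if p > 0]
    losses = [p for p in trade_pls if p < 0]   # breakeven trades fall in neither bucket
    total = len(trade_pls)
    gross_profit = sum(wins)
    gross_loss = sum(losses)                   # a negative number

    return {
        "totalTrades": total,
        "winningTrades": len(wins),
        "losingTrades": len(losses),
        "winRate": len(wins) / total * 100 if total else 0.0,
        "grossProfit": gross_profit,
        "grossLoss": gross_loss,
        # profitFactor is undefined when there are no losing trades
        "profitFactor": gross_profit / abs(gross_loss) if gross_loss else None,
        "profitPerTrade": gross_profit / total if total else 0.0,
        "avgProfitTrade": gross_profit / len(wins) if wins else 0.0,
        "avgLosingTrade": gross_loss / len(losses) if losses else 0.0,
        "bestTrade": max(trade_pls) if trade_pls else 0.0,
        "worstTrade": min(trade_pls) if trade_pls else 0.0,
    }
```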

Scorecard Generation

During Backtesting

In Full Stack Foundry, scorecards are automatically generated when you run backtests. The Backtesting Engine continuously updates metrics as your algo executes against historical data:

  1. Run backtest — Execute your algo against historical data using the Backtesting Engine

  2. Track events — Datafye monitors all trades (entries and exits) and portfolio value changes

  3. Update metrics — Scores are calculated incrementally as events occur

  4. Generate report — Complete scorecard is produced when backtest completes

When Metrics Are Calculated

The scorecard metrics are updated at specific events during backtesting, paper trading, and live trading:

When a trade is entered or exited:

  • Calculate profit/loss on the completed trade(s)

  • Update trade counts: totalTrades, winningTrades, losingTrades

  • Update profit/loss totals: grossProfit, grossLoss

  • Update derived metrics: winRate, profitFactor, avgProfitTrade, avgLosingTrade, profitPerTrade

  • Update extremes if applicable: bestTrade, worstTrade

  • Accumulate into cumulativePL

  • If the trade was a loss, update maxDrawdownAmount and maxDrawdownPercent

  • Update riskAdjReturn
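
A simplified sketch of this update step, assuming a running `state` dict that starts with zeroed totals (illustrative only, not Datafye's event pipeline):

```python
def on_trade_closed(state: dict, trade_pl: float) -> None:
    """Incrementally fold one completed trade's P/L into the running totals."""
    state["totalTrades"] += 1
    if trade_pl > 0:
        state["winningTrades"] += 1
        state["grossProfit"] += trade_pl
    elif trade_pl < 0:
        state["losingTrades"] += 1
        state["grossLoss"] += trade_pl
    state["cumulativePL"] += trade_pl
    state["bestTrade"] = max(state["bestTrade"], trade_pl)
    state["worstTrade"] = min(state["worstTrade"], trade_pl)
    # Derived metrics (winRate, profitFactor, the averages, riskAdjReturn) are
    # then recomputed from these running totals, and the drawdown is re-checked
    # against the portfolio-value peak when the trade was a loss.
```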

At the end of each trading day:

  • Increment tradingDays counter

  • Recalculate avgTradesPerWeek

Optional - During the trading day (for open positions):

  • Mark-to-market P/L on open positions can be tracked

  • These unrealized P/L values are temporary and recalculated as market prices change

  • When position closes, unrealized P/L is replaced by final realized P/L

Example Scorecard

Here's an illustrative example of a complete scorecard in JSON format. The field names match the metrics above; the values are mutually consistent sample numbers, not results from a real strategy:
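
```json
{
  "tradingDays": 120,
  "cumulativePL": 6000.00,
  "maxDrawdownAmount": 2400.00,
  "maxDrawdownPercent": 2.35,
  "riskAdjReturn": 2.5,
  "totalTrades": 250,
  "avgTradesPerWeek": 10.42,
  "winningTrades": 160,
  "losingTrades": 88,
  "winRate": 64.0,
  "grossProfit": 12000.00,
  "grossLoss": -6000.00,
  "profitFactor": 2.0,
  "profitPerTrade": 48.00,
  "avgProfitTrade": 75.00,
  "avgLosingTrade": -68.18,
  "bestTrade": 310.00,
  "worstTrade": -280.00
}
```

Note that winningTrades + losingTrades is 248, not 250: the remaining two trades in this sample were breakeven, which is why they count toward totalTrades but neither bucket.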

Scorecard Components

A complete Datafye scorecard includes:

Performance Summary

  • Key metrics at a glance

  • Overall score/rating [Does Datafye provide an overall score?]

  • Pass/fail criteria for marketplace submission

Equity Curve

  • Visual representation of account value over time

  • Drawdown visualization

  • [Chart format: interactive? static?]

Returns Distribution

  • Histogram of returns

  • Normal distribution overlay

  • Tail analysis

Rolling Metrics

  • Sharpe ratio over time windows

  • Rolling volatility

  • [Window sizes used?]

Monthly Returns Heatmap

  • Performance by month and year

  • Seasonal patterns

  • Consistency visualization

Trade Analysis

  • Win/loss distribution

  • Hold time distribution

  • Profit/loss by trade

Drawdown Analysis

  • All drawdowns ranked

  • Duration and recovery

  • Underwater equity chart

Interpreting Scorecards

Understanding what your scorecard metrics mean is crucial for evaluating strategy quality and making informed trading decisions.

Key Metric Relationships

The power of scorecarding comes from understanding how metrics interact:

Profitability Triangle:

  • cumulativePL shows total profit, but doesn't reveal risk taken

  • maxDrawdownPercent shows risk endured

  • riskAdjReturn combines both: higher values mean better risk-adjusted profits

Win Rate vs. Profit Factor:

  • High winRate (e.g., 70%+) doesn't guarantee profitability if losses are larger than wins

  • profitFactor must be > 1.0 for overall profitability

  • Best strategies combine decent win rate with strong profit factor

  • Example: 72% win rate + 2.12 profit factor = profitable strategy with good consistency

Average Win vs. Average Loss:

  • Compare avgProfitTrade to avgLosingTrade (absolute value)

  • Ideally avgProfitTrade > |avgLosingTrade| for positive expectancy

  • If avgProfitTrade < |avgLosingTrade|, you need higher win rate to be profitable
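
One way to make this trade-off precise is standard per-trade expectancy math (not a scorecard metric itself). A short sketch:

```python
def expectancy(win_rate: float, avg_win: float, avg_loss: float) -> float:
    """Expected P/L per trade. win_rate in [0, 1]; avg_loss may be negative."""
    return win_rate * avg_win - (1 - win_rate) * abs(avg_loss)

def breakeven_win_rate(avg_win: float, avg_loss: float) -> float:
    """Minimum win rate needed for zero expectancy."""
    return abs(avg_loss) / (avg_win + abs(avg_loss))

# Example: with a $75 average win and a -$68.18 average loss,
# breakeven_win_rate(75, -68.18) ≈ 0.476, so any win rate above
# roughly 47.6% yields positive expectancy.
```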

Good vs. Poor Metrics

profitFactor:

  • < 1.0: Losing strategy (gross losses exceed gross profits)

  • 1.0 - 1.5: Marginal (barely profitable after transaction costs)

  • 1.5 - 2.0: Good (solid profitability)

  • > 2.0: Excellent (strong edge)

  • Warning: If undefined (no losing trades), strategy may be undertested or overfitted

maxDrawdownPercent:

  • < 15%: Excellent risk control

  • 15% - 25%: Good, manageable risk

  • 25% - 40%: Acceptable if returns justify it

  • > 40%: High risk, requires strong justification

  • Important: Consider your personal risk tolerance and capital size

winRate:

  • Varies significantly by strategy type

  • Day trading: 55-70% typical

  • Mean-reversion: 60-75% typical

  • Trend-following: 35-50% typical (fewer wins, but larger average wins)

  • Key insight: Win rate alone doesn't determine profitability

riskAdjReturn:

  • < 1.0: Poor (earned less than maximum risk endured)

  • 1.0 - 2.0: Acceptable

  • 2.0 - 3.0: Good

  • > 3.0: Excellent risk-adjusted performance

Red Flags

Watch out for these warning signs that may indicate problems with your strategy:

Too-Good-To-Be-True Metrics:

  • winRate > 85% (suspiciously high, may indicate overfitting or look-ahead bias)

  • profitFactor > 5.0 (exceptional, verify strategy logic carefully)

  • maxDrawdownPercent < 5% with high returns (unusual, check for issues)

  • Undefined profitFactor (no losing trades suggests insufficient testing or overfitting)

High Risk Indicators:

  • maxDrawdownPercent > 50% (very high risk, most traders can't tolerate this)

  • Large difference between bestTrade and typical avgProfitTrade (may depend on rare outliers)

  • Large difference between worstTrade and typical avgLosingTrade (tail risk concern)

  • |avgLosingTrade| > avgProfitTrade with winRate < 60% (likely negative expectancy)

Insufficient Statistical Significance:

  • totalTrades < 30 (too few trades for reliable statistics)

  • tradingDays < 60 (insufficient time period, needs more data)

  • Very low avgTradesPerWeek (< 1.0) combined with short backtest (limited data points)

Execution Concerns:

  • Very high avgTradesPerWeek (> 100) suggests high-frequency strategy with significant slippage/commission impact

  • profitPerTrade too small relative to expected transaction costs (strategy may not be profitable after fees)

Marketplace Thresholds

For algos to be accepted in the Datafye Marketplace, they must meet minimum quality standards based on scorecard metrics:

Note: Final marketplace thresholds are subject to change. Contact Datafye for current requirements.

Preliminary minimum requirements:

  • profitFactor ≥ 1.5 (demonstrates clear edge)

  • maxDrawdownPercent ≤ 40% (manageable risk for investors)

  • tradingDays ≥ 120 (sufficient historical validation)

  • totalTrades ≥ 50 (adequate statistical significance)

  • cumulativePL > 0 (net profitable over period)

Preferred characteristics:

  • riskAdjReturn > 2.0 (strong risk-adjusted performance)

  • winRate between 40% and 85% (reasonable range, not suspiciously high or low)

  • Consistent performance across different market conditions

  • Reasonable avgTradesPerWeek (not excessively high-frequency)

  • Clean trade distribution (no extreme dependence on single outlier trades)
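
The preliminary minimums above can be checked mechanically before submission. This is a sketch against the thresholds as listed, not the marketplace's actual validation logic (which is subject to change):

```python
def meets_preliminary_minimums(sc: dict) -> list[str]:
    """Return a list of failed requirements; an empty list means all minimums pass."""
    failures = []
    if sc["profitFactor"] is None or sc["profitFactor"] < 1.5:
        failures.append("profitFactor must be defined and >= 1.5")
    if sc["maxDrawdownPercent"] > 40:
        failures.append("maxDrawdownPercent must be <= 40%")
    if sc["tradingDays"] < 120:
        failures.append("tradingDays must be >= 120")
    if sc["totalTrades"] < 50:
        failures.append("totalTrades must be >= 50")
    if sc["cumulativePL"] <= 0:
        failures.append("cumulativePL must be > 0")
    return failures
```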

Comparing Algos

Scorecards enable systematic comparison:

Within Your Portfolio

Rank by Sharpe Ratio

  • Identify best risk-adjusted performers

  • Allocate more capital to top Sharpe strategies

Diversification Analysis

  • [Does Datafye provide correlation matrices?]

  • Combine low-correlation strategies

  • Portfolio-level metrics

Risk Budgeting

  • Allocate risk based on volatility

  • Ensure no single strategy dominates risk

Marketplace Algos

Filter by metrics

  • [Can users filter marketplace by scorecard metrics?]

  • Set minimum thresholds

  • Sort by preferred metrics

Side-by-side comparison

  • [Compare multiple algos simultaneously?]

  • Visual comparison charts

  • Aggregated portfolio impact

Walk-Forward and Out-of-Sample Analysis

Robust scorecarding includes validation beyond in-sample backtesting:

Walk-Forward Analysis

How it works:

  1. Optimize on period 1 (in-sample)

  2. Test on period 2 (out-of-sample)

  3. Roll forward and repeat

  4. Aggregate out-of-sample results
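
A schematic sketch of this loop, assuming hypothetical optimize() and backtest() functions you supply yourself (whether Datafye's engine supports this natively is an open question below):

```python
def walk_forward(data, train_len, test_len, optimize, backtest):
    """Roll an optimize-then-test window across the data; return OOS results."""
    out_of_sample_results = []
    start = 0
    while start + train_len + test_len <= len(data):
        train = data[start : start + train_len]                        # in-sample
        test = data[start + train_len : start + train_len + test_len]  # out-of-sample
        params = optimize(train)          # fit parameters on in-sample data only
        out_of_sample_results.append(backtest(test, params))
        start += test_len                 # roll the window forward
    return out_of_sample_results          # aggregate these for the final scorecard
```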

Why it matters:

  • Reveals if optimization overfits

  • More realistic performance expectations

  • Detects regime changes

[Does Datafye's backtesting engine support walk-forward analysis?]

Out-of-Sample Validation

Best practices:

  • Hold out 20-30% of data for out-of-sample testing

  • Never optimize on out-of-sample data

  • Compare in-sample vs. out-of-sample metrics

  • Significant degradation suggests overfitting
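
A minimal sketch of the holdout split, assuming chronologically ordered data (illustrative only):

```python
def holdout_split(data, holdout_frac=0.25):
    """Reserve the most recent fraction of history for out-of-sample testing."""
    cut = int(len(data) * (1 - holdout_frac))
    return data[:cut], data[cut:]   # (in_sample, out_of_sample) — never optimize on the latter
```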

[How does Datafye facilitate out-of-sample validation?]

Scorecarding Best Practices

During Development

1. Start with long backtests

  • [Minimum recommended period?]

  • Include multiple market regimes

  • Bull markets, bear markets, sideways markets

2. Focus on risk-adjusted returns

  • Don't chase raw returns

  • High Sharpe is better than high return with high risk

  • Sustainable strategies have acceptable risk profiles

3. Monitor trade count

  • Too few = not statistically significant

  • Too many = potential overfitting

  • [Recommended range?]

4. Check consistency

  • Performance should be relatively stable over time

  • Large variations suggest luck or overfitting

Before Trading

1. Paper trade first

  • Verify scorecard with live (paper) data

  • Look for degradation from backtest

  • Small degradation is normal (slippage, real-world conditions)

  • Large degradation is a red flag

2. Start small

  • Don't bet the farm on backtested results

  • Scale up as forward performance validates backtest

  • Use the Kelly criterion or similar for position sizing

3. Set stop conditions

  • Define when to stop trading an algo

  • Example: Stop if Sharpe drops below X for Y days

  • Example: Stop if drawdown exceeds Z%
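
A minimal sketch of such a stop condition; the thresholds are placeholders to be tuned to your strategy:

```python
def should_stop(rolling_sharpe: float, current_drawdown_pct: float,
                sharpe_floor: float = 0.5, max_dd_pct: float = 25.0) -> bool:
    """Illustrative kill switch. A real monitor would also require the Sharpe
    breach to persist for Y consecutive days before triggering."""
    return rolling_sharpe < sharpe_floor or current_drawdown_pct > max_dd_pct
```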

Advanced Scorecarding

Monte Carlo Simulation

[Does Datafye support Monte Carlo analysis?]

Purpose:

  • Assess robustness of results

  • Understand range of possible outcomes

  • Estimate probability of drawdowns

How it works:

  • Resample trades with replacement

  • Generate thousands of alternate histories

  • Analyze distribution of outcomes
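
A minimal sketch of this resampling, using per-trade P/L figures (illustrative, independent of whatever Monte Carlo support Datafye may offer):

```python
import random

def monte_carlo_drawdowns(trade_pls, n_sims=10_000, start_equity=100_000.0):
    """Resample trade P/L with replacement and collect each run's max drawdown."""
    drawdowns = []
    for _ in range(n_sims):
        sample = random.choices(trade_pls, k=len(trade_pls))  # resample with replacement
        equity, peak, max_dd = start_equity, start_equity, 0.0
        for pl in sample:
            equity += pl
            peak = max(peak, equity)
            max_dd = max(max_dd, peak - equity)
        drawdowns.append(max_dd)
    return drawdowns  # e.g., inspect the 95th percentile for tail-risk planning
```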

Parameter Sensitivity Analysis

[How does Datafye visualize parameter sensitivity?]

Purpose:

  • Understand how sensitive strategy is to parameters

  • Identify robust parameter regions

  • Avoid parameter overfitting

Approach:

  • Vary parameters systematically

  • Score each combination

  • Visualize parameter space performance

  • Look for plateaus (robust regions) vs. peaks (overfitting)
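
A schematic grid sweep over the parameter space; backtest_score() is a hypothetical placeholder that runs a backtest with the given parameters and returns a scalar score:

```python
from itertools import product

def parameter_grid(param_ranges: dict, backtest_score) -> dict:
    """Score every parameter combination in a grid."""
    names = list(param_ranges)
    results = {}
    for combo in product(*(param_ranges[n] for n in names)):
        params = dict(zip(names, combo))
        results[combo] = backtest_score(params)   # score each combination
    return results  # visualize as a heatmap; prefer plateaus over isolated peaks

# Example: parameter_grid({"fast": range(5, 30, 5), "slow": range(20, 120, 20)}, score_fn)
```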

Multi-Market Validation

Purpose:

  • Verify strategy works across different markets

  • Ensure not curve-fit to specific securities

Approach:

  • Test on different symbol sets

  • Test on different timeframes

  • Test on different asset classes

  • Consistent performance = robust strategy

Scorecards in the Trading Environment

Scorecarding doesn't stop after backtesting:

Live Performance Tracking

[Does Trading Environment provide live scorecards?]

Continuous scoring:

  • Track same metrics in live trading

  • Compare to backtest expectations

  • Detect when algo stops working

Alerts:

  • Notify when metrics degrade

  • Example: Sharpe drops below threshold

  • Example: Drawdown exceeds limit

Forward Testing

Paper trading scores:

  • Generate scorecards for paper trading period

  • Must be close to backtest scores

  • Large discrepancies require investigation

Live trading scores:

  • Track live performance

  • Adjust position sizing based on realized performance

  • Stop trading if metrics degrade significantly

Next Steps

Now that you understand scorecarding, here's how to put it into practice:


Last updated: 2025-10-22
