Operator runbook

From backtest to potentially funded account: the eval deployment runbook

A backtest-passing bot is not yet a money-making bot. This runbook is the bridge: six stages, roughly 3-6 weeks (2 weeks of paper trading plus a 7-30 day eval), ~$50-150 in eval costs, and a real shot at a funded account at the end. Read this once. Bookmark it. Re-read at each stage transition.

Backtest (automated) → Validation (~30 min) → Paper (2 weeks) → Eval (7-30 days) → Funded (ongoing)

Step 1: Backtest validation

The fleet backtester runs all 31 strategies × 14 symbols × 4 timeframes against multiple years of CME historical OHLCV data. Each (strategy, symbol, variant) combo gets classified into one of four buckets per the dual-track gate logic.

The 4 promotion tracks

  • ELITE: passes both the EVAL and FUNDED gates. The holy grail.
  • EVAL: tuned for prop-firm evaluations. WR ≥ 55%, PF ≥ 1.30, ≥ 100 trades, max DD ≤ $2,000.
  • FUNDED: scaling track. WR ≥ 30%, PF ≥ 1.10, ≥ 200 trades, max DD ≤ $4,500.
  • UNCLASSIFIED: failed both gates. Strategy Improver candidates.
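
The dual-track gate logic above can be sketched roughly as follows. This is a minimal illustration, not the fleet backtester's actual code: the thresholds are copied from the list, while the `stats` field names and the function names are assumptions made for the example.

```javascript
// Gate thresholds copied from the track list above.
const GATES = {
  EVAL: { minWinRate: 0.55, minProfitFactor: 1.3, minTrades: 100, maxDrawdown: 2000 },
  FUNDED: { minWinRate: 0.3, minProfitFactor: 1.1, minTrades: 200, maxDrawdown: 4500 },
};

// Hypothetical stats shape: { winRate, profitFactor, trades, maxDrawdown }.
// winRate is a fraction (0.55 = 55%), maxDrawdown is in dollars.
function passesGate(stats, gate) {
  return (
    stats.winRate >= gate.minWinRate &&
    stats.profitFactor >= gate.minProfitFactor &&
    stats.trades >= gate.minTrades &&
    stats.maxDrawdown <= gate.maxDrawdown
  );
}

// ELITE requires both gates; UNCLASSIFIED means neither gate passed.
function classifyTrack(stats) {
  const evalOk = passesGate(stats, GATES.EVAL);
  const fundedOk = passesGate(stats, GATES.FUNDED);
  if (evalOk && fundedOk) return "ELITE";
  if (evalOk) return "EVAL";
  if (fundedOk) return "FUNDED";
  return "UNCLASSIFIED";
}

console.log(classifyTrack({ winRate: 0.58, profitFactor: 1.45, trades: 240, maxDrawdown: 1800 })); // ELITE
console.log(classifyTrack({ winRate: 0.58, profitFactor: 1.45, trades: 150, maxDrawdown: 1800 })); // EVAL
console.log(classifyTrack({ winRate: 0.4, profitFactor: 1.2, trades: 300, maxDrawdown: 3000 })); // FUNDED
```

Note the second case: the bot clears every EVAL threshold but has only 150 trades, so it misses the FUNDED gate's 200-trade minimum and lands in EVAL rather than ELITE.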

Run a backtest

node scripts/run-fleet-backtest.js --tf 5 --micro

The --micro flag uses micro contract pointValues (MNQ $2/pt vs NQ $20/pt) so DDs scale to micro-account-friendly sizes. Output lands in Obsidian Vault\04-Agent-Reports\fleet-backtests\<date>\5m-micro\ as one JSON per bot + a SUMMARY.md.
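
To see why the point value matters for drawdown, here is a small arithmetic sketch. The two point values are the ones quoted above; the helper name and the 150-point move are made up for illustration.

```javascript
// Point values quoted in the runbook: MNQ $2/pt, NQ $20/pt.
const POINT_VALUE = { MNQ: 2, NQ: 20 };

// Dollar impact of a price move for a given symbol and contract count.
function dollarsForMove(symbol, points, contracts = 1) {
  const pointValue = POINT_VALUE[symbol];
  if (pointValue === undefined) throw new Error(`No point value for ${symbol}`);
  return points * pointValue * contracts;
}

// The same 150-point adverse move on one contract:
console.log(dollarsForMove("NQ", 150)); // 3000
console.log(dollarsForMove("MNQ", 150)); // 300
```

The same strategy and the same price path produce a drawdown ten times smaller on the micro contract, which is what keeps the backtest DD numbers comparable to a small eval account.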

What success looks like at this stage: SUMMARY.md classifies 5+ ELITE bots across 3+ different symbols. If only 1 symbol shows ELITE bots (e.g., only M2K), you have a concentration problem to fix before risking eval money.
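
If you want to check the "5+ ELITE bots across 3+ symbols" rule mechanically, a sketch like the one below would do. The row shape `{ bot, symbol, track }` is an assumption about what you extract from the per-bot JSON files, not their documented schema, and the sample rows are invented.

```javascript
// Hypothetical rows, one per bot: { bot, symbol, track }.
function checkEliteSpread(rows, minBots = 5, minSymbols = 3) {
  const elite = rows.filter((row) => row.track === "ELITE");
  const symbols = new Set(elite.map((row) => row.symbol));
  return {
    eliteCount: elite.length,
    symbols: [...symbols].sort(),
    ok: elite.length >= minBots && symbols.size >= minSymbols,
  };
}

// Invented sample data for illustration only.
const sampleRows = [
  { bot: "bot-a", symbol: "MNQ", track: "ELITE" },
  { bot: "bot-b", symbol: "MNQ", track: "ELITE" },
  { bot: "bot-c", symbol: "MES", track: "ELITE" },
  { bot: "bot-d", symbol: "MES", track: "ELITE" },
  { bot: "bot-e", symbol: "M2K", track: "ELITE" },
  { bot: "bot-f", symbol: "M2K", track: "EVAL" },
];

console.log(checkEliteSpread(sampleRows));
// { eliteCount: 5, symbols: [ 'M2K', 'MES', 'MNQ' ], ok: true }
```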

Step 2: Read the VALIDATED-SURVIVORS report

The moment SUMMARY.md drops, the post-v6 validation pipeline auto-runs. It chains three independent gates and produces the actionable survivor list:

  1. ELITE classification — done by step 1
  2. Topstep trailing-DD simulator — replays each bot's trade sequence under Topstep's actual trailing-DD rules ($2,500 trailing on 50K). Catches bots that pass peak-to-trough DD but blow trailing.
  3. Walk-forward TRAIN/TEST — TRAIN window (2021-2024) vs TEST window (2024-2026). Pass = TEST PF ≥ 80% of TRAIN PF. Catches overfitting.
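
The second gate is easier to reason about with a concrete replay. The sketch below is a simplification under stated assumptions, not the pipeline's simulator and not a statement of Topstep's exact rules: it assumes the floor trails the highest closed-trade balance by a fixed amount and ignores intraday unrealized P&L and any end-of-day conventions. The function name and the trade list are invented for the example.

```javascript
// tradePnls: closed-trade P&L in dollars, in chronological order.
// Assumption: the floor trails the highest closed-trade equity by a fixed amount.
function simulateTrailingDrawdown(tradePnls, trailingAmount = 2500) {
  let equity = 0; // P&L relative to the starting balance
  let peak = 0; // highest equity seen so far
  for (let i = 0; i < tradePnls.length; i++) {
    equity += tradePnls[i];
    peak = Math.max(peak, equity);
    const floor = peak - trailingAmount;
    if (equity <= floor) {
      return { survived: false, failedAtTrade: i, peak, equity };
    }
  }
  return { survived: true, failedAtTrade: null, peak, equity };
}

// Up $1,500 after two trades, then three losers in a row.
console.log(simulateTrailingDrawdown([800, 700, -1200, -900, -500]));
// { survived: false, failedAtTrade: 4, peak: 1500, equity: -1100 }
```

In this run the account is only $1,100 below its starting balance when it fails, because the $1,500 peak had already pulled the floor up to $1,000 below the start.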

Output lands at:

Obsidian Vault\04-Agent-Reports\fleet-backtests\<date>\5m-micro\
├── VALIDATED-SURVIVORS.md    ← human-readable: read this
└── VALIDATED-SURVIVORS.json  ← API consumes this for the in-app checklist

The SURVIVORS section at the top is the actionable list. These bots passed all 3 backtest-side gates. They're safe to paper-trade — but not yet safe to risk real eval money on. Paper trading is the next gate.

If 0 survivors: Don't proceed. The strategies are overfit to the specific 5y window. Tighten time-stops, ATR multipliers, or session filters and re-run the backtest. Usually a single retune cycle produces 1-3 survivors.

Step 3: Two weeks of paper trading

Paper trading is where backtest assumptions meet real Tradovate market data, slippage, and execution latency. This is the gate that catches overfit and unrealistic-fill cases that look great in a backtest but die in production.

Day 1 setup

  1. Open Lucid Terminal → Brokers tab
  2. Click + Add Tradovate Firm → connect your Tradovate paper account via OAuth
  3. Verify the paper balance shows up on the Dashboard equity card
  4. Open Bots → Company Desk
  5. The Eval Pre-Flight Checklist should now show your survivors with green ELITE/DD/WF pills and a yellow PAPER pill (0/14 days)
  6. Click To Paper on each survivor → opens the Deploy modal pre-configured
  7. Pick the Tradovate paper account → set contracts to 1 → leave session as "All sessions" for the first run → click Deploy

What to watch during the 2 weeks

  • Active Deployments view — live trade count, WR, P&L, DD per bot
  • Audit log — click Audit on any card to see every fill with timestamp + IP
  • Daily comparison — at end of each session, compare paper realized stats to backtest stats

Drop criteria

Drop any bot showing any of these symptoms:

  • WR has degraded by >30% from backtest (e.g., backtest 60% → paper 40%)
  • PF has degraded by >25% from backtest (e.g., backtest 1.40 → paper below 1.05)
  • Any single day where DD exceeded $500 (1ct micros)
  • Bot fired at unexpected times despite session filter (logic bug)
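
The drop criteria above can be expressed as a daily check. This is a sketch under assumptions: the degradation percentages are treated as relative drops from the backtest value, and the field names (`winRate`, `profitFactor`, `worstDailyDrawdown`, `offSessionFills`) are hypothetical, not fields the app is known to expose.

```javascript
// backtest: { winRate, profitFactor }
// paper:    { winRate, profitFactor, worstDailyDrawdown, offSessionFills }
// Returns the list of triggered drop reasons; an empty list means keep the bot.
function dropReasons(backtest, paper) {
  const reasons = [];
  const wrDrop = (backtest.winRate - paper.winRate) / backtest.winRate;
  const pfDrop = (backtest.profitFactor - paper.profitFactor) / backtest.profitFactor;
  if (wrDrop > 0.3) reasons.push("WR degraded by more than 30%");
  if (pfDrop > 0.25) reasons.push("PF degraded by more than 25%");
  if (paper.worstDailyDrawdown > 500) reasons.push("single-day DD above $500");
  if (paper.offSessionFills > 0) reasons.push("fills outside the session filter");
  return reasons;
}

const reasons = dropReasons(
  { winRate: 0.6, profitFactor: 1.4 },
  { winRate: 0.4, profitFactor: 1.0, worstDailyDrawdown: 320, offSessionFills: 0 }
);
console.log(reasons);
// [ 'WR degraded by more than 30%', 'PF degraded by more than 25%' ]
```

Any non-empty result is a drop. The list is there so you can log why the bot was dropped.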
Common pitfall: the paper environment doesn't fully simulate margin checks or contract limits. A bot that looks fine in paper might still get rejected fills on a real eval account. The 2-week window is enough to see slippage drift but won't catch capital-rule violations.

Step 4: Pre-flight checklist (before spending eval money)

You're about to spend $50-150 on a Topstep 50K combine. Run this checklist before clicking buy. Print it out if that helps. Every box must be checked.

If any box is unchecked: stop. Do that work first. Each unchecked box is a ~10-20% increase in your blow-the-eval probability.

Step 5: The paid eval (Topstep 50K combine)

Buy the combine

  1. Go to topstep.com → Trading Combine
  2. Pick the 50K combine. Look for promo codes (~30% off is common) before checkout.
  3. Once paid, you get fresh OAuth credentials for that eval account
  4. Connect the new account via Brokers → + Add Tradovate Firm (Topstep is a Tradovate-tenant firm, uses same OAuth flow)

Deploy your portfolio to the eval account

  1. Open Bots → Company Desk
  2. For each survivor that passed the pre-flight: click To Paper → Deploy, but select the EVAL ACCOUNT (not the paper one)
  3. Verify the deployment lands as PAPER mode (default)
  4. Now click LIVE on each one and type the literal word LIVE to confirm. This flips the bot to live trading on the eval account

Daily routine during the eval

  • Morning (before NY open): check Active Deployments — every bot active? Risk caps still set?
  • Midday: glance at the Dashboard — current P&L vs $3,000 target, current trailing DD floor
  • End of session: check the Audit Log for any anomalies (rejected orders, risk-cap fires)
  • If you're up > $1,500: consider closing ALL positions for the day. With a $2,500 trailing drawdown, the floor has already risen to about $1,000 below your starting balance. Going home flat preserves the locked-in gains.

The most expensive moment: when you're up $2,500+ and the next trade pushes you to $3,000 (passing). Close everything immediately. Don't wait for the bot to find another setup. The trailing floor is at its highest when your peak is highest: from a $2,500 peak the floor sits at your starting balance, so giving those gains back blows the eval.

If you blow the eval

  1. Don't panic-buy a second eval that same day
  2. Pull up the Audit Log + the trade list — find the trade that did the most damage
  3. Was it a single bad fill? A specific bot? A news event?
  4. Drop the offending bot, retune the survivors, retry next week with attempt #2

Step 6: Promotion to LIVE on a funded account

Eval passed → Topstep gives you a funded account. Time to scale safely.

The first week of funded

  • Same exact config as the eval — same bots, same contracts, same caps. Don't change anything that worked.
  • Connect the funded account via the same OAuth flow
  • Replicate the eval-account deployments to the funded account
  • Watch closely for at least the first 5 trading days. Contract size stays fixed for much longer (see Scaling rules)

Scaling rules

  • Don't scale contract size for 30 trading days. Stay at 1ct on micros. The drawdown rules don't care if you're up $5K — one bad day at 5ct can wipe a month's gains.
  • Scale contract count before scaling per-bot risk caps. 2ct of a proven bot beats 1ct with looser stops.
  • Pull profits weekly. Topstep allows withdrawals after $1K minimum profit. Pull early, pull often — the trailing DD never goes away.

Multi-account scaling

  • Once one funded account is producing $200+/day consistently for 30 days, buy a second 50K eval
  • Same portfolio, different account → 2x your daily without 2x your risk per bot
  • Topstep allows 5 funded accounts max per individual
  • From there, additional capital comes from sister firms (Apex, MFF, Tradeify, Bulenox) — all run on Tradovate, all work with Lucid Terminal's existing OAuth flow

Long-term goal: 3-5 funded accounts, each producing $200-500/day, for a total of roughly $600-2,500/day. Sustainable, scalable, with the audit log + risk caps + kill switch all wired in. This is the machine you're building.

Troubleshooting + when to abort

"My survivors are all the same symbol"

Concentration risk. Don't deploy 5 M2K bots to 5 accounts — when M2K has a bad day all 5 accounts blow at once. Either retune to broaden the symbol set, or accept you're running 1 effective bot and only buy 1 eval.

"Walk-forward shows TEST PF < 50% of TRAIN PF"

Severe overfitting. The strategy was tuned to in-sample noise. Don't paper-trade this bot — retune with looser parameters and re-run the full backtest pipeline.
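
The two walk-forward thresholds used in this runbook (pass at TEST PF ≥ 80% of TRAIN PF, severe overfitting below 50%) can be combined into one verdict. This is an illustrative sketch. The function name and verdict labels are made up, and the middle band is simply treated as a failed gate.

```javascript
// trainPf / testPf: profit factors from the TRAIN and TEST windows.
function walkForwardVerdict(trainPf, testPf) {
  if (!(trainPf > 0)) throw new Error("TRAIN PF must be positive");
  const ratio = testPf / trainPf;
  if (ratio >= 0.8) return { ratio, verdict: "PASS" };
  if (ratio < 0.5) return { ratio, verdict: "SEVERE_OVERFIT" };
  return { ratio, verdict: "FAIL" };
}

console.log(walkForwardVerdict(1.5, 1.35).verdict); // PASS
console.log(walkForwardVerdict(1.5, 1.0).verdict); // FAIL
console.log(walkForwardVerdict(1.6, 0.72).verdict); // SEVERE_OVERFIT
```

A FAIL in the middle band keeps the bot out of the survivor list but may be worth one retune. SEVERE_OVERFIT is the case this troubleshooting entry describes.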

"Paper trading shows positive WR but negative P&L"

Slippage is eating your edge. Tighten the TP target by 1-2 ticks, or widen the SL by 1 tick, and re-test. If both adjustments still show negative P&L, the strategy doesn't have enough edge to survive realistic execution costs.

"Eval blew on day 1"

Almost always due to an oversized initial position or a news event. The audit log shows the death trade. Check: was it a session-filter bypass (bot fired during news window)? Was it a contract-size mismatch (deployed at 5ct instead of 1ct)? Fix the specific cause, not the strategy.

"Eval blew at +$2,000"

Trailing DD bit you. You hit a peak then gave it back. Standing rule: when up $1,500+ on a 50K combine, close all positions for the day. The trailing floor is at its highest when your peak is. Going flat preserves the lead.

When to abort the whole project

  • 3 consecutive evals blown the same way → strategy fleet is fundamentally broken, not edge cases
  • Walk-forward TEST/TRAIN ratio is < 0.50 across all surviving bots → systemic overfitting
  • Paper trading shows negative aggregate P&L over 30+ days → no real edge, just noise

Aborting means: shut down all live deployments, pause development for 1-2 weeks, reassess. Maybe the market regime has shifted. Maybe the strategies need a complete rewrite. Either way, throwing more money at a broken pipeline doesn't fix the pipeline.

Using NinjaTrader alongside Lucid Terminal

Many of our funded traders use NinjaTrader as their order-entry surface for the eval & funded stages above. Lucid Terminal continues to handle indicator tracking, journal-to-vault, bot automation, and combined-DD monitoring; NinjaTrader handles the discretionary order placement with its DOM, advanced order types, and chart trader. The two coexist cleanly because both read the same broker account state through each firm’s official API.

If you don’t already use NinjaTrader, you can pair it with a free Kinetick data account for futures-only feeds during your eval stage. See the Platforms page for the partner links.

Ready to start?

Every step in this runbook is wired into the Lucid Terminal app. Open the Bots tab, find the Eval Pre-Flight Checklist, and let the gate badges guide you stage by stage.

Risk Disclosures

Risk Disclosure: Futures and forex trading contains substantial risk and is not for every investor. An investor could potentially lose all or more than the initial investment. Risk capital is money that can be lost without jeopardizing one’s financial security or lifestyle. Only risk capital should be used for trading and only those with sufficient risk capital should consider trading. Past performance is not necessarily indicative of future results.

Hypothetical Performance Disclosure: Hypothetical performance results have many inherent limitations, some of which are described below. No representation is being made that any account will or is likely to achieve profits or losses similar to those shown; in fact, there are frequently sharp differences between hypothetical performance results and the actual results subsequently achieved by any particular trading program. One of the limitations of hypothetical performance results is that they are generally prepared with the benefit of hindsight. In addition, hypothetical trading does not involve financial risk, and no hypothetical trading record can completely account for the impact of financial risk of actual trading. For example, the ability to withstand losses or to adhere to a particular trading program despite trading losses are material points that can also adversely affect actual trading results. There are numerous other factors related to the markets in general or to the implementation of any specific trading program that cannot be fully accounted for in the preparation of hypothetical performance results and all which can adversely affect trading results.

Testimonials Disclosure: Testimonials appearing on this website may not be representative of other clients or customers and are not a guarantee of future performance or success. Live Trade Room Example: This presentation is for educational purposes only and the opinions expressed are those of the presenter only. All trades presented should be considered hypothetical and should not be expected to be replicated in a live trading account.

Virtual Currency: View CFTC advisories as they contain more information on the risks associated with trading virtual currencies.