A quantitative finance learning laboratory for algorithmic trading, technical indicators, and backtesting strategies.
This repository contains implementations and experiments in quantitative finance, starting with a C++ high-performance backtesting engine and technical indicator library.
Learning Goals:
- Master financial mathematics (stochastic processes, volatility modeling)
- Implement technical indicators with optimal algorithms (O(1) updates)
- Build realistic backtesting with transaction costs and slippage
- Integrate with real market data APIs (Alpaca Markets)
- Apply modern C++ techniques for performance-critical financial systems
quantlab-cpp/ - C++ Quantitative Finance Library
A production-ready C++ library implementing:
- Stochastic Data Generation: GBM, Mean Reversion, Regime Changes
- Technical Indicators: EMA, RSI, Bollinger Bands (O(1) rolling implementations)
- Strategy Framework: Composable signals and backtesting engine
- Market Data: Alpaca API integration for historical and live data
- Prerequisites: Modern C++ compiler (C++20), CMake 3.20+
- Clone and build:
git clone <this-repo>
cd Backtester-lab/quantlab-cpp
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j4
- Run tests:
ctest
- Start learning: Open the detailed guide
- Mathematical Foundations - Understand GBM, mean reversion, Itô's lemma
- C++ Performance Techniques - ALU vs FPU, memory optimization, O(1) algorithms
- Technical Indicators - EMA, RSI, Bollinger Bands implementation
- Strategy Development - Combine indicators into trading signals
- Backtesting & Validation - Realistic simulation with costs
This section expands the high-level summary with detailed math, indicator derivations, and the C++ techniques used across quantlab-cpp/. It's intended as a technical reference for developers and quants who want to understand the precise algorithms and implementation trade-offs.
- quantlab-cpp/src/core/data_types.hpp: fundamental market data types (Bar, Tick, Trade, Quote) and a templated TimeSeries<T> container for rolling operations.
- quantlab-cpp/src/indicators/rolling_ema.hpp: Exponential Moving Average (EMA) implementation (O(1) update).
- quantlab-cpp/src/indicators/rsi.hpp: RSI implemented with two EMAs applied to gains and losses.
- quantlab-cpp/src/indicators/bollinger_bands.hpp: SMA + standard deviation bands, sliding window via std::deque.
- quantlab-cpp/src/data/alpaca_client.*: Alpaca HTTP client (rate-limited aggregation, backoff, JSON parsing).
- quantlab-cpp/src/backtest/backtest_engine.*: Portfolio, Trade structs, metrics calculation (P&L matching, drawdown, Sharpe placeholder).
- quantlab-cpp/src/strategy/mean_reversion_strategy.hpp: Strategy combining EMA, RSI, Bollinger Bands and an institutional-weighted confidence function.
- quantlab-cpp/apps/*: Example apps institutional_backtest and strategy_optimizer that demonstrate end-to-end usage.
Exponential Moving Average (EMA)
- Definition: EMA_t = alpha * Price_t + (1 - alpha) * EMA_{t-1}
- Smoothing factor: alpha = 2 / (N + 1) for period N.
- Initialization: the code uses the first price observed as EMA_0 (common, simple and numerically stable).
- Key behavior: EMA places exponentially more weight on recent prices vs SMA (reacts faster to changes).
- Numerical note: keep alpha and current EMA as doubles; no history required -> O(1) per tick.
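The recurrence above can be sketched as a small class. This is a minimal illustration, not the contents of rolling_ema.hpp: the class name `EmaSketch` and its method names are assumptions chosen for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch of an O(1) EMA; names are illustrative, not the repo's API.
class EmaSketch {
public:
    explicit EmaSketch(std::size_t period)
        : alpha_(2.0 / (static_cast<double>(period) + 1.0)) {
        assert(period > 0);
    }

    // Folds one price into the running average and returns the new EMA.
    double update(double price) {
        if (!seeded_) {
            value_ = price;  // the first observation seeds EMA_0
            seeded_ = true;
        } else {
            value_ = alpha_ * price + (1.0 - alpha_) * value_;
        }
        return value_;
    }

    double value() const { return value_; }
    bool is_initialized() const { return seeded_; }

private:
    double alpha_;         // smoothing factor, fixed at construction
    double value_ = 0.0;   // current EMA; the only state carried between ticks
    bool seeded_ = false;
};
```

With period 3 (alpha = 0.5), feeding 10, 20, 20 yields 10, 15, 17.5: each update touches two doubles and needs no price history.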
Relative Strength Index (RSI)
- True RSI formula (Wilder): RS = avg_gain / avg_loss; RSI = 100 - (100 / (1 + RS)).
- Implementation in repo: two RollingEMA instances smooth gains and losses separately. For each new price:
- change = price_t - price_{t-1}
- gain = max(change, 0); loss = max(-change, 0)
- avg_gain = EMA_gain.update(gain); avg_loss = EMA_loss.update(loss)
- RS = avg_gain / avg_loss (guard for avg_loss == 0)
- RSI computed with guards: if avg_loss == 0 and avg_gain == 0 -> RSI = 50 (neutral). If avg_loss == 0 -> RSI = 100.
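- Practical choices: period = 14 is the default. Smoothing gains and losses with EMAs keeps updates O(1) and behaves similarly to Wilder's smoothing, though not identically: Wilder's method corresponds to alpha = 1/N, while an EMA with alpha = 2/(N + 1) reacts faster.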
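The per-price steps and guards above could look roughly like the following. It is a sketch under assumptions: the names `Ema` and `RsiSketch` are hypothetical, and returning 50 for the very first price (before any change exists) is a choice made for this example rather than something confirmed by the source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Minimal EMA helper so the sketch is self-contained (first value seeds it).
struct Ema {
    explicit Ema(double a) : alpha(a) {}
    double update(double x) {
        value = seeded ? alpha * x + (1.0 - alpha) * value : x;
        seeded = true;
        return value;
    }
    double alpha;
    double value = 0.0;
    bool seeded = false;
};

// Hypothetical RSI sketch: two EMAs smooth gains and losses separately.
class RsiSketch {
public:
    explicit RsiSketch(std::size_t period = 14)
        : gain_(2.0 / (static_cast<double>(period) + 1.0)),
          loss_(2.0 / (static_cast<double>(period) + 1.0)) {
        assert(period > 0);
    }

    // Returns the RSI after this price. Assumption: neutral 50 on the first price.
    double update(double price) {
        if (!has_prev_) {
            prev_ = price;
            has_prev_ = true;
            return 50.0;
        }
        const double change = price - prev_;
        prev_ = price;
        const double avg_gain = gain_.update(std::max(change, 0.0));
        const double avg_loss = loss_.update(std::max(-change, 0.0));
        if (avg_loss == 0.0) {
            return avg_gain == 0.0 ? 50.0 : 100.0;  // flat -> neutral, only gains -> 100
        }
        const double rs = avg_gain / avg_loss;
        return 100.0 - 100.0 / (1.0 + rs);
    }

private:
    Ema gain_;
    Ema loss_;
    double prev_ = 0.0;
    bool has_prev_ = false;
};
```

The division only happens once avg_loss is known to be non-zero, so the output always stays within [0, 100].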
Bollinger Bands
- Definitions:
- middle = SMA_N = (1/N) * sum_{i=0..N-1} price_{t-i}
- std_dev = sqrt((1/N) * sum_{i}(price_i - middle)^2)
- upper = middle + k * std_dev, lower = middle - k * std_dev (k typically 2)
- Current implementation: maintains the window via std::deque<double> price_history_, computes the SMA via std::accumulate, and computes the variance by looping over the window, so each update is O(N).
- Optimization note: replace the windowed variance with Welford's (online) algorithm or a rolling variance (O(1) per update) to scale to very high-frequency usage.
Strategy confidence (institutional-weighted)
- MeanReversionStrategy::calculate_confidence implements a weighted score across four factors:
- RSI momentum strength (weight ~35%): normalized distance from extreme thresholds.
- Bollinger band extremeness (30%): normalized distance outside the band relative to band width.
- Trend context (20%): absolute deviation of price from EMA, scaled to [0,1].
- Volatility regime (15%): BB width as a fraction of the middle band, scaled.
- The weighted average is then scaled to [0.5, 0.95] to produce the final confidence score.
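To make the weighting concrete, here is one way such a score might be assembled. The struct `ConfidenceFactors`, the function `confidence_sketch`, and the linear mapping onto [0.5, 0.95] are assumptions for illustration; the source only states the weights and the output range, and how each factor is normalized is left to the real implementation.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical inputs: each factor is assumed to be pre-normalized to [0, 1].
struct ConfidenceFactors {
    double rsi_strength;       // weight 0.35
    double band_extremeness;   // weight 0.30
    double trend_context;      // weight 0.20
    double volatility_regime;  // weight 0.15
};

// Weighted blend of the four factors, mapped linearly onto [0.5, 0.95].
inline double confidence_sketch(const ConfidenceFactors& f) {
    // Clamp defensively so a stray input cannot push the score out of range.
    auto unit = [](double v) { return std::min(1.0, std::max(0.0, v)); };
    const double score = 0.35 * unit(f.rsi_strength)
                       + 0.30 * unit(f.band_extremeness)
                       + 0.20 * unit(f.trend_context)
                       + 0.15 * unit(f.volatility_regime);
    const double confidence = 0.5 + 0.45 * score;
    assert(confidence >= 0.5 && confidence <= 0.95 + 1e-9);
    return confidence;
}
```

Under this mapping a signal with all factors at zero still reports 0.5, so a threshold such as 0.65 corresponds to a weighted factor score of about one third.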
Backtest accounting & metrics
- Trades: recorded as timestamp, action, price, shares, value, confidence, reason.
- P&L matching: sells are paired with accumulated buys using average cost per share. Partial sells proportionally reduce cost basis.
- Profit factor: total_wins / total_losses with guards (all-wins -> large sentinel; no wins/losses -> 0).
- Max drawdown: if daily_values are present, the engine computes peak-to-trough drawdown across the daily series. If not, a conservative fallback approximation is used.
- Sharpe ratio: the current code contains a simplified approximation. For production, compute Sharpe from periodic (daily) returns: subtract the risk-free rate, divide by the standard deviation of returns, and annualize properly.
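The average-cost matching and the profit-factor guards can be illustrated with a small tracker. This is a sketch, not the repo's Portfolio class: `AvgCostPosition` is a hypothetical name, and the sentinel value 999.0 for the all-wins case is an assumption, since the source only says a "large sentinel" is used.

```cpp
#include <cassert>

// Hypothetical average-cost position tracker; not the repo's Portfolio class.
struct AvgCostPosition {
    double shares = 0.0;
    double cost_basis = 0.0;    // total cash paid for the shares still held
    double total_wins = 0.0;    // sum of positive realized P&L
    double total_losses = 0.0;  // sum of absolute negative realized P&L

    void buy(double qty, double price) {
        assert(qty > 0.0);
        shares += qty;
        cost_basis += qty * price;
    }

    // Sells qty shares and returns the realized P&L against average cost.
    double sell(double qty, double price) {
        assert(qty > 0.0 && qty <= shares);
        const double avg_cost = cost_basis / shares;
        const double pnl = (price - avg_cost) * qty;
        cost_basis -= avg_cost * qty;  // a partial sell shrinks the basis proportionally
        shares -= qty;
        if (pnl > 0.0) {
            total_wins += pnl;
        } else {
            total_losses -= pnl;
        }
        return pnl;
    }

    // Profit factor with guards; the sentinel 999.0 is an assumed placeholder.
    double profit_factor() const {
        if (total_losses == 0.0) return total_wins > 0.0 ? 999.0 : 0.0;
        return total_wins / total_losses;
    }
};
```

For example, buying 10 shares at 100 and 10 at 110 gives an average cost of 105; selling 5 at 115 realizes +50 and leaves 15 shares with a basis of 1575.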
Time representation
- Timestamps are stored as int64_t timestamp_ns (nanoseconds since epoch). Rationale:
- Avoids floating-point precision loss: current epoch times in nanoseconds need 19 decimal digits, while IEEE-754 doubles carry only ~15-17 decimal digits of precision.
- Integer time arithmetic is deterministic and faster in many cases.
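A quick way to see the precision argument is to round-trip a nanosecond timestamp through double. The helper names below are hypothetical and exist only for this demonstration.

```cpp
#include <cassert>
#include <cstdint>

// True if the nanosecond timestamp survives a round trip through double.
// Intended for realistic epoch values, well inside the int64_t range.
inline bool survives_double_roundtrip(std::int64_t timestamp_ns) {
    const double as_double = static_cast<double>(timestamp_ns);
    return static_cast<std::int64_t>(as_double) == timestamp_ns;
}

// Integer time arithmetic: exact elapsed nanoseconds between two timestamps.
inline std::int64_t elapsed_ns(std::int64_t start_ns, std::int64_t end_ns) {
    assert(end_ns >= start_ns);
    return end_ns - start_ns;
}
```

Near 1.7e18 ns, adjacent doubles are 256 ns apart, so a value such as 1700000000123456789 cannot be stored exactly in a double, whereas the int64_t difference between two such timestamps is exact to the nanosecond.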
Containers and windowing
- std::deque for the sliding window (Bollinger): pop_front is O(1), which is convenient for fixed-window sizes.
- std::vector for TimeSeries<T> internal storage: reserve() is used to avoid reallocations; it provides contiguous storage and fast random access.
- TimeSeries::tail(n) returns a copy of the last n elements: useful, but it copies data; for very high-performance usage prefer views or indices.
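A non-copying alternative to tail(n) might look like the sketch below. `TailView` and `tail_view` are hypothetical names, and the example operates on a plain std::vector rather than the repo's TimeSeries<T>; in C++20 code, std::span<const T> would serve the same purpose.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view over the last n elements of a vector.
template <typename T>
struct TailView {
    const T* data;
    std::size_t size;

    const T* begin() const { return data; }
    const T* end() const { return data + size; }
    const T& operator[](std::size_t i) const {
        assert(i < size);
        return data[i];
    }
};

// Returns a view of the last n elements without copying them.
template <typename T>
TailView<T> tail_view(const std::vector<T>& values, std::size_t n) {
    assert(n <= values.size());
    return TailView<T>{values.data() + (values.size() - n), n};
}
```

The trade-off is lifetime: the view is only valid until the underlying vector reallocates or is destroyed, so it suits short-lived windowed computations rather than stored results.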
O(1) vs O(N) indicator updates
- EMA and RSI (via two EMAs) are O(1) per update and use O(1) memory.
- Bollinger (current): O(N) per update because it recomputes the variance across the window. For performance-critical systems use Welford or a rolling variance.
- Welford's single-pass online variance can be adapted to sliding windows: keep the window contents so the leaving element is known, then adjust the running mean and the sum of squared deviations for the swap in O(1).
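One possible shape for an O(1) Bollinger update is sketched below. It is not the repo's implementation: `RollingBandsSketch` is a hypothetical class that keeps a running mean and sum of squared deviations (M2), uses the standard Welford step during warm-up, and applies a swap update once the window is full. It uses the population standard deviation (divide by N) to match the definition given earlier.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <deque>

// Hypothetical O(1) Bollinger sketch using a sliding Welford-style update.
class RollingBandsSketch {
public:
    RollingBandsSketch(std::size_t period, double k) : period_(period), k_(k) {
        assert(period > 0);
    }

    void update(double price) {
        if (window_.size() < period_) {
            // Warm-up: standard Welford add.
            window_.push_back(price);
            const double delta = price - mean_;
            mean_ += delta / static_cast<double>(window_.size());
            m2_ += delta * (price - mean_);
        } else {
            // Full window: swap the oldest price for the new one in O(1).
            const double old = window_.front();
            window_.pop_front();
            window_.push_back(price);
            const double old_mean = mean_;
            mean_ += (price - old) / static_cast<double>(period_);
            m2_ += (price - old) * (price - mean_ + old - old_mean);
        }
        if (m2_ < 0.0) m2_ = 0.0;  // guard against tiny negative rounding error
    }

    bool is_initialized() const { return window_.size() == period_; }
    double middle() const { return mean_; }
    double std_dev() const {  // population standard deviation over the window
        if (window_.empty()) return 0.0;
        return std::sqrt(m2_ / static_cast<double>(window_.size()));
    }
    double upper() const { return mean_ + k_ * std_dev(); }
    double lower() const { return mean_ - k_ * std_dev(); }

private:
    std::size_t period_;
    double k_;
    std::deque<double> window_;  // still needed to know which price leaves
    double mean_ = 0.0;
    double m2_ = 0.0;            // running sum of squared deviations from the mean
};
```

Incremental updates can accumulate rounding error over very long streams, so recomputing the statistics from the window at intervals may be worth considering.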
Numerical robustness
- Division-by-zero guards (avg_loss == 0). Clamping and capping scores to avoid out-of-range outputs.
- Use of double everywhere for prices and derived math; integer volumes for counts.
Error handling & API resilience
- AlpacaClient::make_request uses exponential backoff with jitter for 429 responses, retrying with increasing delays for a limited number of attempts.
- get_aggregated_historical_bars includes per-minute rate limiting to respect API quotas.
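The delay calculation behind exponential backoff with jitter can be isolated from the HTTP code. The function below is a hypothetical helper, not part of AlpacaClient, and it assumes the "full jitter" variant (a uniform draw between zero and the exponential ceiling); the source does not say which jitter scheme the client uses.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <random>

// Hypothetical helper: "full jitter" delay for retry number `attempt` (0-based).
inline std::chrono::milliseconds backoff_delay(int attempt,
                                               std::chrono::milliseconds base,
                                               std::chrono::milliseconds cap,
                                               std::mt19937& rng) {
    assert(attempt >= 0 && base.count() > 0 && cap >= base);
    // Exponential ceiling base * 2^attempt, clamped to cap.
    // The shift is bounded so the multiplication cannot overflow for sane bases.
    const int shift = std::min(attempt, 20);
    const long long ceiling =
        std::min<long long>(cap.count(), base.count() * (1LL << shift));
    // Uniform draw in [0, ceiling] spreads retries from concurrent clients apart.
    std::uniform_int_distribution<long long> jitter(0, ceiling);
    return std::chrono::milliseconds(jitter(rng));
}
```

A retry loop would sleep for the returned duration after each 429 response and give up once the attempt limit is reached.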
Build & dependency choices
- CMake + FetchContent pulls nlohmann/json, fmt, cpr, and Catch2. This keeps a single build bootstrapping flow.
- Compiler flags in CMakeLists.txt: Release uses -O3 -march=native for performance, Debug uses -g -O0 -Wall -Wextra -Wpedantic for safety.
API and ownership patterns
- Strategies accept std::shared_ptr<AlpacaClient> for flexibility and store a raw pointer internally for speed in hot paths, while the shared_ptr preserves the client's lifetime.
- Value types (Bar, Tick, Quote) are lightweight and are moved or passed by const reference when appropriate.
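The ownership pattern could be sketched as follows. `ClientStub` stands in for AlpacaClient and `StrategySketch` for a strategy class; both are hypothetical, and the sketch assumes the strategy keeps the shared_ptr as a member so the cached raw pointer cannot dangle.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for AlpacaClient so the sketch compiles on its own.
struct ClientStub {
    int calls = 0;
    double mid_price() {
        ++calls;
        return 100.0;
    }
};

class StrategySketch {
public:
    explicit StrategySketch(std::shared_ptr<ClientStub> client)
        : owner_(std::move(client)), client_(owner_.get()) {
        assert(client_ != nullptr);
    }

    // Hot path goes through the cached raw pointer.
    double latest_mid() { return client_->mid_price(); }

private:
    std::shared_ptr<ClientStub> owner_;  // keeps the client alive (declared first)
    ClientStub* client_;                 // cached for hot-path access
};
```

Dereferencing a shared_ptr is itself cheap, so the measurable saving usually comes from not copying the shared_ptr (and touching its atomic reference count) in inner loops rather than from the pointer type as such.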
core/data_types.hpp
- Bar uses int64_t timestamp_ns and an ISO string timestamp for API compatibility.
- TimeSeries includes tail(n) and iterators to make windowed computations straightforward. Consider adding non-copying span-like accessors.
indicators/rolling_ema.hpp
- alpha is computed once at construction (2/(period+1)). The first price seeds the EMA.
- The update path is tiny and suitable for microsecond-level streaming.
indicators/rsi.hpp
- Stores two RollingEMA objects to smooth gains/losses; the first observed price initializes the RSI state.
- Edge-case handling prevents division by zero and yields sensible neutral/limit RSI values.
indicators/bollinger_bands.hpp
- Currently computes the variance by iterating the deque. For high-frequency data, replace this with an online variance algorithm.
- Exposes is_initialized() and value() so consumers can decide when the bands are valid.
data/alpaca_client.*
- Rate-limited aggregator loops over day offsets while respecting per-minute call budgets.
- Uses cpr (libcpr) for HTTP and nlohmann::json for parsing; robust error logging and retry/backoff.
- Important: reverses aggregated bars into chronological order (oldest->newest) to ensure correct indicator/warmup semantics.
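A per-minute call budget of the kind described can be modeled as a sliding window over recent call times. `MinuteBudget` is a hypothetical class for illustration; the source does not describe how the client tracks its budget, and the current time is passed in as a parameter here so the logic can be exercised without a real clock.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical sliding-window budget: at most max_calls in any 60-second span.
class MinuteBudget {
public:
    explicit MinuteBudget(std::size_t max_calls) : max_calls_(max_calls) {
        assert(max_calls > 0);
    }

    // Returns true and records the call if the budget allows one at now_ns.
    bool try_acquire(std::int64_t now_ns) {
        constexpr std::int64_t kWindowNs = 60LL * 1000000000LL;
        // Forget calls that have aged out of the 60-second window.
        while (!calls_.empty() && now_ns - calls_.front() >= kWindowNs) {
            calls_.pop_front();
        }
        if (calls_.size() >= max_calls_) return false;
        calls_.push_back(now_ns);
        return true;
    }

private:
    std::size_t max_calls_;
    std::deque<std::int64_t> calls_;  // timestamps of calls inside the window
};
```

When try_acquire returns false, an aggregator loop would wait until the oldest recorded call leaves the window before issuing the next request.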
backtest/backtest_engine.*
- Portfolio tracks cash, shares_held, trade_history, and daily_values. Methods execute_buy/execute_sell update state and record trades.
- calculate_final_metrics() pairs sells with prior buys using average cost per share and counts winning/losing cycles.
- Max drawdown uses daily_values if available; otherwise it falls back to a conservative heuristic.
strategy/mean_reversion_strategy.hpp
- Contains calculate_confidence(...): a small ensemble scoring system combining multiple market signals into a single confidence number.
- generate_signal uses the mid-price from the quote API and applies thresholds plus a confidence test.
- backtest() warms up indicators for warmup_periods, then iterates bars, producing a StrategyResult per bar.
Prerequisites: CMake 3.20+, a modern C++ compiler (g++/clang with C++20), network access for FetchContent dependencies.
- Build the library and apps
cd quantlab-cpp
mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)
- Set Alpaca environment variables (copy .env.template)
cp .env.template .env
export ALPACA_API_KEY_ID=your_key_here
export ALPACA_API_SECRET_KEY=your_secret_here
export ALPACA_BASE_URL=https://paper-api.alpaca.markets
- Run the institutional backtest example
# Default: TSLA 120 days, 65% confidence
./apps/institutional_backtest TSLA 120 0.65 30 70
# Example with other args
./apps/institutional_backtest AAPL 365 0.8 25 75
- Run strategy optimizer (parameter sweep)
./apps/strategy_optimizer
# Outputs: optimization_results.csv and optimization_results.json
Notes: Both apps use AlpacaClient and will fail fast if required env vars are missing. The aggregator respects API rate limits and includes retries.
- Replace Bollinger variance computation with Welford/rolling variance to achieve O(1) updates for band width.
- Replace simplified Sharpe with a proper daily-return-based Sharpe calculation and add annualization and risk-free parameter.
- Add unit tests for each indicator (use Catch2) with deterministic synthetic data (sin waves, step functions, volatility bursts).
- Add an integration test that runs institutional_backtest with a local stubbed AlpacaClient (or canned JSON) to avoid network access in CI.
- Add a quantlab-cpp/README.md with this material split across per-module deep dives and math derivations.
This section collects important math notes and implementation details found in the quantlab-cpp/ sources, plus quick commands to build and run the C++ apps.
High-level contents scanned:
- src/core/: data types and time-series container design (Bar, Tick, Trade, Quote, TimeSeries template)
- src/indicators/: Rolling EMA, RSI, Bollinger Bands (implementation notes and formulas)
- src/data/: AlpacaClient with rate-limited aggregation and robust HTTP retry/backoff
- src/backtest/: BacktestEngine and Portfolio accounting, P&L, drawdown, and simplified Sharpe approximation
- apps/: institutional_backtest and strategy_optimizer CLI apps
Key mathematical notes (from comments and inline explanations):
- Exponential Moving Average (EMA): alpha = 2 / (N + 1); EMA_today = alpha * Price_today + (1 - alpha) * EMA_yesterday. EMA initialized with first price.
- RSI (14): uses smoothed averages of gains and losses (implemented with two RollingEMA instances). RSI = 100 - 100/(1 + RS) where RS = avg_gain / avg_loss. Edge cases are handled explicitly (zero loss -> RSI = 100, no change -> RSI = 50).
- Bollinger Bands (typical period=20, k=2): middle = SMA(period), std_dev = sqrt(sum((xi - mean)^2)/N), upper = SMA + k * std_dev, lower = SMA - k * std_dev. Interpretation and volatility notes in comments.
- Backtest P&L cycle handling: matches BUY/SELL cycles by tracking position cost and reducing cost basis proportionally when partial sells occur. Win/loss aggregation computes profit factor, win rate, avg win/loss.
- Max drawdown: uses daily_values if available (peak -> trough calculation). Otherwise uses simplified peak vs current cash heuristic. Sharpe ratio approximated: (total_return_pct - 2%) / 15 (note: simplified placeholder, not daily-return based).
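For comparison with the placeholder, a daily-return-based version of both metrics might look like this. The function names are hypothetical, and the 252 trading days per year, the simple division of the annual risk-free rate into a daily rate, and the use of sample variance are assumptions of this sketch rather than choices taken from the source.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Largest peak-to-trough decline as a fraction of the peak (0.25 means 25%).
inline double max_drawdown(const std::vector<double>& daily_values) {
    double peak = 0.0;
    double worst = 0.0;
    for (double v : daily_values) {
        if (v > peak) peak = v;
        if (peak > 0.0) worst = std::max(worst, (peak - v) / peak);
    }
    return worst;
}

// Annualized Sharpe ratio from daily portfolio values.
// Assumptions: 252 periods per year, risk-free rate split evenly per period.
inline double sharpe_ratio(const std::vector<double>& daily_values,
                           double annual_risk_free = 0.02,
                           double periods_per_year = 252.0) {
    if (daily_values.size() < 3) return 0.0;  // need at least two returns
    const double daily_rf = annual_risk_free / periods_per_year;
    std::vector<double> excess;
    excess.reserve(daily_values.size() - 1);
    for (std::size_t i = 1; i < daily_values.size(); ++i) {
        assert(daily_values[i - 1] > 0.0);
        excess.push_back(daily_values[i] / daily_values[i - 1] - 1.0 - daily_rf);
    }
    double mean = 0.0;
    for (double r : excess) mean += r;
    mean /= static_cast<double>(excess.size());
    double var = 0.0;
    for (double r : excess) var += (r - mean) * (r - mean);
    var /= static_cast<double>(excess.size() - 1);  // sample variance
    if (var < 1e-18) return 0.0;  // guard: (near-)constant returns
    return mean / std::sqrt(var) * std::sqrt(periods_per_year);
}
```

For the series 100, 120, 90, 110 the drawdown is measured from the peak of 120 to the trough of 90, giving 0.25.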
C++ patterns, tricks and noteworthy choices found in source code:
- Modern C++ (C++20) enabled via CMakeLists.txt, targeting -O3 -march=native for Release builds.
- Header-only style structs and a templated TimeSeries<T> container optimized for rolling-window access: uses std::vector internally with reserve/tail helpers and assert bounds checks.
- Rolling indicators are O(1) per update where possible: RollingEMA keeps a single running EMA state; RSI uses two EMAs for gains/losses for smoothed RS calculation; Bollinger uses a fixed-size deque for the window and computes variance each update (could be optimized further with Welford/online variance).
- Use of std::deque for the sliding window in Bollinger to pop_front efficiently; std::accumulate is used for the SMA calculation.
- Defensive coding for numeric edge cases: checks for division by zero in profit factor, RSI and Bollinger width checks, clamps scores to [0,1].
- Shared ownership with std::shared_ptr for the AlpacaClient passed to strategy classes; a raw pointer is used inside the class for performance/ABI simplicity.
- API client uses cpr for HTTP, nlohmann::json for parsing, and a retry loop with exponential backoff + jitter for rate limiting.
- Small pragmatic tradeoffs (explicitly commented): storing timestamps as int64_t nanoseconds to avoid floating-point precision loss, and storing ISO strings for API compatibility.
- String-based lightweight JSON printing for small CLI apps (manual formatting) for portability without spinning up heavy JSON serialization in production paths.
Developer quick-start (build & run)
- Create a build directory and run CMake (from repo root):
cd quantlab-cpp
mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)
- Run the institutional backtest (example):
./apps/institutional_backtest TSLA 120 0.65 30 70
# args: <SYMBOL> <DAYS> <CONFIDENCE_THRESHOLD> <RSI_OVERSOLD> <RSI_OVERBOUGHT>
- Run the strategy optimizer (example):
./apps/strategy_optimizer
# This will run the small parameter grid and emit CSV/JSON results in the repo root
Environment variables and credentials
- Copy quantlab-cpp/.env.template to .env or set environment variables in your shell:
- ALPACA_API_KEY_ID
- ALPACA_API_SECRET_KEY
- ALPACA_BASE_URL (e.g., https://paper-api.alpaca.markets)
Notes and next steps
- The Bollinger Bands implementation computes variance using an O(N) loop every update; consider converting to Welford's online algorithm to make it truly O(1) per update.
- The Sharpe ratio calculation in BacktestEngine is a simplified placeholder. For production, compute daily returns and standard deviation, then annualize.
- Consider adding unit tests (Catch2 is already fetched in CMake) for indicators (EMA/RSI/Bollinger) with deterministic inputs to validate edge cases.