JayGLXR/FinOptExplorer

FinOptExplorer

A fun benchmarking framework for testing memory allocators, numeric representations, and SIMD optimizations in the context of high-performance financial systems.

Components

1. Order Book Engine (order_book_bench)

A limit order book implementation with swappable memory allocators:

  • Standard: Uses std::allocator (operator new/delete, typically backed by malloc/free)
  • Pool: Fixed-size block allocator with O(1) alloc/free
  • Arena: Bump allocator with bulk deallocation

Features:

  • Price-time priority matching
  • Configurable numeric representations (int64, double, fixed-point)
  • Overflow handling modes (undefined, checked, saturating, widening)
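
As a rough illustration of price-time priority, the sketch below keeps resting sell orders in a price-sorted map of FIFO queues and matches an incoming buy against the lowest price first, then the oldest order at that price. The names `Order`, `AskBook`, and `match_buy` are hypothetical and are not taken from the repository; the real engine may be structured differently.

```cpp
#include <algorithm>
#include <cstdint>
#include <deque>
#include <map>

// Hypothetical order record: prices are integer ticks.
struct Order {
    std::uint64_t id;
    std::int64_t price_ticks;
    std::int64_t quantity;
};

// Sell side of a book. std::map keeps price levels sorted ascending,
// and each level is a FIFO queue, which gives price-time priority.
class AskBook {
public:
    void add(const Order& order) {
        levels_[order.price_ticks].push_back(order);
    }

    // Matches an incoming buy with the given limit price.
    // Returns the total quantity filled.
    std::int64_t match_buy(std::int64_t limit_ticks, std::int64_t quantity) {
        std::int64_t filled = 0;
        while (quantity > 0 && !levels_.empty()) {
            auto best = levels_.begin();            // lowest ask price
            if (best->first > limit_ticks) break;   // no longer crosses
            auto& queue = best->second;
            Order& resting = queue.front();         // oldest order at this price
            const std::int64_t traded = std::min(quantity, resting.quantity);
            resting.quantity -= traded;
            quantity -= traded;
            filled += traded;
            if (resting.quantity == 0) queue.pop_front();
            if (queue.empty()) levels_.erase(best);
        }
        return filled;
    }

    bool empty() const { return levels_.empty(); }

private:
    std::map<std::int64_t, std::deque<Order>> levels_;
};
```

A node-based container such as `std::map` allocates on every new price level, which is exactly the kind of traffic a swappable allocator would be benchmarked against.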

2. Risk Aggregation Engine (risk_engine_bench)

Portfolio Greeks computation with multiple implementation strategies:

  • Scalar Naive: Simple loop
  • Scalar Unrolled: 4x loop unrolling
  • Kahan Summation: Compensated summation for accuracy
  • Pairwise: Divide-and-conquer summation
  • AVX2: 256-bit SIMD with FMA
  • AVX-512: 512-bit SIMD
  • AVX2 + Kahan: SIMD with compensation

Includes DAZ/FTZ (Denormals-Are-Zero/Flush-To-Zero) mode testing.
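
To make the scalar strategies concrete, here is a minimal sketch of naive, Kahan, and pairwise summation over a plain array of doubles. The function names and the base-case size of 8 are assumptions for illustration, not the repository's implementation. Compensated summation relies on strict IEEE arithmetic, so it should not be compiled with `-ffast-math`.

```cpp
#include <cstddef>
#include <vector>

// Naive left-to-right accumulation: rounding error can grow with n.
double sum_naive(const std::vector<double>& xs) {
    double sum = 0.0;
    for (double x : xs) sum += x;
    return sum;
}

// Kahan summation: `c` carries the low-order bits lost by each addition.
double sum_kahan(const std::vector<double>& xs) {
    double sum = 0.0;
    double c = 0.0;
    for (double x : xs) {
        const double y = x - c;    // apply the correction from the last step
        const double t = sum + y;  // low-order bits of y may be lost here
        c = (t - sum) - y;         // recover what was lost
        sum = t;
    }
    return sum;
}

// Pairwise summation: split the range in half and add the partial sums,
// so the error grows roughly with log(n) instead of n.
double sum_pairwise(const double* xs, std::size_t n) {
    if (n <= 8) {  // small base case summed directly
        double sum = 0.0;
        for (std::size_t i = 0; i < n; ++i) sum += xs[i];
        return sum;
    }
    const std::size_t half = n / 2;
    return sum_pairwise(xs, half) + sum_pairwise(xs + half, n - half);
}
```

Summing one million copies of 0.1 is a simple way to see the difference: the naive result drifts visibly away from 100000, while the compensated and pairwise results stay much closer.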

3. Feed Parser (feed_parser_bench)

Binary protocol parser with fault injection for robustness testing:

  • Simulated ITCH-style market data protocol
  • Length validation and overflow protection
  • Sequence gap detection
  • Fault injection: bit flips, truncation, invalid lengths/types
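
The checks listed above can be sketched for a single frame as follows. The wire layout used here (2-byte big-endian length, 1-byte type, 8-byte big-endian sequence number) and the accepted type codes are assumptions made for the example; the simulated protocol in the repository may differ.

```cpp
#include <cstddef>
#include <cstdint>

enum class ParseStatus { Ok, Truncated, InvalidLength, InvalidType, SequenceGap };

struct Message {
    char type;
    std::uint64_t sequence;
    std::size_t frame_size;  // bytes consumed, including the length prefix
};

// Reads an n-byte big-endian unsigned integer (n <= 8).
inline std::uint64_t read_be(const std::uint8_t* p, std::size_t n) {
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < n; ++i) value = (value << 8) | p[i];
    return value;
}

// Hypothetical frame: [u16 length][u8 type][u64 sequence][payload...],
// where `length` counts every byte after the length field.
ParseStatus parse_frame(const std::uint8_t* buf, std::size_t size,
                        std::uint64_t expected_seq, Message& out) {
    constexpr std::size_t kPrefix = 2;       // length field
    constexpr std::size_t kMinBody = 1 + 8;  // type + sequence
    if (size < kPrefix) return ParseStatus::Truncated;

    const std::size_t body = static_cast<std::size_t>(read_be(buf, kPrefix));
    if (body < kMinBody) return ParseStatus::InvalidLength;
    // Compare against the remaining bytes instead of computing kPrefix + body
    // first, so a corrupted length can never cause an out-of-bounds read.
    if (body > size - kPrefix) return ParseStatus::Truncated;

    const char type = static_cast<char>(buf[kPrefix]);
    if (type != 'A' && type != 'E' && type != 'X') return ParseStatus::InvalidType;

    out = {type, read_be(buf + kPrefix + 1, 8), kPrefix + body};
    return out.sequence == expected_seq ? ParseStatus::Ok : ParseStatus::SequenceGap;
}
```

Fault injection then amounts to mutating a valid buffer (flipping a bit, shortening `size`, rewriting the length or type byte) and checking that the parser reports an error instead of reading past the buffer.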

4. Allocator Benchmark (allocator_bench)

Pure allocator performance comparison:

  • Pool Allocator: Fixed-size blocks, free list
  • Arena Allocator: Bump pointer, bulk free
  • Slab Allocator: Object-specific caching
  • Size-Class Allocator: jemalloc-style size classes
  • Thread-Safe Pool: Lock-free concurrent allocator

Building

Prerequisites

  • CMake 3.16+
  • C++20 compiler (GCC 10+, Clang 12+)
  • Optional: AVX2/AVX-512 capable CPU

Build Commands

# Standard build
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j$(nproc)

# With sanitizers (for debugging)
cmake -DCMAKE_BUILD_TYPE=Debug -DENABLE_SANITIZERS=ON ..
make -j$(nproc)

# With AVX-512
cmake -DCMAKE_BUILD_TYPE=Release -DENABLE_AVX512=ON ..
make -j$(nproc)

Running Benchmarks

Interactive Mode

# Order book benchmark
./order_book_bench

# Risk engine benchmark
./risk_engine_bench

# With DAZ/FTZ enabled
./risk_engine_bench --daz-ftz

# Subnormal stress test
./risk_engine_bench --subnormal

# Cancellation stress test
./risk_engine_bench --cancellation

# Feed parser benchmark
./feed_parser_bench

# Allocator benchmark
./allocator_bench

JSON Mode (for Python harness)

echo '{"config": {"allocator": "pool", "operation_count": 100000}}' | ./order_book_bench --json

echo '{"config": {"position_count": 100000, "daz_ftz": true}}' | ./risk_engine_bench --json

Python Benchmark Harness

cd python
pip install -r requirements.txt

# Run all benchmarks
python benchmark_harness.py --build-dir ../build

# Run specific benchmark
python benchmark_harness.py --test allocator
python benchmark_harness.py --test orderbook --operations 500000
python benchmark_harness.py --test risk --positions 1000000

# Property-based tests
python property_tests.py

Numeric Configurations

Integer Overflow Handling

enum class OverflowHandling {
    UNDEFINED,        // No check; signed overflow is UB in C++ (wrapping is not guaranteed)
    CHECKED_ABORT,    // Abort on overflow
    CHECKED_SATURATE, // Clamp to INT64_MAX/MIN
    WIDENING,         // Use __int128 intermediate
};
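
As a sketch of how the checked modes might behave for addition, the helpers below use the GCC/Clang builtin `__builtin_add_overflow`, which matches the compilers listed in the prerequisites. The function names `add_checked` and `add_saturate` are hypothetical and not taken from the repository.

```cpp
#include <cstdint>
#include <cstdlib>
#include <limits>

// CHECKED_ABORT style: terminate the process if a + b does not fit in int64.
std::int64_t add_checked(std::int64_t a, std::int64_t b) {
    std::int64_t result;
    if (__builtin_add_overflow(a, b, &result)) std::abort();
    return result;
}

// CHECKED_SATURATE style: clamp to the nearest representable bound.
std::int64_t add_saturate(std::int64_t a, std::int64_t b) {
    std::int64_t result;
    if (__builtin_add_overflow(a, b, &result)) {
        // Addition can only overflow upward when b is positive
        // and downward when b is negative.
        return b > 0 ? std::numeric_limits<std::int64_t>::max()
                     : std::numeric_limits<std::int64_t>::min();
    }
    return result;
}
```

The builtin performs the operation as if in infinite precision and reports whether the result fit, so neither helper ever executes an overflowing signed addition.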

Price Representations

enum class PriceRepresentation {
    INT64_TICKS,       // Integer tick counts
    DOUBLE_NATIVE,     // IEEE 754 double
    DOUBLE_DAZ_FTZ,    // Double with denormals flushed
    FIXED_POINT_32_32, // 32.32 fixed point
};
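
For readers unfamiliar with 32.32 fixed point, the sketch below stores a value as a 64-bit integer scaled by 2^32 and uses a 128-bit intermediate for multiplication. The type name `Fixed32_32` and its members are hypothetical; `__int128` is a GCC/Clang extension, and the right shift of a negative intermediate relies on the arithmetic-shift behaviour that C++20 guarantees.

```cpp
#include <cstdint>

// Hypothetical 32.32 fixed-point value: raw = real_value * 2^32.
struct Fixed32_32 {
    static constexpr std::int64_t kOne = std::int64_t{1} << 32;

    std::int64_t raw;

    // Conversion truncates toward zero; values outside roughly +/-2^31 do not fit.
    static constexpr Fixed32_32 from_double(double value) {
        return {static_cast<std::int64_t>(value * kOne)};
    }
    constexpr double to_double() const {
        return static_cast<double>(raw) / kOne;
    }
};

// Addition is plain integer addition on the raw values.
constexpr Fixed32_32 operator+(Fixed32_32 a, Fixed32_32 b) {
    return {a.raw + b.raw};
}

// The product of two 32.32 values carries 64 fractional bits, so compute it
// in 128 bits and shift back down by 32 to return to 32.32.
inline Fixed32_32 operator*(Fixed32_32 a, Fixed32_32 b) {
    const __int128 wide = static_cast<__int128>(a.raw) * b.raw;
    return {static_cast<std::int64_t>(wide >> 32)};
}
```

Values such as 0.25, 0.5, and 1.5 are exactly representable in this format, so sums and products of them round-trip through `to_double()` without error, which is not true of decimal fractions like 0.1.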

Key Experiments

1. Allocator Performance

Compare allocation latency across different strategies:

Small Objects (64 bytes):
  malloc/free:      X.XX ms    Y.YYe+07 ops/s
  Pool:             X.XX ms    Y.YYe+08 ops/s  (10x faster)
  Arena:            X.XX ms    Y.YYe+09 ops/s  (100x faster)

2. SIMD vs Scalar

Compare throughput and accuracy:

Method          Time (μs)    Throughput      Rel. Error    ULP
scalar_naive         XXX     X.XXe+08 /s     X.XXe-16        0
avx2                 XXX     X.XXe+09 /s     X.XXe-16        0
kahan                XXX     X.XXe+08 /s     0.00e+00        0

3. Floating-Point Stress Tests

  • Subnormal Performance: Compare with/without DAZ/FTZ (speedups of 10x or more are possible on subnormal-heavy inputs, depending on the CPU)
  • Cancellation: Alternating large positive/negative values reveal precision loss
  • Accumulation: Long summations show drift in naive implementations
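
The DAZ/FTZ comparison depends on toggling two bits in the x86 MXCSR register. The sketch below does this with the standard `_MM_SET_FLUSH_ZERO_MODE` and `_MM_SET_DENORMALS_ZERO_MODE` macros on x86-64 and reports `false` elsewhere. The helper names are hypothetical, and how `--daz-ftz` is implemented in the repository is not shown in this README.

```cpp
#include <cfloat>

#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>
#define FINOPT_HAS_MXCSR 1
#else
#define FINOPT_HAS_MXCSR 0
#endif

// Turns flush-to-zero (results) and denormals-are-zero (inputs) on or off
// for the calling thread. Returns false when the platform is not x86-64.
bool set_daz_ftz(bool enabled) {
#if FINOPT_HAS_MXCSR
    _MM_SET_FLUSH_ZERO_MODE(enabled ? _MM_FLUSH_ZERO_ON : _MM_FLUSH_ZERO_OFF);
    _MM_SET_DENORMALS_ZERO_MODE(enabled ? _MM_DENORMALS_ZERO_ON
                                        : _MM_DENORMALS_ZERO_OFF);
    return true;
#else
    (void)enabled;
    return false;
#endif
}

// Halves the smallest normal double. The exact result is subnormal, so it
// is non-zero in IEEE mode and becomes zero when FTZ is active.
// `volatile` keeps the compiler from folding the product at compile time.
double halve_smallest_normal() {
    volatile double x = DBL_MIN;
    volatile double half = 0.5;
    return x * half;
}
```

Because the mode is per-thread state, a benchmark would typically set it at the start of the measured thread and restore it afterwards so later measurements are not affected.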

4. Integer Overflow Scenarios

Design order sequences in which:

  • The total quantity overflows a 32-bit integer
  • Price × quantity overflows a 64-bit integer

Then compare checked and unchecked behavior on the same inputs.
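
Both scenarios can be detected by doing the arithmetic in a wider type and checking the result before narrowing. The sketch below is one way to do that; the function names are hypothetical, and `__int128` is a GCC/Clang extension.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Sums 32-bit quantities in a 64-bit accumulator so the total is exact
// even when it no longer fits in 32 bits.
std::uint64_t total_quantity(const std::vector<std::uint32_t>& quantities) {
    std::uint64_t total = 0;
    for (std::uint32_t q : quantities) total += q;
    return total;
}

// Multiplies price by quantity in 128 bits and returns std::nullopt
// when the notional does not fit in a signed 64-bit integer.
std::optional<std::int64_t> notional_widening(std::int64_t price_ticks,
                                              std::int64_t quantity) {
    const __int128 wide = static_cast<__int128>(price_ticks) * quantity;
    if (wide > std::numeric_limits<std::int64_t>::max() ||
        wide < std::numeric_limits<std::int64_t>::min()) {
        return std::nullopt;
    }
    return static_cast<std::int64_t>(wide);
}
```

For example, two quantities of 3,000,000,000 and 2,000,000,000 each fit in 32 bits while their sum does not, and a price and quantity of 4,000,000,000 each produce a notional above the signed 64-bit maximum.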

Architecture Notes

Memory Layout

Structures are cache-line aligned (64 bytes) where beneficial:

struct alignas(64) Position {
    double quantity;
    double delta_per_unit;
    // ...
};
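
The effect of `alignas(64)` can be verified at compile time. The sketch below uses only the two fields shown above (the elided fields are left out) and adds a hypothetical helper, `is_cache_line_aligned`, to check addresses at run time.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct alignas(64) Position {
    double quantity;
    double delta_per_unit;
};

// alignas(64) raises the alignment, and sizeof is always a multiple of the
// alignment, so each Position occupies exactly one 64-byte cache line here.
static_assert(alignof(Position) == 64);
static_assert(sizeof(Position) == 64);

// Run-time check that an address starts on a 64-byte boundary.
inline bool is_cache_line_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 64 == 0;
}
```

The padding is the trade-off: two doubles use 16 of the 64 bytes, so alignment like this avoids false sharing and split loads at the cost of a larger memory footprint. Since C++17, `std::vector<Position>` allocates with the required over-alignment.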

Pool Allocator Design

┌─────────────────────────────────────────┐
│  Free List Head → Block → Block → NULL  │
├─────────────────────────────────────────┤
│  [Block 0] [Block 1] [Block 2] ...      │
│  Fixed size, contiguous memory          │
└─────────────────────────────────────────┘
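
The diagram above can be turned into a minimal single-threaded sketch: one contiguous buffer carved into equal blocks, with the free list stored inside the unused blocks themselves. The class name `FixedPool` and its interface are hypothetical and are not the repository's allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

class FixedPool {
public:
    FixedPool(std::size_t block_size, std::size_t block_count)
        : block_size_(round_up(block_size)),
          storage_(new std::byte[block_size_ * block_count]) {
        assert(block_size > 0);
        // Push blocks in reverse so the first allocation returns block 0.
        for (std::size_t i = block_count; i > 0; --i) {
            deallocate(storage_.get() + (i - 1) * block_size_);
        }
    }

    // O(1): pop the head of the free list, or nullptr when exhausted.
    void* allocate() {
        if (free_list_ == nullptr) return nullptr;
        Node* node = free_list_;
        free_list_ = node->next;
        return node;
    }

    // O(1): reuse the block's own memory as the new free-list head.
    // The caller must pass a pointer previously returned by allocate().
    void deallocate(void* block) {
        free_list_ = new (block) Node{free_list_};
    }

private:
    struct Node { Node* next; };

    // Round up so every block is large enough and aligned for a Node.
    static std::size_t round_up(std::size_t n) {
        constexpr std::size_t a = alignof(std::max_align_t);
        return (n + a - 1) / a * a;
    }

    std::size_t block_size_;
    std::unique_ptr<std::byte[]> storage_;
    Node* free_list_ = nullptr;
};
```

Because freed blocks are pushed onto the head of the list, the most recently freed block is the next one handed out, which tends to keep hot memory in cache.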

Arena Allocator Design

┌─────────────────────────────────────────┐
│  Memory Region                          │
│  [████████████████░░░░░░░░░░░░░░░░░░░]  │
│                   ↑                     │
│              Bump Pointer               │
│  No individual free, bulk reset only    │
└─────────────────────────────────────────┘
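
A bump allocator along the lines of the diagram can be sketched in a few lines. The class name `Arena` and its interface are hypothetical. Alignment here is computed relative to the start of the buffer, so the sketch assumes the requested alignment does not exceed the alignment that `new[]` provides.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

class Arena {
public:
    explicit Arena(std::size_t capacity)
        : buffer_(new std::byte[capacity]), capacity_(capacity) {}

    // Advances the bump pointer; returns nullptr when the region is full.
    // `align` must be a power of two.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        const std::size_t aligned = (offset_ + align - 1) & ~(align - 1);
        // Written as two comparisons so aligned + size cannot overflow.
        if (aligned > capacity_ || size > capacity_ - aligned) return nullptr;
        offset_ = aligned + size;
        return buffer_.get() + aligned;
    }

    // Bulk deallocation: every previous allocation becomes invalid at once.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }

private:
    std::unique_ptr<std::byte[]> buffer_;
    std::size_t capacity_;
    std::size_t offset_ = 0;
};
```

Allocation is a handful of integer operations with no per-block bookkeeping, which is why an arena is usually the fastest of the three strategies, at the cost of not being able to free individual objects.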

Expected Results

Order Book Throughput (M ops/sec)

Allocator    10K ops    100K ops    1M ops
Standard     ~1.5       ~1.2        ~1.0
Pool         ~3.0       ~2.5        ~2.0
Arena        ~5.0       ~4.0        ~3.0

Risk Engine Throughput (positions/sec)

Method    10K pos    100K pos    1M pos
scalar    ~50M       ~50M        ~50M
avx2      ~200M      ~200M       ~200M
avx512    ~400M      ~400M       ~400M
kahan     ~25M       ~25M        ~25M

License

MIT License - See LICENSE file
