High-Performance Bitcoin Puzzle Solver
Optimized for Apple Silicon & NVIDIA CUDA
Features • Quick Start • CUDA • Examples • Performance
Keyhunt is a specialized tool for solving Bitcoin Puzzle Transactions - a series of increasingly difficult challenges with ~1000 BTC in prizes. This version is heavily optimized for:
- Apple Silicon (M1/M2/M3/M4) - Unified memory + powerful cores
- NVIDIA CUDA - Massively parallel 32-bit operations
Why 32-bit chunks on 64-bit hardware?
The secp256k1 curve uses 256-bit integers. We break them into 8 × 32-bit limbs:
256-bit key = [limb0][limb1][limb2][limb3][limb4][limb5][limb6][limb7]
                32     32     32     32     32     32     32     32    (bits per limb)
Benefits:
| Platform | Why 32-bit is Faster |
|---|---|
| Apple Silicon | Better register utilization, efficient carry chains |
| NVIDIA CUDA | Integer ALUs are 32-bit; 64-bit integer math is emulated with multiple 32-bit instructions |
| Both | Enables range halving optimizations |
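As a rough illustration of the limb layout described above, the sketch below splits a 256-bit integer into eight 32-bit limbs and adds two such values with an explicit carry chain. It is written in Python for readability and is not taken from the keyhunt source. The helper names (`to_limbs`, `from_limbs`, `add_limbs`) and the little-endian limb order (limb 0 holds the lowest 32 bits) are assumptions made for this example.

```python
MASK32 = 0xFFFFFFFF  # keeps the low 32 bits of a value
LIMBS = 8            # 8 x 32 bits = 256 bits


def to_limbs(value):
    """Split a 256-bit integer into 8 limbs (assumed order: limb 0 = lowest bits)."""
    return [(value >> (32 * i)) & MASK32 for i in range(LIMBS)]


def from_limbs(limbs):
    """Recombine limbs produced by to_limbs into a single integer."""
    return sum(limb << (32 * i) for i, limb in enumerate(limbs))


def add_limbs(a, b):
    """Add two limb arrays; return (result_limbs, carry_out)."""
    out, carry = [], 0
    for x, y in zip(a, b):
        total = x + y + carry        # fits in 33 bits at most
        out.append(total & MASK32)   # low 32 bits stay in this limb
        carry = total >> 32          # bit 32 moves on to the next limb
    return out, carry
```

Adding 1 to 2^256 - 1 with `add_limbs` wraps every limb to zero and leaves a carry of 1, which is the overflow case that a later modular reduction step has to handle.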
| Feature | Description |
|---|---|
| BSGS Algorithm | Baby Step Giant Step - reduces O(n) to O(√n) |
| Bloom Filters | 3-level cascade for lightning-fast lookups |
| Endomorphism | Curve trick for 2-3x speedup |
| Multi-threaded | Scales across all CPU cores |
| CUDA Support | Offload to NVIDIA GPUs (NEW!) |
| Checkpointing | Save/resume long searches |
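The Endomorphism row refers to a property of secp256k1: its field prime p satisfies p ≡ 1 (mod 3), so there is a β ≠ 1 with β³ ≡ 1 (mod p), and (βx, y) lies on y² = x³ + 7 whenever (x, y) does. One computed point therefore yields three x-coordinates to test. The Python sketch below only checks that property numerically. How keyhunt uses it internally is not shown in this README, and the function names are hypothetical.

```python
P_FIELD = 2**256 - 2**32 - 977  # secp256k1 field prime


def cube_root_of_unity(p):
    """Return some beta != 1 with beta**3 == 1 (mod p); requires p % 3 == 1."""
    g = 2
    while True:
        beta = pow(g, (p - 1) // 3, p)
        if beta != 1:
            return beta
        g += 1


def on_curve(x, y, p=P_FIELD):
    """True if (x, y) satisfies y^2 = x^3 + 7 (mod p)."""
    return (y * y - x * x * x - 7) % p == 0


def find_point(p=P_FIELD):
    """Find some affine point on the curve by trying small x values."""
    x = 1
    while True:
        rhs = (x**3 + 7) % p
        y = pow(rhs, (p + 1) // 4, p)  # square-root candidate, valid since p % 4 == 3
        if y * y % p == rhs:
            return x, y
        x += 1


beta = cube_root_of_unity(P_FIELD)
x, y = find_point()
# Three related points obtained with two field multiplications instead of
# two full scalar multiplications.
candidates = [(x, y), (beta * x % P_FIELD, y), (beta * beta * x % P_FIELD, y)]
```

The search for β may return either of the two non-trivial cube roots of unity; both map curve points to curve points, so the check holds for whichever one is found.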
# Install dependencies
brew install cmake openssl@3 gmp
# Clone and build
git clone https://github.com/consigcody94/keyhuntM1CPU.git
cd keyhuntM1CPU
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j$(sysctl -n hw.ncpu)
# Hunt!
./build/keyhunt -m bsgs -f tests/66.txt -b 66 -t 8 -R

# Install dependencies
sudo apt install cmake libssl-dev libgmp-dev nvidia-cuda-toolkit
# Build with CUDA
cmake -B build -DCMAKE_BUILD_TYPE=Release -DKEYHUNT_USE_CUDA=ON
cmake --build build -j$(nproc)
# Hunt with GPU!
./build/keyhunt -m bsgs -f tests/66.txt -b 66 --gpu -g 0

CUDA acceleration uses the same 32-bit limb strategy but runs thousands of parallel searches:
┌───────────────────────────────────────────────────────────┐
│                        NVIDIA GPU                         │
│  ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐  │
│  │ SM0 │ │ SM1 │ │ SM2 │ │ SM3 │ │ SM4 │ │ SM5 │ │ ... │  │
│  │32bit│ │32bit│ │32bit│ │32bit│ │32bit│ │32bit│ │32bit│  │
│  │ x64 │ │ x64 │ │ x64 │ │ x64 │ │ x64 │ │ x64 │ │ x64 │  │
│  └─────┘ └─────┘ └─────┘ └─────┘ └─────┘ └─────┘ └─────┘  │
│       Each SM runs 64 threads of 32-bit operations        │
└───────────────────────────────────────────────────────────┘
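One common way to run many independent searches in parallel is to give each worker its own contiguous slice of the key range. The Python sketch below shows that partitioning on the host side. It is an assumption about the general approach, not a description of keyhunt's CUDA kernels, and `split_range` is a hypothetical helper.

```python
def split_range(start, end, workers):
    """Split the inclusive range [start, end] into up to `workers` contiguous chunks."""
    total = end - start + 1
    base, extra = divmod(total, workers)
    chunks, lo = [], start
    for w in range(workers):
        size = base + (1 if w < extra else 0)  # spread the remainder over the first chunks
        if size == 0:
            break                              # more workers than keys
        chunks.append((lo, lo + size - 1))
        lo += size
    return chunks


# Example: a 66-bit puzzle range divided among 256 hypothetical workers.
chunks = split_range(2**65, 2**66 - 1, 256)
```

Each chunk could then be searched independently, with no coordination needed until a worker reports a match.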
| Flag | Description |
|---|---|
| `--gpu` | Enable GPU acceleration |
| `-g <id>` | Select GPU device (0, 1, ...) |
| `--gpu-threads <n>` | Threads per block (default: 256) |
| `--gpu-blocks <n>` | Number of blocks (default: auto) |
| GPU | 32-bit Cores | Expected Speed |
|---|---|---|
| RTX 4090 | 16384 | 🔥🔥🔥🔥🔥 |
| RTX 4080 | 9728 | 🔥🔥🔥🔥 |
| RTX 3090 | 10496 | 🔥🔥🔥🔥 |
| RTX 3080 | 8704 | 🔥🔥🔥 |
| RTX 3070 | 5888 | 🔥🔥🔥 |
| GTX 1080 Ti | 3584 | 🔥🔥 |
# CPU only
./build/keyhunt -m bsgs -f tests/66.txt -b 66 -t 8 -R -S

# With CUDA
./build/keyhunt -m bsgs -f tests/66.txt -b 66 --gpu -g 0 -R -S

# 130-bit target with K factor 2
./build/keyhunt -m bsgs -f tests/130.txt -b 130 -t 8 --gpu -S -k 2

# Custom hex range
./build/keyhunt -m bsgs -f target.txt \
  -r 20000000000000000:3FFFFFFFFFFFFFFFF \
  -t 8 --gpu -S

Brute Force: O(2^66) = 73,786,976,294,838,206,464 operations
BSGS:        O(2^33) = 8,589,934,592 operations
That's about 8.6 billion times fewer operations!
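The figures above follow from taking the square root of the search space. Here is a small Python check of that arithmetic, under this README's simplification that a 66-bit search covers 2^66 candidates (`search_costs` is a name invented for this example):

```python
from math import isqrt


def search_costs(bits):
    """Return (brute_force_ops, bsgs_ops, ratio) for a keyspace of 2**bits."""
    space = 1 << bits
    root = isqrt(space)
    if root * root < space:  # round up when the space is not a perfect square
        root += 1
    return space, root, space // root


brute, bsgs_ops, ratio = search_costs(66)
print(f"{brute:,} vs {bsgs_ops:,} operations, ratio {ratio:,}")
```

The same function can be called with other bit sizes, such as 130, to see how quickly even the square-root cost grows.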
| Hardware | Keys/sec | Time to Search |
|---|---|---|
| Intel i9-13900K | ~50M | ~170 seconds |
| Apple M3 Max | ~80M | ~107 seconds |
| RTX 3080 | ~500M | ~17 seconds |
| RTX 4090 | ~1.2B | ~7 seconds |
Note: The times shown are 2^33 operations divided by the listed rate. Actual performance varies with the BSGS parameters.
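The "Time to Search" column can be reproduced by dividing the 2^33 BSGS operations quoted earlier by each rate. A minimal Python sketch, assuming the rates in the table are sustained and ignoring table-building time (`seconds_to_search` is a hypothetical helper):

```python
BSGS_OPS = 2**33  # operation count quoted for the 66-bit example

# Rates copied from the table above, in keys per second.
RATES = {
    "Intel i9-13900K": 50e6,
    "Apple M3 Max": 80e6,
    "RTX 3080": 500e6,
    "RTX 4090": 1.2e9,
}


def seconds_to_search(operations, keys_per_second):
    """Idealized search time: operations divided by throughput."""
    return operations / keys_per_second


for name, rate in RATES.items():
    print(f"{name}: ~{seconds_to_search(BSGS_OPS, rate):.0f} s")
```

These are back-of-the-envelope figures only. They say nothing about memory limits or the cost of building the baby-step table.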
Usage: keyhunt [options]
Search Modes:
-m bsgs Baby Step Giant Step (fastest for puzzles)
-m address Address brute-force
-m rmd160 RIPEMD-160 hash search
-m xpoint X-coordinate search
Required:
-f <file> Target file (public key or address)
-b <bits> Bit range (e.g., 66)
Optional:
-r <start:end> Custom hex range
-t <threads> CPU threads (default: all cores)
-k <factor> K factor for BSGS table size
-S Save/load bloom filter files
-R Random starting point
-q Quiet mode
-s <seconds> Status interval
CUDA Options:
--gpu Enable GPU acceleration
-g <device> GPU device ID
--gpu-threads Threads per block
--gpu-blocks Number of blocks
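In the Bitcoin puzzle series, the key for puzzle N lies in [2^(N-1), 2^N - 1], and the `-r` example earlier in this README (`20000000000000000:3FFFFFFFFFFFFFFFF`) appears to be exactly that interval for N = 66. Assuming `-b` selects the same interval, the mapping from a bit count to a hex range can be sketched in Python as follows. `bit_range` and `format_range` are hypothetical helpers, not part of keyhunt.

```python
def bit_range(bits):
    """Return the inclusive (start, end) keyspace for a puzzle of `bits` bits."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    return 1 << (bits - 1), (1 << bits) - 1


def format_range(bits):
    """Format the range as START:END in upper-case hex, as passed to -r."""
    start, end = bit_range(bits)
    return f"{start:X}:{end:X}"


print(format_range(66))
```

Narrowing the range by hand with `-r` is then a matter of choosing a sub-interval of what `bit_range` returns.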
keyhunt/
├── CMakeLists.txt          # Build system
├── README.md               # You are here!
├── keyhunt_legacy.cpp      # Main CPU implementation
├── cuda/                   # CUDA kernels (NEW!)
│   ├── secp256k1.cu        # GPU elliptic curve ops
│   └── bsgs_kernel.cu      # GPU BSGS search
├── gmp256k1/               # 32-bit limb arithmetic
├── bloom/                  # Bloom filters
├── hash/                   # SHA256, RIPEMD160
└── tests/                  # Puzzle target files
┌────────────────────────────────────────────────────────────────┐
│                      BABY STEP GIANT STEP                      │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  Target: Find k where k*G = P  (P is the public key)           │
│                                                                │
│  1. BABY STEPS: Compute and store √n points                    │
│     Table = { 0*G, 1*G, 2*G, ..., m*G }  where m = √n          │
│                                                                │
│  2. GIANT STEPS: Check P - j*m*G against table                 │
│     For j = 0,1,2,...,m:                                       │
│       If (P - j*m*G) in Table at index i:                      │
│         k = j*m + i  → FOUND!                                  │
│                                                                │
│  Memory: O(√n)    Time: O(√n)                                  │
│                                                                │
└────────────────────────────────────────────────────────────────┘
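The two phases in the diagram can be sketched in a few dozen lines of Python. This is a toy version for small ranges, assuming plain affine point arithmetic and an in-memory dictionary in place of Bloom filters. It is not keyhunt's implementation, and the function names (`add`, `neg`, `mul`, `bsgs`) are chosen for this example. The curve constants are the standard published secp256k1 parameters.

```python
from math import isqrt

# secp256k1 field prime and generator point (standard published constants).
P_FIELD = 2**256 - 2**32 - 977
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    x1, y1 = a
    x2, y2 = b
    if x1 == x2:
        if (y1 + y2) % P_FIELD == 0:
            return None  # a + (-a)
        # Point doubling; the curve coefficient a is 0 for secp256k1.
        s = 3 * x1 * x1 * pow(2 * y1 % P_FIELD, -1, P_FIELD) % P_FIELD
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_FIELD, -1, P_FIELD) % P_FIELD
    x3 = (s * s - x1 - x2) % P_FIELD
    y3 = (s * (x1 - x3) - y1) % P_FIELD
    return (x3, y3)


def neg(a):
    """Negate a point."""
    return None if a is None else (a[0], -a[1] % P_FIELD)


def mul(k, a):
    """Double-and-add scalar multiplication k*a for k >= 0."""
    result = None
    while k:
        if k & 1:
            result = add(result, a)
        a = add(a, a)
        k >>= 1
    return result


def bsgs(target, n):
    """Return k in [0, n) with k*G == target, or None if there is none."""
    m = isqrt(n - 1) + 1  # ceil(sqrt(n))
    # Baby steps: map i*G -> i for i in [0, m).
    table, point = {}, None
    for i in range(m):
        table[point] = i
        point = add(point, G)
    step = neg(point)  # point is now m*G, so step is -m*G
    # Giant steps: test target - j*m*G against the table.
    gamma = target
    for j in range(m):
        if gamma in table:
            return j * m + table[gamma]
        gamma = add(gamma, step)
    return None


secret = 54321
found = bsgs(mul(secret, G), 2**16)
```

For a 16-bit range this builds a 256-entry table and takes at most 256 giant steps. The dictionary here stands in for the much more compact Bloom-filter lookup that a large search would need.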
- Original keyhunt by albertobsd
- Apple Silicon optimization by @consigcody94
MIT License - Hunt responsibly!
⭐ Star this repo if you find treasure! ⭐
~1000 BTC in unsolved puzzles await...