Build Docker images for the Stockfish chess engine, optimized for modern cloud infrastructure. Targeting the CPU instruction sets available on cloud VMs lets Stockfish use the fastest code paths your hardware supports.
These build patterns power the analysis engine at Disco Chess, where Stockfish is used to:
- Analyze users' games imported from Lichess and Chess.com to detect missed tactical opportunities
- Provide real-time board analysis during game review
What this repo is: A Dockerfile and build script that compiles Stockfish with architecture-specific optimizations. That's it. You bring your own integration (HTTP wrapper, job queue, etc.).
Generic Stockfish binaries target the lowest common denominator CPU to maximize compatibility. Cloud providers (AWS, GCP, Azure) run modern CPUs with advanced instruction sets that Stockfish can leverage:
BMI2 (Bit Manipulation Instructions 2): Chess engines represent the board as 64-bit numbers called "bitboards", with one bit per square. BMI2 includes the PEXT and PDEP instructions, which extract and deposit selected bits in a single instruction, making move generation and attack detection faster. Without BMI2, the engine needs several operations to achieve the same result.
POPCNT (Population Count): Counts how many bits are set to 1 in a number. In chess terms: "how many pieces are on this diagonal?" or "how many squares can this piece attack?" A single instruction replaces what would otherwise be a loop.
AVX2/NEON (Vector Instructions): Process multiple numbers simultaneously. Used heavily by Stockfish's NNUE neural network evaluation.
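As a rough illustration of the POPCNT description above, the sketch below counts the set bits of a bitboard the slow way, with a loop. The function name `popcount` and the pawn bitboard value are illustrative and not taken from Stockfish; a POPCNT-enabled build gets the same answer from one hardware instruction.

```shell
# popcount N: count the set bits of N with Kernighan's loop.
# Each iteration clears the lowest set bit, so the loop runs once per set bit.
# The body runs in a subshell "( ... )" so its variables do not leak out.
popcount() (
    n=$1
    count=0
    while [ "$n" -ne 0 ]; do
        n=$(( n & (n - 1) ))
        count=$(( count + 1 ))
    done
    echo "$count"
)

# Illustrative bitboard: white pawns on their starting rank occupy bits 8-15.
white_pawns=$(( 0xFF00 ))
echo "pawns on board: $(popcount "$white_pawns")"
```

The loop cost grows with the number of set bits, which is the overhead a hardware population count removes.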
The actual performance gain from BMI2 over a generic x86-64 build is typically 5-10%, varying by specific CPU and workload. This is a modest but consistent improvement.
The larger gains come from:
- Using newer cloud VM families (e.g., GCP c3d vs c3 can show ~25% difference)
- Multi-threading efficiency
- Proper hash table sizing
Note: Run `stockfish` with the `bench` command on your target hardware to get accurate numbers for your specific configuration.

    # Build for your architecture
    ./build.sh

    # Run
    docker run --rm stockfish-optimized stockfish

| Architecture | CPU Feature | Minimum CPU |
|---|---|---|
| `linux/amd64` | `x86-64-bmi2` | Haswell (2013+) |
| `linux/arm64` | `armv8` | Apple M1, Graviton2+ |

Check if your system supports BMI2:

    # Linux
    grep -q bmi2 /proc/cpuinfo && echo "BMI2 supported" || echo "BMI2 not supported"

    # macOS on Intel (BMI2 is listed under leaf7_features; Apple Silicon uses the armv8 build instead)
    sysctl -n machdep.cpu.leaf7_features | grep -i bmi2

| Provider | Instance Type | BMI2 Support |
|---|---|---|
| AWS | c5, m5, r5+ | Yes |
| AWS | t3, t3a | Yes |
| AWS | Graviton2/3 (arm64) | N/A (armv8) |
| GCP | n2, c2, e2 | Yes |
| Azure | Dv3, Ev3+ | Yes |
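The BMI2 check above could be folded into a small helper that prints a build target name. This is a minimal sketch: the function name `pick_arch` is hypothetical, it only distinguishes the three targets named in this README, and it takes the machine type and CPU info text as arguments so it can be tried without touching the host.

```shell
# pick_arch MACHINE CPUINFO_TEXT
# Prints a Stockfish build target based on the machine type and CPU flags.
# Limitation: it cannot tell fast BMI2 from slow PEXT/PDEP (e.g. AMD Excavator).
pick_arch() (
    machine=$1
    cpuinfo=$2
    case "$machine" in
        aarch64|arm64)
            echo "armv8"
            exit 0
            ;;
    esac
    # -w matches "bmi2" as a whole word within the flags line
    if printf '%s\n' "$cpuinfo" | grep -qw bmi2; then
        echo "x86-64-bmi2"
    else
        echo "x86-64"
    fi
)

# Usage on Linux; /proc/cpuinfo is absent elsewhere, so errors are silenced.
pick_arch "$(uname -m)" "$(cat /proc/cpuinfo 2>/dev/null)"
```

Passing the inputs as arguments keeps the decision logic separate from the host probing, which makes it easy to check against flag strings from other machines.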

    # Build specific version
    docker build --build-arg SF_VERSION=sf_17 -t stockfish:17 .

    # AMD64 only (most servers)
    docker buildx build --platform linux/amd64 -t stockfish:amd64 .

    # ARM64 only (Graviton, Apple Silicon)
    docker buildx build --platform linux/arm64 -t stockfish:arm64 .

Stockfish scales well with cores. Configure threads based on your container resources:
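
    # 4 threads, 256MB hash (-i keeps stdin attached so the here-string reaches Stockfish)
    # Stockfish stops a running search when it reads "quit", so for deep searches
    # drive it from a wrapper that waits for "bestmove" before quitting.
    docker run --rm -i stockfish-optimized stockfish <<< "
    setoption name Threads value 4
    setoption name Hash value 256
    position startpos
    go depth 25
    quit"

Rule of thumb: 1GB hash per 4-8 threads for optimal cache efficiency.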
| Threads | Recommended Hash |
|---|---|
| 1-2 | 128MB |
| 4 | 256MB |
| 8 | 512MB-1GB |
| 16+ | 2GB+ |
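The table above can be turned into a helper that emits the matching UCI options. This is a sketch under assumptions: the function names are hypothetical, and where the table gives a range or skips a thread count (3, 5-7, 9-15) the value chosen here is a guess at a reasonable starting point, not a recommendation from the Stockfish project.

```shell
# hash_mb_for_threads THREADS: print a starting Hash size in MB.
# Thresholds follow the table; in-between thread counts are an assumption.
hash_mb_for_threads() (
    threads=$1
    if [ "$threads" -le 2 ]; then
        echo 128
    elif [ "$threads" -le 4 ]; then
        echo 256
    elif [ "$threads" -le 8 ]; then
        echo 512
    elif [ "$threads" -lt 16 ]; then
        echo 1024
    else
        echo 2048
    fi
)

# uci_options THREADS: print the setoption commands for that thread count.
uci_options() (
    threads=$1
    echo "setoption name Threads value $threads"
    echo "setoption name Hash value $(hash_mb_for_threads "$threads")"
)

uci_options 8
```

The output can be prepended to the commands sent to the engine; adjust the thresholds after running `bench` on your own hardware.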

    docker run --rm stockfish-optimized stockfish bench

This runs a standardized test and reports nodes/second.
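A small helper can pull the nodes/second figure out of the bench output for logging or comparison. This sketch assumes the summary contains a line starting with `Nodes/second` and that the summary is written to stderr (hence `2>&1` in the usage comment); the function name `extract_nps` is hypothetical, and the sample numbers are placeholders rather than measurements.

```shell
# extract_nps: read bench output on stdin and print the Nodes/second value.
extract_nps() {
    awk -F':' '/^Nodes\/second/ { gsub(/[^0-9]/, "", $2); print $2 }'
}

# Placeholder text in the shape of a bench summary (numbers are made up).
sample='Total time (ms) : 2000
Nodes searched  : 3000000
Nodes/second    : 1500000'

printf '%s\n' "$sample" | extract_nps

# Against the image, assuming the summary is written to stderr:
#   docker run --rm stockfish-optimized stockfish bench 2>&1 | extract_nps
```

If your Stockfish version formats the summary differently, adjust the pattern accordingly.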
To measure the actual gain from optimized builds:
- Build a generic x86-64 image (set `ARCH=x86-64` in the Dockerfile)
- Build the optimized x86-64-bmi2 image
- Run `bench` on both and compare nodes/sec
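The comparison in the last step is a simple percentage. The sketch below shows the arithmetic; the function name `speedup_pct` is hypothetical and the input values are placeholders, not benchmark results.

```shell
# speedup_pct BASELINE_NPS OPTIMIZED_NPS
# Prints the relative change in percent, rounded to one decimal place.
# Exits non-zero if the baseline is not positive.
speedup_pct() {
    awk -v base="$1" -v opt="$2" 'BEGIN {
        if (base <= 0) exit 1
        printf "%.1f\n", (opt - base) * 100 / base
    }'
}

# Placeholder values, not measurements.
speedup_pct 1000000 1080000
```

Bench results vary from run to run, so averaging several runs per image before comparing is likely to give a steadier number.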
Based on community benchmarks:
- BMI2 vs generic x86-64: ~5-10% improvement (varies by CPU)
- Newer cloud VM families: Can be significant (see GCP benchmarks)
- Thread scaling: Near-linear up to physical core count

Intel:

| Generation | Year | BMI2 | Recommended Build |
|---|---|---|---|
| Sandy Bridge | 2011 | No | x86-64 |
| Ivy Bridge | 2012 | No | x86-64 |
| Haswell | 2013 | Yes | x86-64-bmi2 |
| Skylake+ | 2015+ | Yes | x86-64-bmi2 |

AMD:

| Generation | Year | BMI2 | Recommended Build |
|---|---|---|---|
| Piledriver | 2012 | No | x86-64 |
| Excavator | 2015 | Slow | x86-64-modern |
| Zen, Zen+, Zen 2 | 2017-2019 | Slow | x86-64-avx2 |
| Zen 3+ | 2020+ | Yes | x86-64-bmi2 |

Note: AMD Excavator and Zen through Zen 2 support BMI2 but implement PEXT/PDEP in microcode, which makes them slow. Use x86-64-modern on Excavator and x86-64-avx2 on Zen through Zen 2.

ARM:

| Processor | Recommended Build |
|---|---|
| Apple M1/M2/M3 | armv8 |
| AWS Graviton2/3 | armv8 |