A comprehensive Docker-based Bitcoin blockchain analysis platform combining Bitcoin Core, Neo4j graph database, GraphQL API, and Python analysis tools for private blockchain research and chain analysis.
- Bitcoin Core Full Node: Private node with minimal network participation
- Neo4j Graph Database: Transaction graph for network analysis
- GraphQL API: Unified query interface for Bitcoin + Neo4j data
- Electrs: Fast UTXO indexing and queries
- Jupyter Notebooks: Interactive analysis environment
- Python Analysis Tools: Pre-built scripts for address clustering and chain analysis
┌─────────────────┐      ┌──────────────┐      ┌─────────────┐
│  Bitcoin Core   │◄────►│ Neo4j Graph  │◄────►│   GraphQL   │
│   (Full Node)   │      │  (Tx Graph)  │      │ API Server  │
└────────┬────────┘      └──────────────┘      └──────┬──────┘
         │                                            │
         │               ┌──────────────┐             │
         └──────────────►│   Electrs    │             │
                         │  (Indexer)   │             │
                         └──────────────┘             │
                                                      │
       ┌──────────────────────────────────────────────┘
       │
       ▼
┌─────────────────────────────────────────────────┐
│       Jupyter Notebooks + Analysis Tools        │
│  • Address clustering   • UTXO tracking         │
│  • Transaction flow     • Network visualization │
└─────────────────────────────────────────────────┘
- Docker & Docker Compose (v2.0+)
- Storage: ~2TB for full setup (600GB Bitcoin + 600GB Electrs + 600GB Neo4j + 200GB overhead)
- RAM: 16GB minimum, 32GB recommended
- CPU: 4+ cores recommended
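Before starting a multi-day sync it may be worth confirming that the target volume can hold the data. The following is a minimal sketch that sums the storage figures listed above and compares them with the free space reported by the operating system. The function names and the choice of `.` as the path are illustrative assumptions, not part of this stack.

```python
import shutil

# Per-component estimates from the storage requirement above, in decimal GB.
REQUIRED_GB = {"bitcoin": 600, "electrs": 600, "neo4j": 600, "overhead": 200}


def total_required_gb(parts: dict = REQUIRED_GB) -> int:
    """Sum the per-component storage estimates."""
    return sum(parts.values())


def has_enough_space(free_bytes: int, required_gb: int) -> bool:
    """Return True if free_bytes covers required_gb (1 GB = 10**9 bytes)."""
    return free_bytes >= required_gb * 10**9


if __name__ == "__main__":
    # "." is an assumption: point this at the volume Docker stores its data on.
    free = shutil.disk_usage(".").free
    need = total_required_gb()
    verdict = "ok" if has_enough_space(free, need) else "insufficient"
    print(f"need {need} GB, have {free / 10**9:.0f} GB free: {verdict}")
```

Docker named volumes usually live under Docker's data root rather than the repository directory, so the path passed to `shutil.disk_usage` should be adjusted accordingly.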
# Clone the repository
git clone <your-repo-url>
cd bitcoin-analysis-stack
# Copy environment template
cp .env.example .env
# Edit configuration (change passwords!)
nano .env

# Start all services
docker-compose up -d
# Check status
docker-compose ps
# View logs
docker-compose logs -f

Bitcoin Core will take several days to sync the entire blockchain. Monitor progress:
# Check Bitcoin sync status
docker-compose exec bitcoin bitcoin-cli getblockchaininfo
# Check Neo4j importer progress
docker-compose logs -f btc-importer

Once synced, access:
- Jupyter Notebooks: http://localhost:8888
- Neo4j Browser: http://localhost:7474 (login: neo4j/bitcoin123)
- GraphQL Playground: http://localhost:8000/graphql
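To decide whether the node is "synced" without reading the raw JSON, the `getblockchaininfo` result can be condensed into one line. This is a minimal sketch: `sync_summary` is a hypothetical helper, and the sample dictionary contains made-up values. The field names (`blocks`, `headers`, `verificationprogress`, `initialblockdownload`) are the ones Bitcoin Core returns from that RPC.

```python
def sync_summary(info: dict) -> str:
    """Condense a getblockchaininfo result into a one-line status."""
    blocks = info["blocks"]                  # height of the validated chain tip
    headers = info["headers"]                # height of the best known header
    progress = info["verificationprogress"]  # estimate between 0 and 1
    behind = max(headers - blocks, 0)
    # Fall back to comparing heights if the flag is missing.
    syncing = info.get("initialblockdownload", behind > 0)
    state = "syncing" if syncing else "synced"
    return f"{state}: block {blocks}/{headers} ({progress:.1%} verified, {behind} behind)"


if __name__ == "__main__":
    # Made-up sample; in practice pass the dict returned by the RPC call.
    sample = {
        "blocks": 400000,
        "headers": 800000,
        "verificationprogress": 0.25,
        "initialblockdownload": True,
    }
    print(sync_summary(sample))
```

With an RPC client such as the `AuthServiceProxy` used elsewhere in this README, the call would presumably be `sync_summary(btc.getblockchaininfo())`.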
from bitcoinrpc.authproxy import AuthServiceProxy
btc = AuthServiceProxy("http://btcuser:btcpass@localhost:8332")
# Get blockchain info
info = btc.getblockchaininfo()
print(f"Blocks: {info['blocks']}")
# Get specific transaction
tx = btc.getrawtransaction("txid_here", True)

// Find most active addresses
MATCH (a:Address)<-[r:OUTPUTS_TO]-(t:Transaction)
RETURN a.address, count(t) as tx_count, sum(r.value) as total_received
ORDER BY tx_count DESC
LIMIT 10;
// Find transaction path between addresses
MATCH path = shortestPath(
(a1:Address {address: 'addr1'})-[:OUTPUTS_TO|SPENT_IN*..10]-(a2:Address {address: 'addr2'})
)
RETURN path;
// Cluster addresses by common spending
MATCH (a1:Address)<-[:OUTPUTS_TO]-(:Transaction)-[:SPENT_IN]->
(:Transaction)-[:OUTPUTS_TO]->(a2:Address)
WHERE a1 <> a2
RETURN a1.address, collect(DISTINCT a2.address) as cluster
LIMIT 10;

query {
blockchainInfo {
blocks
chain
difficulty
}
block(height: 800000) {
hash
time
txCount
}
addressInfo(address: "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa") {
balance
txCount
firstSeen
}
addressConnections(address: "...", limit: 10) {
fromAddress
toAddress
totalAmount
txCount
}
}

# Analyze specific address
docker-compose exec jupyter python /home/jovyan/scripts/analyze_address.py 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa
# Or from host (if you have dependencies installed)
python scripts/analyze_address.py <address>

bitcoin-analysis-stack/
├── docker-compose.yml # Main orchestration file
├── .env.example # Environment template
├── config/
│ └── bitcoin.conf # Bitcoin Core configuration
├── services/
│ ├── importer/ # Bitcoin → Neo4j importer
│ │ ├── Dockerfile
│ │ ├── importer.py
│ │ └── requirements.txt
│ ├── graphql/ # GraphQL API server
│ │ ├── Dockerfile
│ │ ├── server.py
│ │ └── requirements.txt
│ └── blocksci/ # BlockSci analysis (optional)
│ └── Dockerfile
├── scripts/
│ └── analyze_address.py # Address analysis tool
├── notebooks/
│ └── 01_getting_started.ipynb # Tutorial notebook
└── README.md
# Bitcoin RPC
BITCOIN_RPC_USER=btcuser
BITCOIN_RPC_PASSWORD=btcpass
# Neo4j
NEO4J_USER=neo4j
NEO4J_PASSWORD=bitcoin123
NEO4J_HEAP_SIZE=4G
# Importer
IMPORT_START_BLOCK=0
IMPORT_BATCH_SIZE=100
IMPORT_MODE=continuous

Key settings:
- listen=0 - Don't accept incoming connections
- maxconnections=8 - Minimal network participation
- txindex=1 - Required for full transaction lookup
- prune=0 - Keep the full blockchain (pruning, e.g. prune=550, cannot be combined with txindex=1)
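These settings interact: Bitcoin Core refuses to start when pruning is enabled together with `txindex=1`. The sketch below parses a `bitcoin.conf`-style text and reports that conflict, along with a reminder if `listen` is not disabled. `parse_conf` and `check_conf` are hypothetical helpers written for illustration. They ignore section headers and keep only the last value of a repeated key.

```python
def parse_conf(text: str) -> dict:
    """Parse key=value lines, dropping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip trailing comments
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


def check_conf(settings: dict) -> list:
    """Return human-readable problems found in the parsed settings."""
    problems = []
    pruned = settings.get("prune", "0") != "0"
    if pruned and settings.get("txindex", "0") == "1":
        problems.append("prune and txindex=1 cannot be combined")
    # Bitcoin Core listens for inbound peers unless listen=0 is set.
    if settings.get("listen", "1") != "0":
        problems.append("listen is not 0: node accepts incoming connections")
    return problems


if __name__ == "__main__":
    sample = "listen=0\nmaxconnections=8\ntxindex=1\nprune=550\n"
    for problem in check_conf(parse_conf(sample)):
        print("warning:", problem)
```

To check the real file, the text could be read with `open("config/bitcoin.conf").read()`.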
# Start all services
docker-compose up -d
# Start specific service
docker-compose up -d bitcoin neo4j
# Stop all services
docker-compose down
# Restart service
docker-compose restart btc-importer
# View logs
docker-compose logs -f bitcoin
docker-compose logs -f neo4j
docker-compose logs -f btc-importer

# Backup Bitcoin data
docker-compose stop bitcoin
docker run --rm -v bitcoin-analysis-stack_bitcoin_data:/data -v $(pwd)/backups:/backup alpine tar czf /backup/bitcoin-backup.tar.gz /data
# Backup Neo4j data
docker-compose exec neo4j neo4j-admin dump --database=neo4j --to=/data/neo4j-backup.dump
# Clean up everything (⚠️ DELETES ALL DATA)
docker-compose down -v

# Bitcoin Core CLI
docker-compose exec bitcoin bitcoin-cli getblockcount
docker-compose exec bitcoin bitcoin-cli getpeerinfo
# Neo4j Cypher Shell
docker-compose exec neo4j cypher-shell -u neo4j -p bitcoin123
# GraphQL health check
curl http://localhost:8000/health

Edit config/bitcoin.conf:
dbcache=4096 # Increase for faster sync (MB)
par=8 # Parallel script verification threads
maxmempool=1000 # Max mempool size (MB)

Edit .env:
NEO4J_HEAP_SIZE=8G # Increase for better performance
NEO4J_PAGECACHE=4G # Cache for graph data

Edit .env:
IMPORT_BATCH_SIZE=500 # Process more blocks at once
IMPORT_START_BLOCK=800000 # Skip old blocks

Identify addresses controlled by the same entity using the common-input-ownership heuristic:
from py2neo import Graph
graph = Graph("bolt://localhost:7687", auth=("neo4j", "bitcoin123"))
# Find co-spent addresses (likely same wallet)
query = """
MATCH (a1:Address)<-[:OUTPUTS_TO]-(:Transaction)-[:SPENT_IN]->
(spend:Transaction)<-[:SPENT_IN]-(:Transaction)-[:OUTPUTS_TO]->(a2:Address)
WHERE a1 <> a2
RETURN a1.address, collect(DISTINCT a2.address) as cluster
"""
result = graph.run(query).data()

Track BTC flow through the network:
# Find all transactions between two addresses
query = """
MATCH path = (a1:Address {address: $from})-[:OUTPUTS_TO|SPENT_IN*..10]->(a2:Address {address: $to})
RETURN path
LIMIT 10
"""

Query unspent outputs via Electrs or Bitcoin Core RPC.
Use NetworkX or Pyvis to visualize transaction graphs (see notebooks).
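Before visualizing anything, it can help to prototype the common-input-ownership clustering on a small in-memory sample. The sketch below merges addresses that appear together as inputs of a transaction, using a union-find structure. The `tx_inputs` mapping (transaction ID to list of input addresses) and the function name are assumptions made for illustration and do not reflect the importer's schema.

```python
from collections import defaultdict


def cluster_by_common_inputs(tx_inputs: dict) -> list:
    """Group addresses that are spent together in at least one transaction."""
    parent = {}

    def find(addr: str) -> str:
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for addrs in tx_inputs.values():
        for addr in addrs:
            find(addr)  # register every address, including lone inputs
        for other in addrs[1:]:
            union(addrs[0], other)  # all inputs of one tx share an owner

    clusters = defaultdict(set)
    for addr in list(parent):
        clusters[find(addr)].add(addr)
    # Sort for a stable, readable result.
    return sorted(clusters.values(), key=lambda c: sorted(c))


if __name__ == "__main__":
    # Hypothetical sample: tx1 and tx2 share address B, so A, B, C merge.
    sample = {"tx1": ["A", "B"], "tx2": ["B", "C"], "tx3": ["D"]}
    for cluster in cluster_by_common_inputs(sample):
        print(sorted(cluster))
```

The heuristic is known to produce false merges for CoinJoin-style transactions, so the resulting clusters are best treated as hypotheses. The cluster sets can then be passed to a graph library for drawing.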
- Initial sync time: 3-7 days for full Bitcoin blockchain
- Storage: ~2TB required for complete setup (Bitcoin + Electrs + Neo4j)
- Neo4j size: Graph database is similar in size to blockchain (~600GB) due to relationship storage
- BlockSci: Requires manual compilation (Dockerfile is placeholder)
- Privacy: Even with minimal network participation, your node still connects to peers
- Change default passwords in .env
- Don't expose RPC/GraphQL ports to the public internet
- Use firewalls to restrict access
- This is for research only, not production use
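The first item in the list above is easy to forget after copying `.env.example`. As a rough safeguard, the sketch below scans `.env`-style text and reports any password still set to the default shown in this README. The helper name is hypothetical, and the parser is deliberately simple: it does not handle quoting or inline comments.

```python
# Default values as they appear in the environment template in this README.
DEFAULTS = {"BITCOIN_RPC_PASSWORD": "btcpass", "NEO4J_PASSWORD": "bitcoin123"}


def find_default_secrets(env_text: str) -> list:
    """Return the names of variables that still hold a known default."""
    flagged = []
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, value = line.split("=", 1)
        if DEFAULTS.get(key.strip()) == value.strip():
            flagged.append(key.strip())
    return flagged


if __name__ == "__main__":
    sample = "BITCOIN_RPC_USER=btcuser\nBITCOIN_RPC_PASSWORD=btcpass\nNEO4J_PASSWORD=s3cret\n"
    for name in find_default_secrets(sample):
        print(f"{name} is still set to its default value")
```

Running it against the real file would amount to `find_default_secrets(open(".env").read())`.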
# Check logs
docker-compose logs bitcoin
# Verify connectivity
docker-compose exec bitcoin bitcoin-cli getpeerinfo
# Increase connections
# Edit config/bitcoin.conf: maxconnections=16

# Increase heap size in .env
NEO4J_HEAP_SIZE=8G
# Restart
docker-compose restart neo4j

# Check if Bitcoin is synced
docker-compose exec bitcoin bitcoin-cli getblockchaininfo
# Check Neo4j connection
docker-compose logs btc-importer
# Restart importer
docker-compose restart btc-importer

# Check service status
curl http://localhost:8000/health
# Check logs
docker-compose logs graphql
# Restart
docker-compose restart graphql

Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Submit a pull request
MIT License - See LICENSE file for details
| Service | Port | Purpose |
|---|---|---|
| Bitcoin Core RPC | 8332 | Blockchain queries |
| Neo4j Browser | 7474 | Graph UI |
| Neo4j Bolt | 7687 | Graph queries |
| GraphQL API | 8000 | Unified API |
| Jupyter | 8888 | Analysis notebooks |
| Electrs | 50001 | UTXO indexer |
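The Electrs port in the table speaks the Electrum protocol, which indexes outputs by "script hash" rather than by address. The sketch below shows how a UTXO lookup request for a legacy `1...` address might be built: decode the address, rebuild its P2PKH script, hash it with SHA-256, and reverse the bytes. The helper names are hypothetical, only mainnet P2PKH addresses are handled, and it is assumed that Electrs accepts plain-TCP, newline-delimited JSON-RPC on port 50001.

```python
import hashlib
import json

B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def p2pkh_hash160(address: str) -> bytes:
    """Base58Check-decode a legacy mainnet address to its 20-byte hash."""
    n = 0
    for ch in address:
        n = n * 58 + B58.index(ch)
    raw = n.to_bytes(25, "big")  # version byte + 20-byte hash + 4-byte checksum
    payload, checksum = raw[:21], raw[21:]
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    if digest[:4] != checksum:
        raise ValueError("bad Base58Check checksum")
    if payload[0] != 0x00:
        raise ValueError("not a mainnet P2PKH address")
    return payload[1:]


def electrum_scripthash(address: str) -> str:
    """Return the Electrum script hash for a P2PKH address."""
    # OP_DUP OP_HASH160 <20 bytes> OP_EQUALVERIFY OP_CHECKSIG
    script = bytes.fromhex("76a914") + p2pkh_hash160(address) + bytes.fromhex("88ac")
    # Electrum uses sha256(script) in reversed byte order, hex-encoded.
    return hashlib.sha256(script).digest()[::-1].hex()


def listunspent_request(address: str, req_id: int = 0) -> bytes:
    """Build a newline-terminated JSON-RPC request for the address's UTXOs."""
    message = {
        "jsonrpc": "2.0",
        "id": req_id,
        "method": "blockchain.scripthash.listunspent",
        "params": [electrum_scripthash(address)],
    }
    return (json.dumps(message) + "\n").encode()


if __name__ == "__main__":
    print(listunspent_request("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa").decode(), end="")
```

Sending the bytes over a TCP socket to `localhost:50001` and reading one line back should yield the JSON response. Some Electrum servers expect a `server.version` request first, so this may need adjusting for the deployed Electrs version.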
- Start with notebooks/01_getting_started.ipynb
- Explore Neo4j Browser with sample queries
- Try GraphQL Playground queries
- Run analyze_address.py on known addresses
- Build custom analysis scripts
Note: This stack is designed for research and educational purposes. Use responsibly and respect privacy considerations when analyzing blockchain data.