|  | 1 | +# DB-ESDK Performance Benchmark - Python | 
|  | 2 | + | 
|  | 3 | +This directory contains the Python implementation of the AWS Database Encryption SDK (DB-ESDK) performance benchmark suite. | 
|  | 4 | + | 
|  | 5 | +## Overview | 
|  | 6 | + | 
|  | 7 | +The Python benchmark provides comprehensive performance testing for the DB-ESDK Python runtime, measuring: | 
|  | 8 | + | 
|  | 9 | +- **Throughput**: Operations per second and bytes per second using ItemEncryptor operations | 
|  | 10 | +- **Latency**: Encrypt, decrypt, and end-to-end timing for encrypted operations | 
|  | 11 | +- **Memory Usage**: Peak memory consumption and efficiency | 
|  | 12 | +- **Concurrency**: Multi-threaded performance scaling | 
|  | 13 | +- **Statistical Analysis**: P50, P95, P99 latency percentiles | 
|  | 14 | + | 
|  | 15 | +## Prerequisites | 
|  | 16 | + | 
|  | 17 | +- Python 3.11 or higher | 
|  | 18 | +- Poetry package manager | 
|  | 19 | + | 
|  | 20 | +## Setup | 
|  | 21 | + | 
|  | 22 | +### Install Poetry | 
|  | 23 | + | 
|  | 24 | +```bash | 
|  | 25 | +# Install Poetry (if not already installed) | 
|  | 26 | +curl -sSL https://install.python-poetry.org | python3 - | 
|  | 27 | + | 
|  | 28 | +# Or using pip | 
|  | 29 | +pip install poetry | 
|  | 30 | +``` | 
|  | 31 | + | 
|  | 32 | +### Install Dependencies | 
|  | 33 | + | 
|  | 34 | +```bash | 
|  | 35 | +# Install all dependencies including dev dependencies | 
|  | 36 | +poetry install | 
|  | 37 | + | 
|  | 38 | +# Install only production dependencies (Poetry 1.2+; older releases used --no-dev) | 
|  | 39 | +poetry install --only main | 
|  | 40 | +``` | 
|  | 41 | + | 
|  | 42 | +## Building | 
|  | 43 | + | 
|  | 44 | +```bash | 
|  | 45 | +# Build distribution packages | 
|  | 46 | +poetry build | 
|  | 47 | + | 
|  | 48 | +# Install in development mode (automatic with poetry install) | 
|  | 49 | +poetry install | 
|  | 50 | + | 
|  | 51 | +# Run tests using tox | 
|  | 52 | +tox -e py311 | 
|  | 53 | + | 
|  | 54 | +# Run all tox environments | 
|  | 55 | +tox | 
|  | 56 | +``` | 
|  | 57 | + | 
|  | 58 | +## Running Benchmarks | 
|  | 59 | + | 
|  | 60 | +### Quick Test | 
|  | 61 | + | 
|  | 62 | +```bash | 
|  | 63 | +# Using Poetry | 
|  | 64 | +poetry run esdk-benchmark --quick | 
|  | 65 | + | 
|  | 66 | +# Using tox (recommended for isolated environment) | 
|  | 67 | +tox -e benchmark | 
|  | 68 | + | 
|  | 69 | +# Using module execution | 
|  | 70 | +poetry run python -m esdk_benchmark --quick | 
|  | 71 | + | 
|  | 72 | +# Direct script execution | 
|  | 73 | +poetry run python src/esdk_benchmark/program.py --quick | 
|  | 74 | +``` | 
|  | 75 | + | 
|  | 76 | +### Full Benchmark Suite | 
|  | 77 | + | 
|  | 78 | +```bash | 
|  | 79 | +# Using Poetry | 
|  | 80 | +poetry run esdk-benchmark | 
|  | 81 | + | 
|  | 82 | +# Using tox (recommended for isolated environment) | 
|  | 83 | +tox -e benchmark-full | 
|  | 84 | + | 
|  | 85 | +# Using module execution | 
|  | 86 | +poetry run python -m esdk_benchmark | 
|  | 87 | + | 
|  | 88 | +# Direct script execution | 
|  | 89 | +poetry run python src/esdk_benchmark/program.py | 
|  | 90 | +``` | 
|  | 91 | + | 
|  | 92 | +### Custom Configuration | 
|  | 93 | + | 
|  | 94 | +```bash | 
|  | 95 | +# Specify custom config and output paths | 
|  | 96 | +poetry run esdk-benchmark \ | 
|  | 97 | +  --config /path/to/config.yaml \ | 
|  | 98 | +  --output /path/to/results.json | 
|  | 99 | +``` | 
|  | 100 | + | 
|  | 101 | +## Command Line Options | 
|  | 102 | + | 
|  | 103 | +- `--config, -c`: Path to test configuration file (default: `../../../config/test-scenarios.yaml`) | 
|  | 104 | +- `--output, -o`: Path to output results file (default: `../../../results/raw-data/python_results.json`) | 
|  | 105 | +- `--quick, -q`: Run quick test with reduced iterations | 
|  | 106 | +- `--help, -h`: Show help message | 
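As a rough illustration of how the options above fit together, here is a minimal `argparse` sketch. It is not the actual contents of `program.py`; the function name `build_parser` is hypothetical, and only the flags and defaults listed above are taken from this README. `--help, -h` is provided by `argparse` automatically.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the options documented in this README."""
    parser = argparse.ArgumentParser(prog="esdk-benchmark")
    parser.add_argument(
        "--config", "-c",
        default="../../../config/test-scenarios.yaml",
        help="Path to test configuration file",
    )
    parser.add_argument(
        "--output", "-o",
        default="../../../results/raw-data/python_results.json",
        help="Path to output results file",
    )
    parser.add_argument(
        "--quick", "-q",
        action="store_true",
        help="Run quick test with reduced iterations",
    )
    return parser


# Parse an example command line instead of sys.argv.
args = build_parser().parse_args(["--quick", "-o", "results.json"])
print(args.config, args.output, args.quick)
```

Because both defaults are relative paths, they resolve against the directory the command is run from.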
|  | 107 | + | 
|  | 108 | +## Configuration | 
|  | 109 | + | 
|  | 110 | +The benchmark uses a YAML configuration file to define test parameters: | 
|  | 111 | + | 
|  | 112 | +```yaml | 
|  | 113 | +data_sizes: | 
|  | 114 | +  small: [1024, 5120, 10240] | 
|  | 115 | +  medium: [102400, 512000, 1048576] | 
|  | 116 | +  large: [10485760, 52428800, 104857600] | 
|  | 117 | + | 
|  | 118 | +iterations: | 
|  | 119 | +  warmup: 5 | 
|  | 120 | +  measurement: 10 | 
|  | 121 | + | 
|  | 122 | +concurrency_levels: [1, 2, 4, 8] | 
|  | 123 | +``` | 
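To make the shape of this configuration concrete, the sketch below writes it as the Python dict that `yaml.safe_load` would return and lists one scenario per data size and concurrency level. Crossing every size with every level is an assumption made for illustration, and `enumerate_scenarios` is a hypothetical helper; the real suite may combine these parameters differently.

```python
from itertools import product

# The sample configuration above, as it would look after yaml.safe_load().
config = {
    "data_sizes": {
        "small": [1024, 5120, 10240],
        "medium": [102400, 512000, 1048576],
        "large": [10485760, 52428800, 104857600],
    },
    "iterations": {"warmup": 5, "measurement": 10},
    "concurrency_levels": [1, 2, 4, 8],
}


def enumerate_scenarios(cfg):
    """Yield (category, size_bytes, concurrency) for every combination."""
    sized = [
        (category, size)
        for category, sizes in cfg["data_sizes"].items()
        for size in sizes
    ]
    for (category, size), level in product(sized, cfg["concurrency_levels"]):
        yield category, size, level


scenarios = list(enumerate_scenarios(config))
print(len(scenarios))  # 9 sizes x 4 concurrency levels = 36
```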
|  | 124 | + | 
|  | 125 | +## Output Format | 
|  | 126 | + | 
|  | 127 | +Results are saved in JSON format with the following structure: | 
|  | 128 | + | 
|  | 129 | +```json | 
|  | 130 | +{ | 
|  | 131 | +  "metadata": { | 
|  | 132 | +    "language": "python", | 
|  | 133 | +    "timestamp": "2025-09-05T15:30:00Z", | 
|  | 134 | +    "python_version": "3.11.5", | 
|  | 135 | +    "platform": "Darwin-23.1.0-arm64-arm-64bit", | 
|  | 136 | +    "cpu_count": 8, | 
|  | 137 | +    "total_memory_gb": 16.0, | 
|  | 138 | +    "total_tests": 45 | 
|  | 139 | +  }, | 
|  | 140 | +  "results": [ | 
|  | 141 | +    { | 
|  | 142 | +      "test_name": "throughput", | 
|  | 143 | +      "language": "python", | 
|  | 144 | +      "data_size": 1024, | 
|  | 145 | +      "concurrency": 1, | 
|  | 146 | +      "put_latency_ms": 0.85, | 
|  | 147 | +      "get_latency_ms": 0.72, | 
|  | 148 | +      "end_to_end_latency_ms": 1.57, | 
|  | 149 | +      "ops_per_second": 636.94, | 
|  | 150 | +      "bytes_per_second": 652224.0, | 
|  | 151 | +      "peak_memory_mb": 0.0, | 
|  | 152 | +      "memory_efficiency_ratio": 0.0, | 
|  | 153 | +      "p50_latency": 1.55, | 
|  | 154 | +      "p95_latency": 1.89, | 
|  | 155 | +      "p99_latency": 2.12, | 
|  | 156 | +      "timestamp": "2025-09-05T15:30:15Z", | 
|  | 157 | +      "python_version": "3.11.5", | 
|  | 158 | +      "cpu_count": 8, | 
|  | 159 | +      "total_memory_gb": 16.0 | 
|  | 160 | +    } | 
|  | 161 | +  ] | 
|  | 162 | +} | 
|  | 163 | +``` | 
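The sample numbers above are consistent with end-to-end latency being the sum of put and get latency, and with `ops_per_second` being `1000 / end_to_end_latency_ms`. The following sketch checks that relationship on a trimmed copy of the sample result. Whether the benchmark derives its fields this way is an assumption, and `check_result` is a hypothetical helper.

```python
import json

# A trimmed copy of the sample result shown above.
raw = """
{
  "results": [
    {
      "test_name": "throughput",
      "data_size": 1024,
      "put_latency_ms": 0.85,
      "get_latency_ms": 0.72,
      "end_to_end_latency_ms": 1.57,
      "ops_per_second": 636.94
    }
  ]
}
"""


def check_result(result):
    """Return (put + get latency in ms, ops/sec derived from end-to-end latency)."""
    total_ms = result["put_latency_ms"] + result["get_latency_ms"]
    derived_ops = 1000.0 / result["end_to_end_latency_ms"]
    return round(total_ms, 2), round(derived_ops, 2)


for result in json.loads(raw)["results"]:
    total_ms, derived_ops = check_result(result)
    print(result["test_name"], total_ms, derived_ops)
```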
|  | 164 | + | 
|  | 165 | +## Key Features | 
|  | 166 | + | 
|  | 167 | +### DB-ESDK Integration | 
|  | 168 | +- Uses AWS Database Encryption SDK for DynamoDB with transparent encryption | 
|  | 169 | +- Configures attribute actions (`ENCRYPT_AND_SIGN`, `SIGN_ONLY`, `DO_NOTHING`) | 
|  | 170 | +- Tests ItemEncryptor operations with client-side encryption | 
|  | 171 | +- Uses Raw AES keyring for consistent performance testing | 
|  | 172 | + | 
|  | 173 | +### ItemEncryptor Operations | 
|  | 174 | +- Performs `encrypt_python_item` operations using Python dict format | 
|  | 175 | +- Measures `decrypt_python_item` operations for consistency | 
|  | 176 | +- Tests realistic workloads with encryption overhead | 
|  | 177 | +- Supports multiple data formats (Python dict, DynamoDB JSON, DBESDK shapes) | 
|  | 178 | + | 
|  | 179 | +### Performance Metrics | 
|  | 180 | +- **Throughput Tests**: Measures ops/sec and bytes/sec for ItemEncryptor operations | 
|  | 181 | +- **Memory Tests**: Tracks peak memory usage during encrypted operations using psutil | 
|  | 182 | +- **Concurrency Tests**: Evaluates multi-threaded performance scaling with ThreadPoolExecutor | 
|  | 183 | +- **Latency Analysis**: P50, P95, P99 percentiles for operation timing | 
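The latency metrics listed above can be illustrated with a small standard-library sketch: run a few unrecorded warm-up calls, time the measured calls, then report percentiles. The `measure` and `percentile` helpers and `stand_in_operation` are hypothetical; the stand-in replaces a real encrypt/decrypt round trip. The percentile here uses the nearest-rank method, which is swapped in for simplicity and may give slightly different values than an interpolating implementation such as NumPy's.

```python
import math
import time


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def measure(operation, warmup=5, measurement=10):
    """Time `operation` and return per-call latencies in milliseconds."""
    for _ in range(warmup):
        operation()  # warm-up calls are not recorded
    latencies_ms = []
    for _ in range(measurement):
        start = time.perf_counter()
        operation()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def stand_in_operation():
    # Hypothetical placeholder for an encrypt/decrypt round trip.
    sum(range(10_000))


latencies = measure(stand_in_operation)
summary = {f"p{p}_latency": percentile(latencies, p) for p in (50, 95, 99)}
print(summary)
```

With only ten measured samples, P95 and P99 both fall on the slowest call, so more measurement iterations are needed for the tail percentiles to be informative.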
|  | 184 | + | 
|  | 185 | +## Project Structure | 
|  | 186 | + | 
|  | 187 | +``` | 
|  | 188 | +python/ | 
|  | 189 | +├── README.md                          # This file | 
|  | 190 | +├── pyproject.toml                     # Poetry configuration and dependencies | 
|  | 191 | +├── tox.ini                           # Tox configuration for testing | 
|  | 192 | +├── src/ | 
|  | 193 | +│   └── esdk_benchmark/ | 
|  | 194 | +│       ├── __init__.py               # Package initialization | 
|  | 195 | +│       ├── __main__.py               # Module execution entry point | 
|  | 196 | +│       ├── program.py                # Main program and CLI | 
|  | 197 | +│       ├── benchmark.py              # Core benchmark implementation | 
|  | 198 | +│       ├── models.py                 # Data models and configuration | 
|  | 199 | +│       └── tests.py                  # Individual test implementations | 
|  | 200 | +├── tests/                            # Test suite | 
|  | 201 | +│   ├── __init__.py | 
|  | 202 | +│   └── test_benchmark.py | 
|  | 203 | +└── run_benchmark.py                  # Convenience runner script | 
|  | 204 | +``` | 
|  | 205 | + | 
|  | 206 | +## Dependencies | 
|  | 207 | + | 
|  | 208 | +Key dependencies used in this benchmark: | 
|  | 209 | + | 
|  | 210 | +- **aws-dbesdk-dynamodb**: Core encryption functionality for DynamoDB (with legacy-ddbec extras) | 
|  | 211 | +- **boto3**: AWS SDK for Python (DynamoDB client operations) | 
|  | 212 | +- **PyYAML**: YAML configuration file processing | 
|  | 213 | +- **pydantic**: Data validation and settings management | 
|  | 214 | +- **tqdm**: Progress bars for visual feedback | 
|  | 215 | +- **psutil**: System and process utilities for memory monitoring | 
|  | 216 | +- **numpy**: Numerical operations and statistics | 
|  | 217 | + | 
|  | 218 | +### Development Dependencies | 
|  | 219 | +- **pytest**: Testing framework | 
|  | 220 | +- **pytest-cov**: Coverage reporting | 
|  | 221 | +- **black**: Code formatting | 
|  | 222 | +- **flake8**: Linting | 
|  | 223 | +- **mypy**: Type checking | 
|  | 224 | +- **tox**: Testing automation | 
|  | 225 | +- **memory-profiler**: Memory profiling utilities | 
|  | 226 | + | 
|  | 227 | +## Development | 
|  | 228 | + | 
|  | 229 | +### Code Style | 
|  | 230 | + | 
|  | 231 | +The project follows Python best practices with automated tooling: | 
|  | 232 | + | 
|  | 233 | +```bash | 
|  | 234 | +# Format code | 
|  | 235 | +tox -e format | 
|  | 236 | + | 
|  | 237 | +# Check formatting | 
|  | 238 | +tox -e format-check | 
|  | 239 | + | 
|  | 240 | +# Lint code | 
|  | 241 | +tox -e lint | 
|  | 242 | + | 
|  | 243 | +# Type checking | 
|  | 244 | +tox -e type | 
|  | 245 | + | 
|  | 246 | +# Run all quality checks | 
|  | 247 | +tox -e lint,type,format-check | 
|  | 248 | +``` | 
|  | 249 | + | 
|  | 250 | +### Running Tests | 
|  | 251 | + | 
|  | 252 | +```bash | 
|  | 253 | +# Run all tests | 
|  | 254 | +tox -e py311 | 
|  | 255 | + | 
|  | 256 | +# Run tests with Poetry | 
|  | 257 | +poetry run pytest | 
|  | 258 | + | 
|  | 259 | +# Run with coverage | 
|  | 260 | +poetry run pytest --cov=esdk_benchmark | 
|  | 261 | + | 
|  | 262 | +# Run specific test file | 
|  | 263 | +poetry run pytest tests/test_benchmark.py | 
|  | 264 | + | 
|  | 265 | +# Run all tox environments | 
|  | 266 | +tox | 
|  | 267 | +``` | 
|  | 268 | + | 
|  | 269 | +### Memory Profiling | 
|  | 270 | + | 
|  | 271 | +For detailed memory analysis: | 
|  | 272 | + | 
|  | 273 | +```bash | 
|  | 274 | +# Memory profiler is included in dev dependencies | 
|  | 275 | +poetry run python -m memory_profiler src/esdk_benchmark/benchmark.py | 
|  | 276 | + | 
|  | 277 | +# Or using tox | 
|  | 278 | +tox -e benchmark  # Includes memory profiler | 
|  | 279 | +``` | 
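For a quick check without extra tooling, peak allocation around a single call can also be sampled with the standard library's `tracemalloc`. This is an alternative technique, not what the benchmark itself does (it uses `psutil`), and it only sees memory allocated through Python's allocators. `peak_memory_mb` and `stand_in_operation` are hypothetical names used for illustration.

```python
import tracemalloc


def peak_memory_mb(operation):
    """Return the peak traced allocation, in MiB, while `operation` runs."""
    tracemalloc.start()
    try:
        operation()
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak_bytes / (1024 * 1024)


def stand_in_operation():
    # Hypothetical placeholder: allocate roughly 1 MiB, then release it.
    payload = bytearray(1024 * 1024)
    return len(payload)


print(f"peak: {peak_memory_mb(stand_in_operation):.2f} MiB")
```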
|  | 280 | + | 
|  | 281 | +### Tox Environments | 
|  | 282 | + | 
|  | 283 | +Available tox environments: | 
|  | 284 | + | 
|  | 285 | +- `py311`: Run tests under Python 3.11 | 
|  | 286 | +- `lint`: Run linting checks | 
|  | 287 | +- `type`: Run type checking | 
|  | 288 | +- `format`: Apply code formatting | 
|  | 289 | +- `format-check`: Check code formatting | 
|  | 290 | +- `benchmark`: Run quick benchmark | 
|  | 291 | +- `benchmark-full`: Run full benchmark suite | 
|  | 292 | +- `verify`: Verify setup and dependencies | 
|  | 293 | +- `clean`: Clean up build artifacts | 
|  | 294 | + | 
|  | 295 | +## Troubleshooting | 
|  | 296 | + | 
|  | 297 | +### Common Issues | 
|  | 298 | + | 
|  | 299 | +1. **Import Errors**: Ensure the Poetry environment is properly set up | 
|  | 300 | +   ```bash | 
|  | 301 | +   poetry install | 
|  | 302 | +   poetry run python -c "import esdk_benchmark; print('✓ OK')" | 
|  | 303 | +   ``` | 
|  | 304 | + | 
|  | 305 | +2. **Configuration Not Found**: Check that the config file path is correct relative to the directory you run the benchmark from | 
|  | 306 | +   ```bash | 
|  | 307 | +   ls ../../config/test-scenarios.yaml | 
|  | 308 | +   ``` | 
|  | 309 | + | 
|  | 310 | +3. **Memory Issues**: For large data sizes, ensure sufficient system memory is available | 
|  | 311 | + | 
|  | 312 | +4. **Permission Errors**: Ensure the output directory exists and is writable | 
|  | 313 | +   ```bash | 
|  | 314 | +   mkdir -p ../../results/raw-data/ | 
|  | 315 | +   ``` | 
|  | 316 | + | 
|  | 317 | +5. **Poetry Issues**: If the Poetry environment is corrupted, remove it and reinstall | 
|  | 318 | +   ```bash | 
|  | 319 | +   poetry env remove python | 
|  | 320 | +   poetry install | 
|  | 321 | +   ``` | 
|  | 322 | + | 
|  | 323 | +### Debug Mode | 
|  | 324 | + | 
|  | 325 | +Enable verbose logging for troubleshooting: | 
|  | 326 | + | 
|  | 327 | +```python | 
|  | 328 | +import logging | 
|  | 329 | +logging.basicConfig(level=logging.DEBUG) | 
|  | 330 | +``` | 
|  | 331 | + | 
|  | 332 | +## Performance Comparison | 
|  | 333 | + | 
|  | 334 | +This Python implementation mirrors the Java benchmark structure, enabling: | 
|  | 335 | + | 
|  | 336 | +- Cross-language performance comparisons | 
|  | 337 | +- Consistent test scenarios and data sizes | 
|  | 338 | +- Standardized output format for analysis | 
|  | 339 | +- Similar statistical analysis and reporting | 
|  | 340 | + | 
|  | 341 | +## License | 
|  | 342 | + | 
|  | 343 | +This benchmark suite is part of the AWS Database Encryption SDK project and follows the same Apache-2.0 licensing terms. | 