Commit 8650212

Updated README.md
1 parent fbfad12 commit 8650212

1 file changed

+72
-12
lines changed

README.md

Lines changed: 72 additions & 12 deletions
Original file line number | Diff line number | Diff line change
@@ -1,9 +1,10 @@
11
# GPU Programming 101 🚀
22

33
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
4-
[![CUDA](https://img.shields.io/badge/CUDA-12.0%2B-76B900?logo=nvidia)](https://developer.nvidia.com/cuda-toolkit)
5-
[![ROCm](https://img.shields.io/badge/ROCm-5.0%2B-red?logo=amd)](https://rocmdocs.amd.com/)
4+
[![CUDA](https://img.shields.io/badge/CUDA-12.9.1-76B900?logo=nvidia)](https://developer.nvidia.com/cuda-toolkit)
5+
[![ROCm](https://img.shields.io/badge/ROCm-6.4.3-red?logo=amd)](https://rocmdocs.amd.com/)
66
[![Docker](https://img.shields.io/badge/Docker-Ready-2496ED?logo=docker)](https://www.docker.com/)
7+
[![Examples](https://img.shields.io/badge/Examples-70%2B-green)](modules/)
78
[![CI](https://img.shields.io/badge/CI-GitHub%20Actions-2088FF?logo=github-actions)](https://github.com/features/actions)
89

910
**A comprehensive, hands-on educational project for mastering GPU programming with CUDA and HIP**
@@ -118,20 +119,79 @@ cd modules/module1/examples
118119
## 🛠️ Prerequisites
119120

120121
### Hardware Requirements
121-
- **GPU**: NVIDIA GTX 1060+ or AMD RX 580+ (4GB+ VRAM recommended)
122-
- **System**: 8GB+ RAM (16GB+ recommended for advanced modules)
122+
123+
#### NVIDIA GPU Systems
124+
- **Minimum GPU**: GTX 1060 6GB, GTX 1650, RTX 2060 or better
125+
- **Recommended GPU**: RTX 3070 (8GB) or RTX 4070 (12GB); RTX 3080 (10GB/12GB) or RTX 4080 (16GB)
126+
- **Professional/Advanced**: RTX 4090 (24GB), RTX A6000 (48GB), Tesla/Quadro series
127+
- **Architecture Support**: Maxwell, Pascal, Volta, Turing, Ampere, Ada Lovelace, Hopper
128+
- **Compute Capability**: 5.0+ (Maxwell architecture or newer)
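
The compute capability requirement above can be checked before building anything. The following is a minimal sketch, not part of the repository: the helper names are hypothetical, and it assumes the installed `nvidia-smi` supports the `compute_cap` query field (older drivers may not).

```python
import subprocess

# Minimum from the requirement above: Maxwell or newer.
MIN_COMPUTE_CAPABILITY = (5, 0)


def parse_compute_capability(text):
    """Turn a string such as '8.6' into a comparable tuple (8, 6)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def meets_minimum(text, minimum=MIN_COMPUTE_CAPABILITY):
    """Return True if the reported capability is at least the minimum."""
    return parse_compute_capability(text) >= minimum


def query_compute_capabilities():
    """Ask nvidia-smi for one compute capability string per GPU.

    Assumption: nvidia-smi is on PATH and understands compute_cap.
    Returns an empty list if the tool is missing or the query fails.
    """
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

Calling `meets_minimum(cap)` for each entry of `query_compute_capabilities()` reports whether each installed GPU clears the 5.0 threshold. Comparing tuples rather than floats avoids misordering values such as 5.10 and 5.2.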
129+
130+
#### AMD GPU Systems
131+
- **Minimum GPU**: RX 580 8GB, RX 6600, RX 7600 or better
132+
- **Recommended GPU**: RX 6700 XT/7700 XT (12GB+), RX 6800 XT/7800 XT (16GB+)
133+
- **Professional/Advanced**: RX 7900 XTX (24GB), Radeon PRO W7800 (48GB), Instinct MI series
134+
- **Architecture Support**: RDNA2, RDNA3, RDNA4, GCN 5.0+, CDNA series
135+
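- **ROCm Compatibility**: Officially supported AMD GPUs only; check AMD's ROCm compatibility matrix, since not every consumer card is covered by every ROCm release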
136+
137+
#### System Memory & CPU
138+
- **Minimum RAM**: 16GB system RAM
139+
- **Recommended RAM**: 32GB+ for advanced modules and multi-GPU setups
140+
- **Professional Setup**: 64GB+ for large-scale scientific computing
141+
- **CPU Requirements**:
142+
- **Intel**: Haswell (2013) or newer for PCIe atomics support
143+
- **AMD**: Zen 1 (2017) or newer for PCIe atomics support
144+
- **Storage**: 20GB+ free space for Docker containers and examples
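
A quick way to confirm the storage requirement above before pulling containers is to query the filesystem directly. This is a sketch using only the Python standard library; the function names and the choice of decimal gigabytes (10^9 bytes) are assumptions, not something the project specifies.

```python
import shutil

# From the storage requirement above.
REQUIRED_FREE_GB = 20


def free_space_gb(path="."):
    """Free space on the filesystem containing `path`, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(path=".", required_gb=REQUIRED_FREE_GB):
    """Return True if at least `required_gb` GB are free at `path`."""
    return free_space_gb(path) >= required_gb
```

Pass the directory where Docker stores its data (the location varies by installation) rather than the current directory if the two are on different filesystems.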
123145

124146
### Software Requirements
125-
- **OS**: Linux (recommended), Windows 10/11, or macOS
126-
- **CUDA**: 11.0+ for NVIDIA GPUs
127-
- **ROCm**: 5.0+ for AMD GPUs
128-
- **Compiler**: GCC 7+, Clang 8+, or MSVC 2019+
129-
- **Docker**: For containerized development (recommended)
147+
148+
#### Operating System Support
149+
- **Linux** (Recommended): Ubuntu 22.04 LTS, RHEL 8/9, SLES 15 SP5
150+
- **Windows**: Windows 10/11 with WSL2 recommended for optimal compatibility
151+
- **macOS**: macOS 12+ (CUDA and ROCm are not available on macOS; only basic GPU compute via Metal Performance Shaders)
152+
153+
#### GPU Computing Platforms
154+
- **CUDA Toolkit**: 12.0+ (Docker uses CUDA 12.9.1)
155+
- **Driver Requirements**:
156+
- Linux: 550.54.14+ for CUDA 12.4+
157+
- Windows: 551.61+ for CUDA 12.4+
158+
- **ROCm Platform**: 6.0+ (Docker uses ROCm 6.4.3)
159+
- **Driver Requirements**: Latest AMDGPU-PRO or open-source AMDGPU drivers
160+
- **Kernel Support**: Linux kernel 5.4+ recommended
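
The NVIDIA driver minimums listed above can be compared against an installed driver by treating version strings as integer tuples. The sketch below is illustrative only: the function names and the `MIN_DRIVER` table are hypothetical, and the thresholds are copied from the list above for CUDA 12.4+.

```python
# Minimum driver versions for CUDA 12.4+, copied from the list above.
MIN_DRIVER = {
    "linux": (550, 54, 14),
    "windows": (551, 61),
}


def parse_version(text):
    """Turn '550.54.14' into (550, 54, 14) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_ok(driver_version, os_name):
    """Return True if the driver meets the minimum for the given OS key."""
    return parse_version(driver_version) >= MIN_DRIVER[os_name]
```

The driver string itself can be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader`. A plain string comparison would get this wrong (for example, "99.0" sorts after "550.54.14" as text), which is why the components are converted to integers first.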
161+
162+
#### Development Environment
163+
- **Compilers**:
164+
- **GCC**: 9.0+ (GCC 11+ recommended for C++17 features)
165+
- **Clang**: 10.0+ (Clang 14+ recommended)
166+
- **MSVC**: 2019+ (2022 17.10+ for CUDA 12.4+ support)
167+
- **Build Tools**: Make 4.0+, CMake 3.18+ (optional)
168+
- **Docker**: 20.10+ with GPU runtime support (nvidia-container-toolkit or ROCm containers)
169+
170+
#### Additional Tools (Included in Docker)
171+
- **Profiling**: Nsight Compute, Nsight Systems (NVIDIA), rocprof (AMD)
172+
- **Debugging**: cuda-gdb, rocgdb, compute-sanitizer
173+
- **Libraries**: cuBLAS, cuFFT, rocBLAS, rocFFT (for advanced modules)
174+
175+
### Performance Expectations by Hardware Tier
176+
177+
| Hardware Tier | Example GPUs | VRAM | Expected Speedup vs. CPU (workload-dependent) | Suitable Modules |
178+
|---------------|--------------|------|---------------------|------------------|
179+
| **Entry Level** | GTX 1060 6GB, RX 580 8GB | 6-8GB | 10-50x CPU speedup | Modules 1-3 |
180+
| **Mid-Range** | RTX 3060 Ti, RX 6700 XT | 8-12GB | 50-200x CPU speedup | Modules 1-6 |
181+
| **High-End** | RTX 4070 Ti, RX 7800 XT | 12-16GB | 100-500x CPU speedup | All modules |
182+
| **Professional** | RTX 4090, RX 7900 XTX | 24GB | 200-1000x+ CPU speedup | All modules + research |
130183

131184
### Programming Knowledge
132-
- **C/C++**: Intermediate level (pointers, memory management)
133-
- **Command Line**: Basic terminal/shell usage
134-
- **Math**: Linear algebra basics helpful but not required
185+
- **C/C++**: Intermediate level (pointers, memory management, basic templates)
186+
- **Parallel Programming**: Basic understanding of threads and synchronization helpful
187+
- **Command Line**: Comfortable with terminal/shell operations
188+
- **Mathematics**: Linear algebra and calculus basics beneficial for advanced modules
189+
- **Version Control**: Basic Git knowledge for contributing
190+
191+
### Network Requirements (Docker Setup)
192+
- **Internet Connection**: Required for initial Docker image downloads (~8GB total)
193+
- **Bandwidth**: 50+ Mbps recommended for efficient container downloads
194+
- **Storage**: Additional 20GB for Docker images and build cache
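
To put the bandwidth recommendation above in perspective, the download time follows from simple arithmetic. This sketch assumes decimal units (1 GB = 8 x 10^9 bits, 1 Mbps = 10^6 bits per second) and ignores protocol overhead and registry throttling, so real downloads will take somewhat longer.

```python
def download_seconds(size_gb, bandwidth_mbps):
    """Idealized transfer time in seconds for `size_gb` GB at `bandwidth_mbps`."""
    bits = size_gb * 8e9
    return bits / (bandwidth_mbps * 1e6)


# The ~8GB of images at the recommended 50 Mbps:
seconds = download_seconds(8, 50)   # 1280.0 seconds
minutes = seconds / 60              # roughly 21 minutes
```

At 10 Mbps the same download takes five times as long, a little under two hours.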
135195

136196
## 🐳 Docker Development
137197
