# stack.yaml
name: ceph
namespace: agentic-stacks
version: 0.1.0
description: >
  Complete operational knowledge for deploying, managing, and operating
  production Ceph storage clusters using cephadm on bare metal. Covers
  RBD, CephFS, and RGW services across Ceph Reef (18.2.x) and Squid (19.2.x).
repository: https://github.com/agentic-stacks/ceph
target:
  software: ceph
  versions:
    - "18.2.x"
    - "19.2.x"
skills:
  - name: training
    entry: skills/training/
    description: Interactive training that teaches this stack's domain to new users
  - name: concepts
    entry: skills/foundation/concepts
    description: Ceph architecture, CRUSH maps, placement groups, pools, BlueStore
  - name: hardware-planning
    entry: skills/foundation/hardware-planning
    description: Disk sizing, CPU/RAM ratios, network bandwidth, node role planning
  - name: host-preparation
    entry: skills/foundation/host-preparation
    description: OS prerequisites, NTP, firewall ports, container runtime, cephadm install
  - name: bootstrap
    entry: skills/deploy/bootstrap
    description: cephadm bootstrap, initial MON/MGR/OSD deployment, dashboard setup
  - name: networking
    entry: skills/deploy/networking
    description: Public vs cluster network design, VLAN, bonding, MTU configuration
  - name: services
    entry: skills/deploy/services
    description: RBD pool creation, CephFS/MDS deployment, RGW S3 gateway setup
  - name: health-check
    entry: skills/operations/health-check
    description: Cluster health interpretation, OSD states, PG states, alert triage
  - name: scaling
    entry: skills/operations/scaling
    description: Add/remove OSDs and hosts, expand services, rebalance
  - name: upgrades
    entry: skills/operations/upgrades
    description: Rolling upgrades within and across Reef/Squid versions
  - name: backup-restore
    entry: skills/operations/backup-restore
    description: Pool snapshots, RBD mirroring, RGW multisite, disaster recovery
  - name: pool-management
    entry: skills/operations/pool-management
    description: CRUSH rules, erasure coding profiles, tiering, quotas
  - name: certificate-mgmt
    entry: skills/operations/certificate-mgmt
    description: Dashboard TLS, RGW TLS, internal messenger encryption
  - name: troubleshooting
    entry: skills/diagnose/troubleshooting
    description: Symptom-based diagnostic trees for common Ceph failure modes
  - name: performance
    entry: skills/diagnose/performance
    description: Benchmarking with rados bench/fio, slow OSD diagnosis, bottleneck identification
  - name: known-issues
    entry: skills/reference/known-issues
    description: Version-specific bugs and workarounds for Reef and Squid
  - name: compatibility
    entry: skills/reference/compatibility
    description: Ceph version × kernel × container image × client compatibility
  - name: decision-guides
    entry: skills/reference/decision-guides
    description: Replicated vs erasure coding, BlueStore tuning, network topology
project:
  structure:
    - cluster-spec.yaml
    - cluster-spec.yaml.orig
    - service-specs/
    - ceph.conf
    - scripts/
    - docs/
requires:
  tools:
    - name: cephadm
      description: Ceph orchestrator for container-based deployment and lifecycle
    - name: ceph
      description: Ceph CLI for cluster operations and status
    - name: podman
      description: Container runtime (default for cephadm on modern distros)
depends_on: []