Getting corrupted data when restoring from snapshot #203
I am getting corrupted data on the restored volume in particular scenarios.
Steps to reproduce in my case:
- Run the initial deployment with a PVC (storage class: piraeus-storage-replicated-lvm):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-cluster
  namespace: ts-k8supgrade-dev
spec:
  selector:
    matchLabels:
      app: app-test-cluster
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  replicas: 1
  template:
    metadata:
      labels:
        app: app-test-cluster
      name: nginx
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - k8s-dfw-prod-worker2
      containers:
      - name: nginx
        image: registry.k8s.io/nginx-slim:0.21
        imagePullPolicy: "IfNotPresent"
        ports:
        - containerPort: 80
          name: web
        resources:
          requests:
            memory: "250Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "1"
        securityContext:
          allowPrivilegeEscalation: false
        volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: pvc-test-cluster-0
- PVC:
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS
pvc-test-cluster-0 Bound pvc-a037d2a4-d290-41ea-92dc-7b5d4048c9a0 500Mi RWO piraeus-storage-replicated-lvm
- Copy an HTML file to the mounted volume:
kubectl cp ./index.html test-cluster-68f5648c54-wh2tr:/usr/share/nginx/html
- Check the file:
:/# md5sum /usr/share/nginx/html/index.html
b857e29a868877e98f4cb955ef371ab5 /usr/share/nginx/html/index.html
:/# ls -lh /usr/share/nginx/html/index.html
-rw-rw-r-- 1 1000 1000 183 Mar 5 16:31 /usr/share/nginx/html/index.html
:/# cat /usr/share/nginx/html/index.html
<!DOCTYPE html>
<html>
<head>
<title>Example</title>
</head>
<body>
- Create a VolumeSnapshot from the existing PVC:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: snapshot-test-cluster-0-affw2
  namespace: ts-k8supgrade-dev
spec:
  volumeSnapshotClassName: linstor-csi-delete
  source:
    persistentVolumeClaimName: pvc-test-cluster-0
NAME READYTOUSE SOURCEPVC SOURCESNAPSHOTCONTENT RESTORESIZE SNAPSHOTCLASS SNAPSHOTCONTENT CREATIONTIME AGE
snapshot-test-cluster-0-affw2 true pvc-test-cluster-0 500Mi linstor-csi-delete snapcontent-a2a749b9-071c-47b5-82c4-4c3bc870d5bb 5s 6s
- Create a new PVC from the VolumeSnapshot:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restore-pvc-test-cluster-0
  namespace: ts-k8supgrade-dev
spec:
  storageClassName: piraeus-storage-replicated-lvm
  dataSource:
    name: snapshot-test-cluster-0-affw2
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
- Start a pod with the PVC "restore-pvc-test-cluster-0" mounted
- Check the file: the size is the same, but the hash differs and cat shows no content:
:/# md5sum /usr/share/nginx/html/index.html
9e292e386b7cebd21e02ad51f7ace213 /usr/share/nginx/html/index.html
:/# ls -lh /usr/share/nginx/html/index.html
-rw-rw-r-- 1 1000 1000 183 Mar 5 16:31 /usr/share/nginx/html/index.html
:/# cat /usr/share/nginx/html/index.html
:/#
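The symptom can be reproduced locally without a cluster: a file whose blocks were never flushed to the snapshotted device can come back the same size but with empty (zeroed) content, so ls -lh looks identical while md5sum differs and cat prints nothing. This is a hedged local illustration only; the file names are hypothetical and the zero-fill is one plausible shape of the corruption, not a confirmed fact about the LINSTOR snapshot:

```shell
# Create a 183-byte file with visible content (stands in for index.html).
printf 'x%.0s' $(seq 183) > original.html
# Create a 183-byte file of NUL bytes (stands in for the restored copy).
head -c 183 /dev/zero > restored.html
# Sizes match, so ls -lh cannot tell them apart:
stat -c %s original.html restored.html
# Hashes differ, revealing the corruption:
md5sum original.html restored.html
# cat prints nothing visible for the zero-filled file:
cat restored.html
```

This matches the observed behavior above: identical size and timestamp metadata, a different md5sum, and no visible output from cat.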
Additional info:
- Kubernetes 1.31.4, worker OS: Oracle Linux 8.10 5.15.0-305.176.4.el8uek.x86_64
- piraeus_deployment_version: v2.8.0
- piraeus_snapshot_controller_charts_version: 4.0.1
- piraeus_snapshot_controller_crd_version: v8.2.0
- If I shut down the original deployment before the VolumeSnapshot is created (before step 5), the data restore succeeds. From that moment on, a VolumeSnapshot can be taken even with the initial deployment running, and the restored volume will contain a valid copy of the data.
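The observation above suggests the snapshot is being taken while dirty pages are still in the page cache. A hedged workaround sketch, under the assumption that flushing the filesystem inside the pod before creating the VolumeSnapshot is sufficient: the pod name, namespace, and manifest file name below are taken from this repro and are placeholders, not a verified fix from the maintainers.

```shell
# Assumed names from the repro above; adjust for your environment.
POD=test-cluster-68f5648c54-wh2tr
NS=ts-k8supgrade-dev

if command -v kubectl >/dev/null 2>&1; then
    # Flush dirty pages inside the pod so the volume holds a consistent image.
    kubectl exec -n "$NS" "$POD" -- sync
    # Only then create the VolumeSnapshot (manifest file name is hypothetical).
    kubectl apply -f volumesnapshot.yaml
else
    echo "kubectl not found; sketch only"
fi
```

A stronger variant would be fsfreeze --freeze/--unfreeze on the mount point, but that requires a privileged container; plain sync is the minimal option and may still leave a small race window.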